open_catalog
- open_catalog(path: str | pathlib.Path | upath.core.UPath, search_filter: lsdb.core.search.abstract_search.AbstractSearch | None = None, columns: list[str] | str | None = None, margin_cache: str | pathlib.Path | upath.core.UPath | None = None, error_empty_filter: bool = True, filters: list[tuple[str]] | None = None, path_generator: typing.Callable[[upath.core.UPath, hats.pixel_math.healpix_pixel.HealpixPixel, dict | None, str], upath.core.UPath] = <function pixel_catalog_file>, **kwargs) -> Catalog
Open a catalog from a HATS path.
A catalog is either part of a collection or stand-alone.
A HATS collection is composed of a main catalog together with margin and index catalogs. LSDB will open exactly ONE main object catalog and at most ONE margin catalog. The collection.properties file specifies which margins and indexes are available, and which margin to use by default:
my_collection_dir/
├── main_catalog/
├── margin_catalog/
├── margin_catalog_2/
├── index_catalog/
├── collection.properties
All arguments passed to the open_catalog call are applied to the calls to open the main and margin catalogs.
Typical usage example, where we open a collection with a subset of columns:
lsdb.open_catalog(path='./my_collection_dir', columns=['ra','dec'])
Typical usage example, where we open a collection from a cone search:
lsdb.open_catalog(
    path='./my_collection_dir',
    columns=['ra','dec'],
    search_filter=lsdb.ConeSearch(ra, dec, radius_arcsec),
)
Typical usage example, where we open a collection with a non-default margin:
lsdb.open_catalog(path='./my_collection_dir', margin_cache='margin_catalog_2')
Note that this margin still needs to be specified in the all_margins attribute of the collection.properties file.
We can also open each catalog separately, if needed:
lsdb.open_catalog(path='./my_collection_dir/main_catalog')
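The margin-selection rule described above can be sketched in plain Python. This is an illustrative sketch, not LSDB's implementation: it assumes collection.properties is a plain key=value text file, that all_margins is a whitespace-separated list, and that the default is stored under a default_margin key. The helper names parse_properties and resolve_margin are hypothetical.

```python
def parse_properties(text):
    """Parse key=value lines, skipping blanks and '#' comments (assumed file format)."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve_margin(props, margin_cache=None):
    """Pick the margin name to open.

    With no explicit request, return the default margin (assumed key: default_margin).
    With a request, return it only if it is listed in all_margins.
    """
    available = props.get("all_margins", "").split()
    if margin_cache is None:
        # No explicit request: fall back to the collection's default (may be None).
        return props.get("default_margin")
    if margin_cache not in available:
        raise ValueError(f"{margin_cache!r} is not listed in all_margins: {available}")
    return margin_cache


# Hypothetical file contents matching the directory layout shown earlier.
EXAMPLE = """
name=my_collection
all_margins=margin_catalog margin_catalog_2
default_margin=margin_catalog
"""

props = parse_properties(EXAMPLE)
chosen = resolve_margin(props, margin_cache="margin_catalog_2")
```

Under these assumptions, a margin name that is missing from all_margins is rejected before any catalog is opened.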
- Parameters:
- path : path-like
The path that locates the root of the HATS collection or stand-alone catalog.
- search_filter : AbstractSearch or None, default None
The spatial filter to apply when opening the catalog.
- columns : list[str] or str or None, default None
The set of columns to filter the catalog on. If None, the catalog's default columns will be loaded. To load all catalog columns, use columns="all".
- margin_cache : path-like or None, default None
The margin for the main catalog, provided as a path.
- error_empty_filter : bool, default True
If True, raise an error when loading the catalog with a filter results in an empty catalog.
- filters : list[tuple[str]] or None, default None
Filters to apply when reading parquet files. These may be applied as pyarrow filters or URL parameters.
- path_generator : Callable[[UPath, HealpixPixel, dict | None, str], UPath], optional
The function f(catalog_base_dir, pixel, query_params, npix_suffix) that translates a HEALPix pixel into the path of its partition data. Its arguments are the following:
catalog_base_dir: UPath - path passed to open_catalog/read_hats
pixel: HealpixPixel - pixel to generate path for
query_params: dict | None - dictionary used to generate HTTP query string
npix_suffix: str - "/" for leaf directory, filename suffix like ".parquet" for leaf file
The catalog metadata files need to live where the HATS standard expects them. Defaults to hats.io.pixel_catalog_file.
- **kwargs
Arguments to pass to the pandas parquet file reader.
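For the filters parameter, a minimal sketch of the pyarrow-style tuple form may help. It assumes each filter is a (column, operator, value) tuple and that a flat list of tuples is combined with AND, as in pyarrow's parquet reader. The column names, thresholds, and the row_matches helper are hypothetical, and real filtering happens inside the parquet reader rather than in Python code like this.

```python
import operator

# Comparison operators handled in this sketch (pyarrow accepts more, e.g. "in").
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def row_matches(row, filters):
    """Return True when the row satisfies every (column, op, value) filter."""
    return all(OPS[op](row[column], value) for column, op, value in filters)


# Hypothetical columns and thresholds, written in the (column, op, value) form.
filters = [("mag", "<", 20.0), ("dec", ">=", -30.0)]

rows = [
    {"ra": 10.0, "dec": -10.0, "mag": 18.5},
    {"ra": 11.0, "dec": -45.0, "mag": 17.0},
    {"ra": 12.0, "dec": 5.0, "mag": 21.3},
]
# Only rows passing both conditions survive, since the list is a conjunction.
kept = [row for row in rows if row_matches(row, filters)]
```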
- Returns:
- Catalog
The catalog loaded according to the specified arguments.
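To illustrate the path_generator contract described under Parameters, here is a sketch of a custom generator. It is not hats.io.pixel_catalog_file: the flat order_<k>/pixel_<n> layout is invented for illustration, StandInPixel is a hypothetical stand-in for HealpixPixel (assumed here to expose order and pixel attributes), and the function returns a plain string where a real generator would return a UPath.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from urllib.parse import urlencode


@dataclass(frozen=True)
class StandInPixel:
    """Hypothetical stand-in for HealpixPixel: a HEALPix order and pixel number."""

    order: int
    pixel: int


def flat_path_generator(catalog_base_dir, pixel, query_params, npix_suffix):
    """Map a pixel to <base>/dataset/order_<k>/pixel_<n><suffix> (invented layout)."""
    stem = (
        PurePosixPath(str(catalog_base_dir))
        / "dataset"
        / f"order_{pixel.order}"
        / f"pixel_{pixel.pixel}"
    )
    # "/" marks a leaf directory; any other suffix is appended as a file extension.
    path = f"{stem}/" if npix_suffix == "/" else f"{stem}{npix_suffix}"
    if query_params:
        # Append the query parameters as an HTTP query string.
        path += "?" + urlencode(query_params)
    return path


pixel = StandInPixel(order=3, pixel=42)
file_path = flat_path_generator("/data/my_catalog", pixel, None, ".parquet")
dir_path = flat_path_generator("/data/my_catalog", pixel, None, "/")
```

The four arguments mirror the ones listed for path_generator. Adapting this sketch into something that could be passed to open_catalog would require returning a UPath and accepting a real HealpixPixel.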