to_association
- to_association(catalog: HealpixDataset, *, base_catalog_path: str | Path | UPath, catalog_name: str | None = None, primary_catalog_dir: str | Path | UPath | None = None, primary_column_association: str | None = None, primary_id_column: str | None = None, join_catalog_dir: str | Path | UPath | None = None, join_column_association: str | None = None, join_to_primary_id_column: str | None = None, join_id_column: str | None = None, separation_column: str | None = None, overwrite: bool = False, addl_hats_properties: dict | None = None, **kwargs)
Writes a crossmatching product to disk, in HATS association table format. The output catalog comprises partition parquet files and respective metadata.
The column name arguments should reflect the column names on the corresponding primary and join OBJECT catalogs, so that the association table can be used to perform equijoins on the two sides and recreate the crossmatch.
- Parameters:
- catalog : HealpixDataset
The catalog to export
- base_catalog_path : path-like
Location where the catalog is saved
- catalog_name : str or None, default None
The name of the output catalog
- primary_catalog_dir : path-like or None, default None
The path to the primary catalog
- primary_column_association : str or None, default None
The column in the association catalog that matches the primary (left) side of the join
- primary_id_column : str or None, default None
The id column in the primary catalog
- join_catalog_dir : path-like or None, default None
The path to the join catalog
- join_column_association : str or None, default None
The column in the association catalog that matches the joining (right) side of the join
- join_to_primary_id_column : str or None, default None
The column in the join catalog that refers to the primary catalog's id column (a foreign key, such as OBJECT_ID in the Notes)
- join_id_column : str or None, default None
The id column in the join catalog
- separation_column : str or None, default None
The name of the crossmatch separation column
- overwrite : bool, default False
If True, an existing catalog is overwritten
- **kwargs
Arguments to pass to the parquet write operations
Notes
To configure the appropriate column names, consider two tables that do not share an identifier space (e.g. two surveys), and the way you could go about joining them together with an association table:
    TABLE GAIA_SOURCE {
        DESIGNATION <primary key>
    }
    TABLE SDSS {
        SDSS_ID <primary key>
    }
A SQL query to join them with an association table would look like:
    SELECT g.DESIGNATION as gaia_id, s.SDSS_ID as sdss_id
    FROM GAIA_SOURCE g
    JOIN association_table a ON a.primary_id_column = g.DESIGNATION
    JOIN SDSS s ON a.join_id_column = s.SDSS_ID
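The two-hop join above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the rows are made up for illustration, and the association table's columns are literally named primary_id_column and join_id_column to mirror the placeholder names in the query, which is not necessarily how a real HATS association table names them.

```python
import sqlite3

# In-memory database with hypothetical rows; only the table and
# column names come from the query in the text.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE GAIA_SOURCE (DESIGNATION TEXT PRIMARY KEY);
    CREATE TABLE SDSS (SDSS_ID INTEGER PRIMARY KEY);
    CREATE TABLE association_table (
        primary_id_column TEXT,
        join_id_column INTEGER
    );
    INSERT INTO GAIA_SOURCE VALUES ('gaia_a'), ('gaia_b'), ('gaia_c');
    INSERT INTO SDSS VALUES (101), (102), (103);
    -- Each association row pairs one primary id with one join id.
    INSERT INTO association_table VALUES ('gaia_a', 102), ('gaia_c', 101);
    """
)

# Two equijoins through the association table recover the matched pairs.
rows = conn.execute(
    """
    SELECT g.DESIGNATION AS gaia_id, s.SDSS_ID AS sdss_id
    FROM GAIA_SOURCE g
    JOIN association_table a ON a.primary_id_column = g.DESIGNATION
    JOIN SDSS s ON a.join_id_column = s.SDSS_ID
    ORDER BY gaia_id
    """
).fetchall()
conn.close()

print(rows)
```

Rows that have no entry in the association table (here gaia_b and 103) drop out of the result, since both joins are inner joins.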
Consider instead an object table, joining to a detection table:
    TABLE OBJECT {
        ID <primary key>
    }
    TABLE DETECTION {
        DETECTION_ID <primary key>
        OBJECT_ID <foreign key>
    }
And a SQL query to join them would look like:
    SELECT o.ID as object_id, d.DETECTION_ID as detection_id
    FROM OBJECT o
    JOIN DETECTION d ON o.ID = d.OBJECT_ID
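The object-to-detection join can be checked the same way with sqlite3. Again a sketch with invented rows: it shows that the join needs no third table, because DETECTION.OBJECT_ID already carries the same identifier as OBJECT.ID.

```python
import sqlite3

# Hypothetical data: object 1 has two detections, object 2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE OBJECT (ID INTEGER PRIMARY KEY);
    CREATE TABLE DETECTION (
        DETECTION_ID INTEGER PRIMARY KEY,
        OBJECT_ID INTEGER REFERENCES OBJECT(ID)
    );
    INSERT INTO OBJECT VALUES (1), (2);
    INSERT INTO DETECTION VALUES (10, 1), (11, 1), (12, 2);
    """
)

# A single equijoin on the foreign key pairs objects with detections.
pairs = conn.execute(
    """
    SELECT o.ID AS object_id, d.DETECTION_ID AS detection_id
    FROM OBJECT o
    JOIN DETECTION d ON o.ID = d.OBJECT_ID
    ORDER BY o.ID, d.DETECTION_ID
    """
).fetchall()
conn.close()

print(pairs)
```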
The distinction matters: there are three different column names but only two meaningful identifiers, because OBJECT.ID and DETECTION.OBJECT_ID hold the same value. For this example, the arguments for this method would be as follows:
    primary_id_column = "ID",
    join_to_primary_id_column = "OBJECT_ID",
    join_id_column = "DETECTION_ID",
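One way to keep these three names straight is to collect them in a dictionary and check them against each table's columns before exporting. The sketch below assumes the OBJECT and DETECTION schemas from the example; the column sets, the `crossmatched` variable, and the output path in the commented-out call are hypothetical placeholders, not part of the library.

```python
# Keyword arguments for the object/detection example in the text.
association_kwargs = {
    "primary_id_column": "ID",
    "join_to_primary_id_column": "OBJECT_ID",
    "join_id_column": "DETECTION_ID",
}

# Column names of the two example tables (assumed from the schemas above).
object_columns = {"ID"}
detection_columns = {"DETECTION_ID", "OBJECT_ID"}

# primary_id_column lives on the primary side; the other two on the join side.
expected_location = {
    "primary_id_column": object_columns,
    "join_to_primary_id_column": detection_columns,
    "join_id_column": detection_columns,
}

# Collect any argument whose column is absent from the table it should be in.
missing = [
    name
    for name, column in association_kwargs.items()
    if column not in expected_location[name]
]
print(missing)

# With a crossmatch result in hand (not constructed here), the call would
# then take this shape; the path is a placeholder:
# to_association(
#     crossmatched,
#     base_catalog_path="path/to/output",
#     **association_kwargs,
# )
```

An empty `missing` list means every name points at a column on the expected side of the join.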