per_pixel_statistics


Catalog.per_pixel_statistics(use_default_columns: bool = True, exclude_hats_columns: bool = True, exclude_columns: list[str] | None = None, include_columns: list[str] | None = None, include_stats: list[str] | None = None, multi_index=False, include_pixels: list[HealpixPixel] | None = None) -> DataFrame

Read the footer statistics in the parquet metadata and report on min/max values for each data partition.

Parameters:
use_default_columns : bool, default True

Whether to use only the columns that are loaded by default (as set in the metadata by the catalog provider). Defaults to True.

exclude_hats_columns : bool, default True

Exclude HATS spatial and partitioning fields from the statistics. Defaults to True.

exclude_columns : list[str] or None, default None

Additional columns to exclude from the statistics.

include_columns : list[str] or None, default None

If specified, only return statistics for the column names provided. Defaults to None, which returns statistics for all non-HATS columns.

include_stats : list[str] or None, default None

If specified, only return the listed kinds of values (any of min_value, max_value, null_count, row_count). Defaults to None, which returns all four.

multi_index : bool, default False

Whether to create the returned frame with a multi-index, first on pixel, then on column name. Defaults to False, in which case the frame is indexed on pixel only, with a separate column for each combination of data column and stat value.

include_pixels : list[HealpixPixel] or None, default None

If specified, only return statistics for the pixels indicated. Defaults to None, which returns statistics for all pixels.

Returns:
pd.DataFrame

DataFrame with granular per-pixel statistics.
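
Examples:
The following is a minimal sketch of how the filtering parameters above might be combined; it is an illustration, not part of the library. The helper name `declination_ranges` and the column name "dec" are hypothetical, and `catalog` is assumed to be an already-opened Catalog object exposing the method documented here.

```python
def declination_ranges(catalog, pixels=None):
    """Return per-pixel min/max statistics for a hypothetical "dec" column.

    `catalog` is assumed to expose `per_pixel_statistics` with the
    signature documented above. `pixels` is an optional list of
    HealpixPixel objects; None means all pixels.
    """
    return catalog.per_pixel_statistics(
        include_columns=["dec"],                   # hypothetical column name
        include_stats=["min_value", "max_value"],  # two of the four documented stats
        multi_index=True,                          # index on pixel, then on column name
        include_pixels=pixels,                     # None returns statistics for all pixels
    )
```

With `multi_index=True`, the returned frame should have one row per pixel and column name pair. Passing `multi_index=False` instead would give one row per pixel, with a separate column for each data column and stat combination.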