Parquet
- class sources.Parquet
Source reading data from Parquet files.
- sources.Parquet.add_file(file)
Add data from a Parquet file to the source.
- sources.Parquet.create(file=None, *, time_column, key_column, schema=None, subsort_column=None, grouping_name=None, time_unit=None)
Create a Parquet source.
- Parameters:
file (Optional[str], default: None)
The URL or path of the Parquet file to add. Paths should be relative to the
current working directory or absolute. URLs may describe local file paths or
object-store locations.
time_column (str)
The name of the column containing the time.
key_column (str)
The name of the column containing the key.
schema (Optional[Schema], default: None)
The schema to use. If not provided, it will be inferred from the input.
subsort_column (Optional[str], default: None)
The name of the column containing the subsort.
If not provided, the subsort will be assigned by the system.
grouping_name (Optional[str], default: None)
The name of the group associated with each key.
This is used to ensure implicit joins are only performed between data grouped
by the same entity.
time_unit (Optional[TimeUnit], default: None)
The unit of the time column. One of ns, us, ms, or s.
If not specified (and not specified in the data), nanoseconds are assumed.
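The effect of time_unit can be illustrated with a small sketch. It is not the library's implementation: it assumes the time column holds plain integer timestamps, and the names NANOS_PER_UNIT and to_nanoseconds are hypothetical helpers introduced only for this example. It shows how the same integer is read differently depending on the unit, and why relying on the nanosecond default for second-resolution data gives a time very close to the epoch.

```python
# Nanoseconds per supported unit. The unit strings are the ones listed for
# time_unit; the mapping and helper below are illustrative, not library code.
NANOS_PER_UNIT = {"ns": 1, "us": 1_000, "ms": 1_000_000, "s": 1_000_000_000}


def to_nanoseconds(value, time_unit=None):
    """Interpret an integer timestamp in `time_unit` and return nanoseconds.

    Passing None mirrors the documented default: nanoseconds are assumed.
    """
    unit = "ns" if time_unit is None else time_unit
    if unit not in NANOS_PER_UNIT:
        raise ValueError(f"unsupported time_unit: {time_unit!r}")
    return value * NANOS_PER_UNIT[unit]


# The same instant (2023-01-01T00:00:00Z) stored at two resolutions.
seconds = 1_672_531_200
millis = 1_672_531_200_000
assert to_nanoseconds(seconds, "s") == to_nanoseconds(millis, "ms")

# Without a unit, the raw integer is taken as nanoseconds, so a value that
# was really in seconds is read as roughly 1.67 s after the epoch.
print(to_nanoseconds(seconds))
```

When the integers in the time column are not nanoseconds and the file does not record a unit, passing the matching time_unit to create avoids this misreading.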