pandect package

Submodules

pandect.pandect module

pandect.pandect.expand_path(x)

Helper function that expands ~ and environment variables in paths.
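The source of expand_path is not reproduced here, so the following is only a minimal sketch of how such a helper is commonly written with the standard library. The function name expand_path_sketch and the environment variable PANDECT_DEMO_DIR are hypothetical, and the order of the two expansions is an assumption.

```python
import os


def expand_path_sketch(path):
    """Expand environment variables and a leading ~ in a path string.

    Illustrative stand-in only; not pandect's actual implementation.
    """
    # Assumption: expand $VAR / ${VAR} first, then the home-directory shorthand.
    return os.path.expanduser(os.path.expandvars(path))


# Hypothetical environment variable, set here only for the demonstration.
os.environ["PANDECT_DEMO_DIR"] = "/tmp/demo"

print(expand_path_sketch("$PANDECT_DEMO_DIR/data.csv"))  # /tmp/demo/data.csv
print(expand_path_sketch("~/data.csv"))  # depends on the current user's home
```

Paths without ~ or variable references pass through unchanged, so a helper like this can be applied unconditionally before opening a file.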

pandect.pandect.load(source, sep=', ', expand=True, flags=re.IGNORECASE, table=None)

Load a dataset into a pandas.DataFrame object.

Uses the file extension as a heuristic to determine the input format.

Supports: csv, tsv, xlsx, sav, dta (unreliable), sqlite3
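The extension heuristic and the role of the flags argument can be sketched as follows. This is an assumption about the general approach, not pandect's code: the pattern table FORMAT_PATTERNS, the function guess_format, and the ".sqlite3" extension for SQLite files are all hypothetical.

```python
import re

# Hypothetical table of extension patterns; the real matching rules may differ.
FORMAT_PATTERNS = {
    r"\.csv$": "csv",
    r"\.tsv$": "tsv",
    r"\.xlsx$": "xlsx",
    r"\.sav$": "sav",
    r"\.dta$": "dta",
    r"\.sqlite3$": "sqlite3",  # assumed extension for SQLite databases
}


def guess_format(source, flags=re.IGNORECASE):
    """Return a format name based on the file extension, or None if unknown."""
    for pattern, name in FORMAT_PATTERNS.items():
        # The flags control how the extension is matched, e.g. case sensitivity.
        if re.search(pattern, source, flags=flags):
            return name
    return None


print(guess_format("survey.SAV"))           # sav
print(guess_format("survey.SAV", flags=0))  # None
print(guess_format("notes.txt"))            # None
```

With the default re.IGNORECASE, "survey.SAV" and "survey.sav" are treated alike; passing flags=0 would make the match case-sensitive.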

Parameters

sep (str): Separator used for csv input
expand (bool): Expand ~ and environment variables in path strings
flags (re.RegexFlag): Regular expression flags for matching file name extensions
table (str): Name of the table to load (needed for some database input sources)

Returns

data (pandas.DataFrame): DataFrame object
meta (pyreadstat.metadata_container): Metadata (empty if not provided by the data source)

Raises

FileNotFoundError, IOError

Loading dta files is unreliable: a bug in pyreadstat may cause a segfault.

Partial list of metadata attributes:

column_names : list with the names of the columns
column_labels : list with the column labels, if any
column_names_to_labels : dict{column_names: column_labels}
variable_value_labels : dict{variable_names: dict}
variable_to_label : dict{variable_names: label_name}
value_labels : dict{label_name: dict}
variable_measure : nominal, ordinal, scale or unknown

See the pyreadstat online documentation for the complete specification.
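To show how the metadata attributes listed above relate to one another, the sketch below uses a plain stand-in object rather than a real pyreadstat container, so it runs without any data file. The variable names, labels, and codes are invented for illustration.

```python
from types import SimpleNamespace

# Stand-in for a metadata container; all values are made up for illustration.
meta = SimpleNamespace(
    column_names=["sex", "age"],
    column_labels=["Sex of respondent", "Age in years"],
    variable_to_label={"sex": "labels0"},
    value_labels={"labels0": {1: "male", 2: "female"}},
)

# column_names_to_labels pairs each column name with its column label.
column_names_to_labels = dict(zip(meta.column_names, meta.column_labels))

# variable_value_labels resolves each variable's label set into its code map.
variable_value_labels = {
    variable: meta.value_labels[label_name]
    for variable, label_name in meta.variable_to_label.items()
}

# Decode raw codes for one variable, leaving unknown codes untouched.
codes = [1, 2, 2, 1]
decoded = [variable_value_labels["sex"].get(code, code) for code in codes]
print(decoded)  # ['male', 'female', 'female', 'male']
```

The same lookup can be applied to a DataFrame column to replace numeric codes with their labels.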

Module contents

Top-level package for pandect.