
csv_source

Module containing the CSVSource class.

The CSVSource class handles loading of CSV data.

Classes

CSVSource

class CSVSource(
    *args: Any,
    data_splitter: Optional[DatasetSplitter] = None,
    seed: Optional[int] = None,
    modifiers: Optional[Dict[str, DataPathModifiers]] = None,
    ignore_cols: Optional[Union[str, Sequence[str]]] = None,
    **kwargs: Any,
):

Data source for loading CSV files.

Arguments

  • path: The path or URL to the csv file.
  • read_csv_kwargs: Additional arguments to be passed to pandas.read_csv.
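As a rough illustration of how extra arguments might be forwarded to pandas.read_csv, here is a minimal sketch using pandas only. The helper `load_csv` is hypothetical and not part of this module, and treating `read_csv_kwargs` as a dictionary of keyword arguments is an assumption based on the description above.

```python
import io
from typing import Any, Dict, Optional

import pandas as pd


def load_csv(source: Any, read_csv_kwargs: Optional[Dict[str, Any]] = None) -> pd.DataFrame:
    """Hypothetical helper: forward any extra options to pandas.read_csv."""
    # Assumption: read_csv_kwargs is a plain dict of pandas.read_csv keyword arguments.
    return pd.read_csv(source, **(read_csv_kwargs or {}))


# An in-memory, semicolon-separated CSV stands in for a file path or URL.
csv_text = "id;age;city\n1;34;Paris\n2;51;Lyon\n"

# "sep" and "dtype" are real pandas.read_csv options.
df = load_csv(io.StringIO(csv_text), {"sep": ";", "dtype": {"id": str}})
print(df)
```

Any option accepted by pandas.read_csv (for example `sep`, `dtype`, or `usecols`) could be supplied in the same way.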

Methods


def get_data(self, **kwargs: Any) -> pandas.core.frame.DataFrame:

Loads and returns the data from the CSV dataset.

Returns: A DataFrame containing the data.

def get_values(self, col_names: List[str], **kwargs: Any) -> Dict[str, Iterable[Any]]:

Gets the distinct values from the given columns of the CSV dataset.

Arguments

  • col_names: The list of columns whose distinct values should be returned.

Returns: A mapping from each requested column name to the distinct values in that column.
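The shape of the return value described above can be sketched with plain pandas. This is not the library's implementation: the function `distinct_values` and the sample data are hypothetical, and the sketch only shows one way to build a mapping from column name to distinct values.

```python
import io
from typing import Any, Dict, Iterable, List

import pandas as pd


def distinct_values(df: pd.DataFrame, col_names: List[str]) -> Dict[str, Iterable[Any]]:
    """Hypothetical helper: map each requested column to its distinct values."""
    # Series.unique() returns the distinct values in order of first appearance.
    return {col: df[col].unique() for col in col_names}


# Small in-memory CSV used purely for illustration.
csv_text = "city,plan\nParis,free\nLyon,pro\nParis,pro\n"
df = pd.read_csv(io.StringIO(csv_text))

values = distinct_values(df, ["city", "plan"])
for name, distinct in values.items():
    print(name, list(distinct))
```

Only the columns named in `col_names` appear as keys in the result, which matches the mapping described in the Returns line.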

Variables

  • multi_table : bool - Whether the data source is multi-table.