intermine_source
Module containing IntermineSource class.
IntermineSource class handles loading data stored in Intermine templates.
Intermine is an open source biological data warehouse developed by the University of Cambridge: http://intermine.org/. The IntermineSource launches a pod that can access all templates defined under a specified service. Please see Intermine's tutorials for a detailed overview of their Python API: https://github.com/intermine/intermine-ws-python-docs.
Classes
IntermineSource
class IntermineSource( *args: Any, data_splitter: Optional[DatasetSplitter] = None, seed: Optional[int] = None, modifiers: Optional[Dict[str, DataPathModifiers]] = None, ignore_cols: Optional[Union[str, Sequence[str]]] = None, **kwargs: Any,):
Data Source for loading data from Intermine templates.
Intermine is an open source biological data warehouse developed by the University of Cambridge: http://intermine.org/. The IntermineSource launches a pod that can access all templates defined under a specified service. Please see Intermine's tutorials for a detailed overview of their Python API: https://github.com/intermine/intermine-ws-python-docs.
info
You must pip install intermine to use this data source.
Ancestors
Methods
def get_data( self, table_name: Optional[str] = None, **kwargs: Any,) ‑> Optional[pandas.core.frame.DataFrame]:
Loads and returns data from an Intermine template.
Arguments
table_name
: Table name for multi table data sources. This comes from the DataStructure.
Returns A DataFrame-type object which contains the data.
def get_values( self, col_names: List[str], table_name: Optional[str] = None, **kwargs: Any,) ‑> Dict[str, Iterable[Any]]:
Get distinct values from list of columns.
Arguments
col_names
: The list of columns whose distinct values should be returned.
table_name
: The name of the table in which the columns exist. Required for multi-table databases.
Returns The distinct values of the requested columns as a mapping from column name to a series of distinct values.
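The shape of this return value can be illustrated without the library. The following is a minimal sketch, assuming the template rows are available as plain dictionaries. `distinct_values` is a hypothetical helper written for illustration, not part of IntermineSource, and the rows are made-up sample data.

```python
from typing import Any, Dict, Iterable, List


def distinct_values(
    rows: List[Dict[str, Any]], col_names: List[str]
) -> Dict[str, Iterable[Any]]:
    """Map each requested column name to its distinct values.

    Hypothetical stand-in for the mapping that get_values describes.
    Values are kept in first-seen order.
    """
    result: Dict[str, List[Any]] = {name: [] for name in col_names}
    for row in rows:
        for name in col_names:
            value = row[name]
            # Record a value only the first time it is seen in this column.
            if value not in result[name]:
                result[name].append(value)
    return result


# Made-up rows standing in for the results of a template query.
rows = [
    {"symbol": "zen", "organism": "D. melanogaster"},
    {"symbol": "eve", "organism": "D. melanogaster"},
    {"symbol": "zen", "organism": "D. melanogaster"},
]
values = distinct_values(rows, ["symbol", "organism"])
```

Here `values` has one key per requested column, which mirrors the "column name to distinct values" mapping in the description. How the real method computes or orders the values is not specified by this page.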
Variables
multi_table : bool
- Attribute to specify whether the datasource is multi table.
table_names : List[str]
- The names of the tables accessible from this data source.
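These two attributes suggest a simple check that calling code might run before passing `table_name` to `get_data` or `get_values`. The sketch below is an assumption about sensible caller-side validation, not behaviour documented for IntermineSource. `resolve_table_name` is a hypothetical helper, and the table names in the usage lines are invented.

```python
from typing import List, Optional


def resolve_table_name(
    multi_table: bool,
    table_names: List[str],
    table_name: Optional[str] = None,
) -> Optional[str]:
    """Validate a table name against the source's attributes.

    Hypothetical helper: a single-table source accepts a missing name,
    a multi-table source requires one of the known table names.
    """
    if not multi_table:
        # Single-table source: the name is optional, pass it through.
        return table_name
    if table_name is None:
        raise ValueError("table_name is required for a multi-table data source")
    if table_name not in table_names:
        raise ValueError(f"unknown table {table_name!r}; expected one of {table_names}")
    return table_name


# Invented table names, for illustration only.
single = resolve_table_name(False, [])
chosen = resolve_table_name(True, ["genes", "proteins"], "genes")
```

Failing early with a clear message is a design choice made for this sketch; the library may handle a missing or unknown table name differently.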