utils

Utility functions concerning data.

Classes

DatabaseConnection

class DatabaseConnection(con: Union[str, sqlalchemy.engine.base.Engine], db_schema: Optional[str] = None, query: Optional[str] = None, table_names: Optional[List[str]] = None):

Encapsulates database connection information for a BaseSource.

If a query is provided, or if table_names contains only one table, the database is queried for the data, the connection is then closed, and the resulting DataFrame is stored in and used by the BaseSource.

Arguments

  • con: A database URI provided as a string or a SQLAlchemy Engine. This should include the database name, user, password, host, port, etc.
  • db_schema: The database schema to use. If not provided, the default schema will be used.
  • query: The SQL query to be executed as a string.
  • table_names: Name(s) of the SQL table(s) in the database to use.

Attributes

  • multi_table: Whether or not the database connection is for multiple tables.
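The rule described above for when a connection counts as multi-table can be sketched as follows. This is a minimal illustration, not the library's code: is_multi_table is a hypothetical helper, and treating "no query and no table_names" as multi-table (all tables in the schema) is an assumption inferred from the description.

```python
from typing import List, Optional


def is_multi_table(query: Optional[str], table_names: Optional[List[str]]) -> bool:
    """Hypothetical helper mirroring the documented multi_table behaviour."""
    # A query always produces a single result set, so it is single-table.
    if query is not None:
        return False
    # Exactly one named table is also loaded directly as a single DataFrame.
    if table_names is not None and len(table_names) == 1:
        return False
    # Assumption: several tables, or none specified, means multi-table.
    return True
```

Under these assumptions, is_multi_table(None, ["patients", "visits"]) is True, while a call with a query or a single table name is False.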

Raises

  • DatabaseMissingTableError: If db_schema (or the default schema, if db_schema is not provided) does not contain any tables, or if any of the specified tables can't be found in the schema.
  • DatabaseSchemaNotFoundError: If db_schema is provided but can't be found in the database.
  • DatabaseModificationError: If query is provided and contains an 'INTO' clause.
  • ValueError: If both query and table_names are provided.
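The two argument checks that need no database access (the ValueError and the 'INTO' check) could look roughly like the sketch below. Everything here is an assumption made for illustration: check_arguments is a hypothetical function, the exception class is a local stand-in rather than the library's own, and the library may detect 'INTO' differently from the whole-word, case-insensitive match used here.

```python
import re
from typing import List, Optional


class DatabaseModificationError(Exception):
    """Local stand-in for the library's exception of the same name."""


# Assumption: 'INTO' is matched as a whole word, ignoring case.
_INTO_PATTERN = re.compile(r"\bINTO\b", re.IGNORECASE)


def check_arguments(
    query: Optional[str] = None,
    table_names: Optional[List[str]] = None,
) -> None:
    """Hypothetical pre-flight checks mirroring the documented errors."""
    if query is not None and table_names is not None:
        raise ValueError("Provide either query or table_names, not both.")
    if query is not None and _INTO_PATTERN.search(query):
        raise DatabaseModificationError("Queries containing 'INTO' are rejected.")
```

A plain SELECT passes silently, whereas "SELECT * INTO backup FROM patients" raises DatabaseModificationError in this sketch.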

Danger

If you are creating a multi-table Pod, ensure that the connection you provide has access only to the schemas and tables you wish to share, and that this access is suitably restricted, i.e. SELECT only.

table_names limits the Pod schema to the tables you specify, but it does not prevent a Modeller from accessing other tables in the schema, or tables in other schemas, by guessing their names.

If only a single table is provided, or a query is provided that combines multiple tables into one, the Modeller has no access to the database.
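To make "SELECT only" concrete, the sketch below uses SQLite from the Python standard library as a stand-in. It is an analogy only: a server database would normally enforce this through the permissions of the connecting user, which is not shown, and demo_read_only and the patients table are hypothetical names. The sketch opens a database file in read-only mode and shows that reads succeed while writes are rejected. It assumes a POSIX-style file path.

```python
import os
import sqlite3
import tempfile


def demo_read_only():
    """Create a small SQLite file, then reopen it read-only."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # Set up the data with a normal read-write connection.
    rw = sqlite3.connect(path)
    rw.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, age INTEGER)")
    rw.execute("INSERT INTO patients (age) VALUES (42)")
    rw.commit()
    rw.close()

    # Reopen using SQLite's URI syntax with mode=ro (read-only).
    ro = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    try:
        rows = ro.execute("SELECT age FROM patients").fetchall()
        try:
            ro.execute("INSERT INTO patients (age) VALUES (7)")
            write_blocked = False
        except sqlite3.OperationalError:
            # SQLite refuses writes on a read-only connection.
            write_blocked = True
    finally:
        ro.close()
    return rows, write_blocked
```

Note that a read-only connection addresses only modification. It does not limit which tables can be read, so the connection must still be restricted to the schemas and tables you intend to share.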