utils
Utility functions concerning data.
Classes
DatabaseConnection
class DatabaseConnection(con: Union[str, sqlalchemy.engine.base.Engine], db_schema: Optional[str] = None, query: Optional[str] = None, table_names: Optional[List[str]] = None):
Encapsulates database connection information for a BaseSource.
If a query is provided, or if table_names contains only one table, the database is queried for the data, after which the database connection is closed and the resulting DataFrame is used and stored in the BaseSource.
Arguments
con: A database URI provided as a string, or a SQLAlchemy Engine. This should include the database name, user, password, host, port, etc.
db_schema: The database schema to use. If not provided, the default schema will be used.
query: The SQL query to be executed, as a string.
table_names: Name(s) of SQL table(s) in the database.
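As a rough illustration of the con argument, a database URI of the general form accepted by SQLAlchemy can be assembled from its parts. The helper name build_db_uri and all connection details below are hypothetical and not part of this module; the sketch uses only the standard library and percent-encodes the credentials so that special characters do not break the URI.

```python
from urllib.parse import quote_plus


def build_db_uri(driver, user, password, host, port, database):
    """Assemble a URI of the form driver://user:password@host:port/database."""
    # Percent-encode the credentials so characters such as '@' or '/' are safe.
    return (
        f"{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical connection details, for illustration only.
uri = build_db_uri("postgresql", "pod_reader", "p@ss/word", "localhost", 5432, "clinic")
print(uri)
```

The resulting string could then be passed as con, assuming the matching database driver is installed.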
Attributes
multi_table: Whether the database connection is for multiple tables.
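The rule described above (a query, or a single table, yields one DataFrame) could be sketched as follows. This is an assumption about how multi_table might be derived, not the library's actual implementation; the function name is hypothetical, and table_names here is assumed to be the already-resolved list of tables, whether supplied by the caller or discovered in the schema.

```python
from typing import Optional, Sequence


def is_multi_table(query: Optional[str], table_names: Sequence[str]) -> bool:
    """Return True when the connection would expose more than one table."""
    # Assumption: a query always collapses the data into a single table.
    if query is not None:
        return False
    # Assumption: more than one resolved table means a multi-table connection.
    return len(table_names) > 1
```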
Raises
DatabaseMissingTableError: If schema (or the default schema if not provided) does not contain any tables, or if any of the specified tables can't be found in the schema.
DatabaseSchemaNotFoundError: If schema is provided but can't be found in the database.
DatabaseModificationError: If query is provided and contains an 'INTO' clause.
ValueError: If both query and table_names are provided.
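The two argument checks that need no database access (the ValueError and the 'INTO' check) might look roughly like the sketch below. The exception class is a local stand-in for the library's exception of the same name, the function name is hypothetical, and the case-insensitive whole-word match is an assumption about how the 'INTO' clause could be detected.

```python
import re
from typing import List, Optional


class DatabaseModificationError(Exception):
    """Local stand-in for the library's exception of the same name."""


def validate_connection_args(
    query: Optional[str], table_names: Optional[List[str]]
) -> None:
    """Raise if the arguments conflict or the query could modify the database."""
    if query is not None and table_names is not None:
        raise ValueError("Provide either query or table_names, not both.")
    # Assumption: INTO is detected as a whole word, ignoring case.
    if query is not None and re.search(r"\bINTO\b", query, flags=re.IGNORECASE):
        raise DatabaseModificationError("Queries with an INTO clause are not allowed.")
```

A plain SELECT passes silently, while something like "SELECT * INTO backup FROM visits" would be rejected under these assumptions.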
danger
If you are creating a multi-table Pod, ensure that the connection you provide only has access to the schemas and tables you wish to share, and that this access has suitably restricted permissions, i.e. SELECT only.
table_names limits the Pod schema to only those tables you specify, but it does not prevent a Modeller from accessing other tables in the schema, or indeed tables in other schemas, by guessing their names.
If only a single table is provided or a query is provided to combine multiple tables into one table, the Modeller will have no access to the database.