Open
Description
The get_table_names
and get_view_names
methods are supposed to return the bare names (no {schema}.
prefix) of the resources for a single schema/dataset (where schema=None
is the "default schema"). However, the BigQueryDialect
implementation returns:
- bare names for a specific
schema
: if the connection has a default dataset {schema}.
prefixed names for one schema: if the connection doesn't have a default dataset and theschema
arg is a string{schema}.
prefixed names for all schemas: if the connection doesn't have a default dataset and theschema
arg isNone
The bolded behaviors, which arise when the connection doesn't have a default dataset, are incorrect and trigger some edge cases:
- permission errors when access is limited (1) because all datasets are scanned
- duplicate
schema.
prefixes / incorrectNoSuchTableError
s (1, 2) because SQLAlchemy adds the prefix when appropriate alembic revision --autogenerate
wipes and recreates all tablesinclude_schemas=False
(default) should manage a single schema but the BigQueryDialect code would:- scan the current schema and incorrectly prefix the
schema.
resulting in("schema", "schema.table")
tuples that don't match the models represented as("schema", "table")
tuples. This causes alembic to try to delete and recreate all of our models. - scan all other schemas. Alembic doesn't see any models defined for them so tries to delete all tables in other schemas.
- scan the current schema and incorrectly prefix the
include_schemas=True
manages all schemas, but:- alembic looks up all existing schemas and then does the reflection for
schema=None
and all discovered schemas. Since theBigQueryDialect
would return prefixed names for all schemas, it would reflect table tuples like(None, "schema1.table"), (None, "schema2.table"), ("schema1", "schema1.table"), ("schema2", "schema2.table")
. TheNone
records will seem new and emitCREATE TABLE
s while the double prefixed ones later error withNoSuchTableError
when looking up their metadata.
- alembic looks up all existing schemas and then does the reflection for
Environment details
- OS type and version: macOS Sonoma 14.5; M1
- Python version: 3.12.3
- pip version: 24.0
sqlalchemy-bigquery
version: 1.11.0
Steps to reproduce
- Run the script below
- At the breakpoint, notice that
Base.metadata.tables
has tables from all datasets - Notice that the script errors with
NoSuchTableError
even though the table exists
Code example
from os import getenv
from sqlalchemy import create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
dataset_1, dataset_2 = "tmp1", "tmp2" # bq query "create schema tmp1; create schema tmp2;"
project = getenv("GOOGLE_CLOUD_PROJECT")
engine = create_engine(f"bigquery://{project}")
class Base(DeclarativeBase):
pass
class User(Base):
__tablename__ = "user"
__table_args__ = {"schema": dataset_1}
id: Mapped[int] = mapped_column(primary_key=True)
name: Mapped[str]
fullname: Mapped[str]
nickname: Mapped[str | None]
class Org(Base):
__tablename__ = "org"
__table_args__ = {"schema": dataset_2}
id: Mapped[int] = mapped_column(primary_key=True)
name: Mapped[str]
Base.metadata.create_all(engine)
Base.metadata.clear()
Base.metadata.reflect(engine)
# NOTE: Base.metadata.tables *should* be empty (no schema nor a connection default), but is actually populated from all datasets
breakpoint()
# NOTE: This will confusingly raise `sqlalchemy.exc.NoSuchTableError: {schema}.{table}` - it's actually
# looking for `{schema}.{table}` _within_ `schema`.
Base.metadata.reflect(engine, schema=dataset_1)
Stack trace
$ python3 x.py
(Pdb) len(Base.metadata.tables)
15
(Pdb) c
Traceback (most recent call last):
File "/.../x.py", line 51, in <module>
Base.metadata.reflect(engine, schema=dataset_1)
File "/.../.venv/lib/python3.12/site-packages/sqlalchemy/sql/schema.py", line 5843, in reflect
Table(name, self, **reflect_opts)
File "<string>", line 2, in __new__
File "/.../.venv/lib/python3.12/site-packages/sqlalchemy/util/deprecations.py", line 281, in warned
return fn(*args, **kwargs) # type: ignore[no-any-return]
^^^^^^^^^^^^^^^^^^^
File "/.../.venv/lib/python3.12/site-packages/sqlalchemy/sql/schema.py", line 432, in __new__
return cls._new(*args, **kw)
^^^^^^^^^^^^^^^^^^^^^
File "/.../.venv/lib/python3.12/site-packages/sqlalchemy/sql/schema.py", line 486, in _new
with util.safe_reraise():
File "/.../.venv/lib/python3.12/site-packages/sqlalchemy/util/langhelpers.py", line 146, in __exit__
raise exc_value.with_traceback(exc_tb)
File "/.../.venv/lib/python3.12/site-packages/sqlalchemy/sql/schema.py", line 482, in _new
table.__init__(name, metadata, *args, _no_init=False, **kw)
File "/.../.venv/lib/python3.12/site-packages/sqlalchemy/sql/schema.py", line 862, in __init__
self._autoload(
File "/.../.venv/lib/python3.12/site-packages/sqlalchemy/sql/schema.py", line 894, in _autoload
conn_insp.reflect_table(
File "/[...]/.venv/lib/python3.12/site-packages/sqlalchemy/engine/reflection.py", line 1538, in reflect_table
raise exc.NoSuchTableError(table_name)
sqlalchemy.exc.NoSuchTableError: tmp1.user
Metadata
Metadata
Assignees
Labels
Issues related to the googleapis/python-bigquery-sqlalchemy API.Issues related to the googleapis/python-bigquery-sqlalchemy API.