catch pyarrow.lib.ArrowTypeError for augment_schema #2129

Open
@j-blackwell

Description


Is your feature request related to a problem? Please describe.
When augment_schema fails, the error does not name the problematic field, unlike similar errors raised elsewhere in the library. The unhelpful error then surfaces in callers such as load_table_from_dataframe, which reaches augment_schema through dataframe_to_bq_schema.

pyarrow.lib.ArrowTypeError: Expected bytes, got a 'float' object

  File "...", line 57, in load_table_bq
    job = client.load_table_from_dataframe(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../.venv/lib/python3.12/site-packages/google/cloud/bigquery/client.py", line 2781, in load_table_from_dataframe
    new_job_config.schema = _pandas_helpers.dataframe_to_bq_schema(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../.venv/lib/python3.12/site-packages/google/cloud/bigquery/_pandas_helpers.py", line 491, in dataframe_to_bq_schema
    bq_schema_out = augment_schema(dataframe, bq_schema_out)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../.venv/lib/python3.12/site-packages/google/cloud/bigquery/_pandas_helpers.py", line 520, in augment_schema
    arrow_table = pyarrow.array(dataframe.reset_index()[field.name])
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow/array.pxi", line 360, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 87, in pyarrow.lib._ndarray_to_array
  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
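
Until the error names the field, one stopgap is to convert each column on its own and record which ones fail. The sketch below is only an illustration: find_unconvertible_columns and strict_str_column are hypothetical names, not part of google-cloud-bigquery or pyarrow, and the stand-in converter exists only so the example runs without third-party packages.

```python
from typing import Any, Callable, Dict, Iterable, Mapping


def find_unconvertible_columns(
    columns: Mapping[str, Iterable[Any]],
    convert: Callable[[Any], Any],
) -> Dict[str, str]:
    """Return {column name: error message} for every column `convert` rejects."""
    failures = {}
    for name, values in columns.items():
        try:
            convert(values)
        except (TypeError, ValueError) as exc:
            # pyarrow.lib.ArrowTypeError subclasses TypeError and
            # pyarrow.lib.ArrowInvalid subclasses ValueError, so both land here.
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


def strict_str_column(values):
    """Stand-in converter (assumption, for illustration): accepts only str values."""
    for value in values:
        if not isinstance(value, str):
            raise TypeError(f"Expected str, got a {type(value).__name__!r} object")
    return list(values)


# Hypothetical data: "price" mixes str and float, as an object-dtype column might.
columns = {"name": ["a", "b"], "price": ["1.0", 2.5]}
failures = find_unconvertible_columns(columns, strict_str_column)
print(failures)
```

With pandas and pyarrow installed, the same helper should work as find_unconvertible_columns(dataframe.reset_index(), pyarrow.array), since DataFrame.items() yields (column name, Series) pairs.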

Describe the solution you'd like
Fix could be similar to #1836

def augment_schema(dataframe, current_bq_schema):
    ...
    for field in current_bq_schema:
        if field.field_type is not None:
            augmented_schema.append(field)
            continue
        # Look the column up first so its pandas dtype is available for the message.
        series = dataframe.reset_index()[field.name]
        try:
            arrow_table = pyarrow.array(series)
        except pyarrow.lib.ArrowTypeError as exc:
            # Name the offending column, then re-raise with the original error chained.
            msg = f"""Error converting Pandas column with name: "{field.name}" and datatype: "{series.dtype}" to an appropriate pyarrow datatype: ..."""
            _LOGGER.error(msg)
            raise pyarrow.lib.ArrowTypeError(msg) from exc

Happy to submit a PR if this approach would be approved.

Labels

api: bigquery (Issues related to the googleapis/python-bigquery API.)
