WriteApi.write does not support pandas' nullable integer #590

Specifications

Client Version: 1.36.1
InfluxDB Version: 2.7.0
Platform: Mac

If you have a dataframe with Pandas' nullable integer as one of the column datatypes, and a row includes a pd.NA value, you get the following traceback:

Traceback (most recent call last):
    write_api.write(
  File "venv/lib/python3.9/site-packages/influxdb_client/client/write_api.py", line 366, in write
    return self._write_batching(bucket, org, record,
  File "venv/lib/python3.9/site-packages/influxdb_client/client/write_api.py", line 469, in _write_batching
    serializer.serialize(chunk_idx),
  File "venv/lib/python3.9/site-packages/influxdb_client/client/write/dataframe_serializer.py", line 270, in serialize
    return list(lp)
  File "venv/lib/python3.9/site-packages/influxdb_client/client/write/dataframe_serializer.py", line 268, in <genexpr>
    lp = (re.sub('^(( |[^ ])* ),([a-zA-Z0-9])(.*)', '\\1\\3\\4', self.f(p))
  File "venv/lib/python3.9/site-packages/influxdb_client/client/write/dataframe_serializer.py", line 269, in <lambda>
    for p in filter(lambda x: _any_not_nan(x, self.field_indexes), _itertuples(chunk)))
  File "venv/lib/python3.9/site-packages/influxdb_client/client/write/dataframe_serializer.py", line 27, in _any_not_nan
    return any(map(lambda x: _not_nan(p[x]), indexes))
  File "pandas/_libs/missing.pyx", line 388, in pandas._libs.missing.NAType.__bool__
TypeError: boolean value of NA is ambiguous

However, if you change the column datatype to a float (which has a native NaN encoding), it works.

Code sample to reproduce problem

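import pandas as pd

# get_client() and BUCKET are my own helper and bucket name (not shown here).
# "x" uses the nullable Int64 dtype, so the missing value is pd.NA rather than NaN.
df = pd.DataFrame({"x": [1, pd.NA], "time": [0, 1]}).astype({"x": "Int64"})
with get_client() as client:
    with client.write_api() as write_api:
        write_api.write(BUCKET, record=df, data_frame_measurement_name="test", data_frame_timestamp_column="time")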

Expected behavior

I would anticipate that this behaves the same as if it were a float. My current workaround is to use floats.

If the code is too complicated to fix/would incur significant slowdown for other users, I think at minimum, raising a cleaner exception would be reasonable.
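For anyone hitting the same thing, the float workaround mentioned above can be sketched roughly like this. The helper name nullable_ints_to_float is made up for illustration and is not part of influxdb_client; it only uses public pandas calls and converts every nullable-integer column to float64 so that pd.NA becomes NaN before the frame is passed to write_api.write.

```python
import pandas as pd
from pandas.api.types import is_extension_array_dtype, is_integer_dtype


def nullable_ints_to_float(df):
    """Hypothetical helper: return a copy with nullable-integer columns as float64."""
    out = df.copy()
    for col in out.columns:
        dtype = out[col].dtype
        # Nullable integers (e.g. "Int64") are extension dtypes that are also integer dtypes.
        if is_extension_array_dtype(dtype) and is_integer_dtype(dtype):
            # pd.NA is converted to NaN by this cast.
            out[col] = out[col].astype("float64")
    return out


df = pd.DataFrame({"x": [1, pd.NA], "time": [0, 1]}).astype({"x": "Int64"})
converted = nullable_ints_to_float(df)
print(converted.dtypes)
```

Note that float64 cannot represent every integer above 2**53 exactly, so this is only a stopgap for columns holding smaller values.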

Actual behavior

I get the same exception as in the traceback at the top of this issue (TypeError: boolean value of NA is ambiguous).

Additional info

My knee-jerk reaction: I saw that in client/write/dataframe_serializer.py there is a function:

def _not_nan(x):
    return x == x

which I think can just be

def _not_nan(x):
    from ...extras import pd
    # pd.isna is True for NaN, None and pd.NA, so negate it to mean "not NaN"
    return not pd.isna(x)
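To make the difference concrete, here is a small standalone comparison of the two checks. The function names not_nan_eq and not_nan_isna are placeholders for this sketch, not names from the library. Note the negation in the second one: pd.isna returns True for missing values, so a "not NaN" check has to invert it.

```python
import pandas as pd


def not_nan_eq(x):
    # Same idea as the existing check: NaN is the only float not equal to itself.
    return x == x


def not_nan_isna(x):
    # pd.isna handles NaN, None and pd.NA and returns a plain boolean for scalars.
    return not pd.isna(x)


for value in (1, float("nan"), pd.NA):
    try:
        eq_result = bool(not_nan_eq(value))
    except TypeError as exc:
        # pd.NA == pd.NA is pd.NA, and converting that to bool raises.
        eq_result = f"TypeError: {exc}"
    print(repr(value), "| x == x:", eq_result, "| not pd.isna(x):", not_nan_isna(value))
```

The x == x version works for float NaN but yields pd.NA for pd.NA, and that is what any(...) then fails to convert to a boolean.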

However, I saw this block of code:

                if null_columns[index]:
                    key_value = f"""{{
                            '' if {val_format} == '' or type({val_format}) == float and math.isnan({val_format}) else
                            f',{key_format}={{str({val_format}).translate(_ESCAPE_STRING)}}'
                        }}"""

which looks pretty crazy, and I am not sure what the data would look like at that point.
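As a rough probe of that condition in isolation, the sketch below copies just the boolean test from the generated code into a hypothetical function, is_skipped, and feeds it a few scalars. This is an assumption-laden stand-in: it does not reproduce how the serializer builds or evaluates its f-string, only what the test itself does with each value.

```python
import math

import pandas as pd


def is_skipped(val):
    """Hypothetical stand-in for the 'skip this key' test in the generated code."""
    return val == '' or type(val) == float and math.isnan(val)


def describe(val):
    # Report either the boolean outcome or the error raised while evaluating it.
    try:
        return bool(is_skipped(val))
    except TypeError as exc:
        return f"TypeError: {exc}"


for value in ("", float("nan"), 1, pd.NA):
    print(repr(value), "->", describe(value))
```

An empty string and a float NaN are both treated as skippable, but pd.NA never is: it is not a float, and on the pandas versions I know of, comparing it with '' gives pd.NA, so the or itself should raise the same ambiguous-boolean TypeError. So this block would probably need its own pd.NA handling too.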
