BUG: pd.to_datetime failing to parse with exception error 01-Jun-2025 in sequence with 31-May-2025 #61395

Closed
@johndrummond

Description

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

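import pandas as pd
import sys

# Report the environment the bug was reproduced in.
print(f"Pandas version: {pd.__version__}")
print(f"Python version: {sys.version}")

# Three day-month-year strings with abbreviated month names.
df = pd.DataFrame({'day': ["31-May-2025","01-Jun-2025","02-Jun-2025"]})
# Raises ValueError at position 1 ("01-Jun-2025") on pandas 2.2.3.
pd.to_datetime(df['day'])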

Issue Description

Running the example prints the versions below and then raises a ValueError:
'Pandas version: 2.2.3'
'Python version: 3.11.11 (main, Dec 4 2024, 08:55:07) [GCC 11.4.0]'

ValueError: time data "01-Jun-2025" doesn't match format "%d-%B-%Y", at position 1. You might want to try:
- passing format if your strings have a consistent format;
- passing format='ISO8601' if your strings are all ISO8601 but not necessarily in exactly the same format;
- passing format='mixed', and the format will be inferred for each element individually. You might want to use dayfirst alongside this.
File , line 2
1 df = pd.DataFrame({'day': ["31-May-2025","01-Jun-2025","02-Jun-2025"]})
----> 2 pd.to_datetime(df['day'])
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/pandas/core/tools/datetimes.py:1067, in to_datetime(arg, errors, dayfirst, yearfirst, utc, format, exact, unit, infer_datetime_format, origin, cache)
1065 result = arg.map(cache_array)
1066 else:
-> 1067 values = convert_listlike(arg._values, format)
1068 result = arg._constructor(values, index=arg.index, name=arg.name)
1069 elif isinstance(arg, (ABCDataFrame, abc.MutableMapping)):
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/pandas/core/tools/datetimes.py:433, in _convert_listlike_datetimes(arg, format, name, utc, unit, errors, dayfirst, yearfirst, exact)
431 # format could be inferred, or user didn't ask for mixed-format parsing.
432 if format is not None and format != "mixed":
--> 433 return _array_strptime_with_fallback(arg, name, utc, format, exact, errors)
435 result, tz_parsed = objects_to_datetime64(
436 arg,
437 dayfirst=dayfirst,
(...)
441 allow_object=True,
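
The first hint in the error message, passing an explicit format, can be sketched as follows. This is only a possible workaround under the assumption that every string uses an abbreviated English month name (`%b`); it does not address the inference behaviour itself. The variable names `days` and `parsed` are illustrative.

```python
import pandas as pd

# Same three strings as in the reproducible example.
days = pd.Series(["31-May-2025", "01-Jun-2025", "02-Jun-2025"])

# Assumption: all values use a three-letter month abbreviation, so %b is
# stated explicitly instead of letting pandas infer the format from the
# first element.
parsed = pd.to_datetime(days, format="%d-%b-%Y")

print(parsed.dt.strftime("%Y-%m-%d").tolist())
```

Stating the format avoids relying on whatever is inferred from the first element, at the cost of failing on any value that does not follow `%d-%b-%Y`.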

Expected Behavior

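The Series should parse correctly without raising an exception.
Interestingly, the failure only appears at the transition from the end of May to the start of June. A list starting with 01-Jun-2025 works, and a list ending with 31-May-2025 works.
dateparser.parse handles these strings without complaint.
My guess is that the format inference reads "May" as a full month name (%B, as shown in the error message), when here it is a three-character abbreviation (%b).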
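
The ambiguity behind that guess can be illustrated with the standard library alone. The sketch below uses `datetime.strptime`, not pandas' own inference code, and assumes the default C/English locale; the helper `matches` is a hypothetical name introduced here for illustration.

```python
from datetime import datetime


def matches(text: str, fmt: str) -> bool:
    """Return True if `text` parses under the strptime format `fmt`."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


# "May" is both the full name and the abbreviation, so it fits either directive.
may_full = matches("31-May-2025", "%d-%B-%Y")
may_abbr = matches("31-May-2025", "%d-%b-%Y")

# "Jun" is only an abbreviation, so it does not fit the full-name directive.
jun_full = matches("01-Jun-2025", "%d-%B-%Y")
jun_abbr = matches("01-Jun-2025", "%d-%b-%Y")

print(may_full, may_abbr, jun_full, jun_abbr)
```

If a format is guessed from "31-May-2025" alone, both `%B` and `%b` are consistent with it, which would fit the `%d-%B-%Y` format reported in the error message.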

Installed Versions

Running in a Databricks notebook. Also checked locally in a separate Python installation with pandas 2.2.1.

In the notebook:
Pandas version: 2.2.3
Python version: 3.11.11 (main, Dec 4 2024, 08:55:07) [GCC 11.4.0]
pd.show_versions() doesn't return anything there.

Locally:
Pandas version: 2.2.1
Python version: 3.12.2 (main, Mar 25 2024, 11:48:28) [Clang 15.0.0 (clang-1500.3.9.4)]

Locally, pd.show_versions() raises:

FileNotFoundError Traceback (most recent call last)
File /Users/J.Drummond/Documents/wip/python/truth_soc_1.py:2
1 # %%
----> 2 pd.show_versions()

File ~/Documents/wip/python/.venv/lib/python3.12/site-packages/pandas/util/_print_versions.py:141, in show_versions(as_json)
104 """
105 Provide useful information, important for bug reports.
106
(...)
138 ...
139 """
140 sys_info = _get_sys_info()
--> 141 deps = _get_dependency_info()
143 if as_json:
144 j = {"system": sys_info, "dependencies": deps}

File ~/Documents/wip/python/.venv/lib/python3.12/site-packages/pandas/util/_print_versions.py:98, in _get_dependency_info()
96 result: dict[str, JSONSerializable] = {}
97 for modname in deps:
---> 98 mod = import_optional_dependency(modname, errors="ignore")
99 result[modname] = get_version(mod) if mod else None
100 return result
...
