BUG: fix incorrect bytes to stringdtype coercion #28282

charris · Feb 5, 2025

Backport of #28276.

It turns out test_scalars_string_conversion was testing the old buggy conversion 🙃.

Is it problematic to assume the bytes are UTF-8? Before, we were doing something completely nonsensical, so we're free to make a choice here. I think the built-in NumPy bytes dtype assumes everything is ASCII, which is probably less useful than letting people pass in arbitrary UTF-8.
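To make the ASCII versus UTF-8 trade-off concrete, here is a minimal pure-Python sketch. It does not call NumPy and is not the code from this PR; `try_decode` is a hypothetical helper introduced only for illustration. It shows which byte strings each assumption can accept.

```python
from typing import Optional


def try_decode(raw: bytes, encoding: str) -> Optional[str]:
    """Hypothetical helper: decode raw, or return None if it is invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


ascii_only = b"abc"
non_ascii = "café".encode("utf-8")  # contains a two-byte UTF-8 sequence

# Pure-ASCII input decodes identically under both assumptions.
ascii_as_ascii = try_decode(ascii_only, "ascii")
ascii_as_utf8 = try_decode(ascii_only, "utf-8")

# Non-ASCII input is rejected under an ASCII assumption but accepted as UTF-8.
non_ascii_as_ascii = try_decode(non_ascii, "ascii")
non_ascii_as_utf8 = try_decode(non_ascii, "utf-8")
```

Because ASCII is a strict subset of UTF-8, assuming UTF-8 accepts everything the ASCII assumption does and additionally handles non-ASCII text. Bytes that are not valid UTF-8 would still need an error path.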

We could probably also do this faster without going through the Python C API, but that can be a future pass if anyone notices.

BUG: fix incorrect bytes to stringdtype coercion

c455112

charris added the 00 - Bug, 08 - Backport, and component: numpy.strings labels Feb 5, 2025

charris added this to the 2.2.3 release milestone Feb 5, 2025

charris added and removed the 08 - Backport label Feb 5, 2025

charris merged commit 2cc5acf into numpy:maintenance/2.2.x Feb 5, 2025
68 checks passed

charris deleted the backport-28276 branch February 5, 2025 22:18
