"z" format specifier is treated differently in unicode and bytes

Hello up there. I've hit a discrepancy in how the "z" flag is handled by % in unicode and bytes:

for unicode, % rejects it as an "unsupported format character", in line with the original discussion in "string formatting: normalize negative zero" #90153 (= BPO-45995);
however, for bytes, % fully handles "z":

kirr@deca:~$ python3
Python 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> '%zf' % 1                                       <--   unicode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unsupported format character 'z' (0x7a) at index 1

>>> b'%zf' % 1                                      <--   bytes
b'1.000000'

>>> b'%zf' % 0.0                                    <--   +0 -> 0
b'0.000000'

>>> b'%zf' % -0.0                                   <--   -0 -> 0
b'0.000000'

>>> b'%f' % -0.0                                    <--   -0 -> -0 if run without 'z'
b'-0.000000'

In other words, there is an inconsistency in how 'z' is handled by '%' for unicode and bytes. There is also an inconsistency with the original discussion on BPO-45995, where 'z' was supposed to be handled by .format and not by '%'.
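To make the comparison easy to repeat on other interpreters, here is a minimal sketch that runs the same checks as the session above and reports either the result or the error. The helper names `probe` and `probe_format_spec` are hypothetical and exist only for this illustration; nothing here is CPython API beyond `%` and `format()`.

```python
def probe(fmt, value):
    """Return repr of `fmt % value`, or the error text if the format is rejected."""
    try:
        return repr(fmt % value)
    except ValueError as exc:
        return f"ValueError: {exc}"


def probe_format_spec(spec, value):
    """Same idea for format(); 'z' in a format spec needs Python 3.11+."""
    try:
        return repr(format(value, spec))
    except ValueError as exc:
        return f"ValueError: {exc}"


if __name__ == "__main__":
    # str and bytes templates, with and without the 'z' flag
    for fmt in ("%zf", b"%zf", "%f", b"%f"):
        print(f"{fmt!r:>8} % -0.0  ->  {probe(fmt, -0.0)}")
    # the .format()/format() side, where 'z' was intended to be supported
    print(f"format(-0.0, 'zf')  ->  {probe_format_spec('zf', -0.0)}")
```

On 3.11.2 the `b'%zf'` line should match the session above; other versions may well report something different, which is the point of running it.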

'z' handling was implemented in #30049 and indeed there I see b'%z' being fully handled:

b0b836b20cb5#diff-f6d440aad34e1c4535c0d898c0197a95490766c745991caace6f64b5dd1ece51

but u'%z' being only partly handled internally without corresponding frontend parsing that bytes has:

b0b836b20cb5#diff-34c966e7876d6f8bf801dd51896327e4f68bba02cddb95fbf3963f0b2e39c38a

In my view the fix should be either a) to add '%z' handling to unicode, or b) to remove '%z' handling from bytes.

Thanks beforehand,
Kirill

CPython versions tested on: 3.11.2
Operating system and architecture: Debian GNU/Linux 12 on AMD64

/cc @belm0, @mdickinson

"z" format specifier is treated differently in unicode and bytes #104018
