Description
Hello up there. I've hit a discrepancy in how the 'z' flag is handled by '%' in unicode and bytes:
- for unicode, '%' rejects it as "unsupported format character", in accordance with the original discussion in "string formatting: normalize negative zero" #90153 (= BPO-45995),
- however, for bytes, '%' fully handles 'z':
kirr@deca:~$ python3
Python 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> '%zf' % 1 <-- unicode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unsupported format character 'z' (0x7a) at index 1
>>> b'%zf' % 1 <-- bytes
b'1.000000'
>>> b'%zf' % 0.0 <-- +0 -> 0
b'0.000000'
>>> b'%zf' % -0.0 <-- -0 -> 0
b'0.000000'
>>> b'%f' % -0.0 <-- -0 -> -0 if run without 'z'
b'-0.000000'
In other words, 'z' is handled inconsistently by '%' for unicode and bytes, and the bytes behavior also contradicts what was originally discussed on BPO-45995, where 'z' was supposed to be handled by .format and not by '%'.
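For reference, the same checks can be run as a script instead of an interactive session. This is only a minimal sketch: `percent_accepts` is a hypothetical helper name introduced here for illustration, and the bytes result depends on the interpreter (b'0.000000' on 3.11.2 as shown above; builds where bytes rejects 'z' should give None instead). The `format()` check assumes the 'z' option is only available from Python 3.11 on.

```python
import sys


def percent_accepts(template, value=-0.0):
    """Return template % value, or None if '%' rejects the template.

    Hypothetical helper for illustration only; works for str and bytes.
    """
    try:
        return template % value
    except ValueError:
        # Raised for "unsupported format character".
        return None


# unicode: 'z' is rejected by '%'
str_result = percent_accepts('%zf')

# bytes: result depends on the interpreter version (see the note above)
bytes_result = percent_accepts(b'%zf')

# Without 'z', both unicode and bytes keep the sign of negative zero.
str_plain = percent_accepts('%f')
bytes_plain = percent_accepts(b'%f')

# 'z' in format() is assumed to exist only on Python 3.11 and later.
if sys.version_info >= (3, 11):
    normalized = format(-0.0, 'zf')
else:
    normalized = None

print("str   '%zf':", str_result)
print("bytes '%zf':", bytes_result)
print("str   '%f' :", str_plain)
print("bytes '%f' :", bytes_plain)
print("format 'zf':", normalized)
```

If the bytes line prints anything other than None, that interpreter still accepts 'z' in bytes %-formatting.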
'z' handling was implemented in #30049 and indeed there I see b'%z' being fully handled:
b0b836b20cb5#diff-f6d440aad34e1c4535c0d898c0197a95490766c745991caace6f64b5dd1ece51
but u'%z' is handled only partly: the internal support is there, without the corresponding frontend parsing that bytes has:
b0b836b20cb5#diff-34c966e7876d6f8bf801dd51896327e4f68bba02cddb95fbf3963f0b2e39c38a
In my view the fix should be either a) to add '%z' handling to unicode, or b) to remove '%z' handling from bytes.
Thanks beforehand,
Kirill
- CPython versions tested on: 3.11.2
- Operating system and architecture: Debian GNU/Linux 12 on AMD64
/cc @belm0, @mdickinson
Linked PRs
- gh-104018: disallow "z" format specifier in %-format of byte strings #104033
- [3.11] gh-104018: disallow "z" format specifier in %-format of byte strings (GH-104033) #104058
- gh-104018: remove unused format "z" handling in string formatfloat() #104107
- [3.11] gh-104018: remove unused format "z" handling in string formatfloat() (GH-104107) #104260