Closed
Bug report
Bug description:
I noticed that chaining struct.unpack() and struct.pack() for IEEE 754 half-precision floats (e) is non-invertible for nan. E.g.:
import struct
original_bytes = b'\xff\xff'
unpacked_float = struct.unpack('e', original_bytes)[0] # nan
repacked_bytes = struct.pack('e', unpacked_float) # b'\x00\xfe' != b'\xff\xff'
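To make the difference between the two byte strings concrete, here is a minimal sketch that splits a binary16 value into its sign, exponent, and significand fields. The helper name half_fields is hypothetical, and the sketch assumes the bytes above are in little-endian order (an assumption about the machine the snippet was run on).

```python
def half_fields(raw: bytes, byteorder: str = "little") -> tuple[int, int, int]:
    """Split a 2-byte IEEE 754 binary16 value into (sign, exponent, significand).

    Hypothetical helper for illustration; byteorder defaults to little-endian,
    which is an assumption about the platform used in the report.
    """
    bits = int.from_bytes(raw, byteorder)
    sign = bits >> 15                 # 1 sign bit
    exponent = (bits >> 10) & 0x1F    # 5 exponent bits
    significand = bits & 0x3FF        # 10 stored significand bits
    return sign, exponent, significand


def is_nan_pattern(raw: bytes, byteorder: str = "little") -> bool:
    """A binary16 pattern is a NaN when the exponent is all ones and the significand is nonzero."""
    _, exponent, significand = half_fields(raw, byteorder)
    return exponent == 0x1F and significand != 0


print(half_fields(b"\xff\xff"))  # (1, 31, 1023): original bytes
print(half_fields(b"\x00\xfe"))  # (1, 31, 512): repacked bytes
```

Both patterns have an all-ones exponent and a nonzero significand, so both encode a nan with the sign bit set. Only the payload bits of the significand differ.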
IEEE nans aren't unique, so this isn't that surprising... However, I found it curious that the same behavior is not exhibited for the float (f) or double (d) formats, where every original bit pattern I tested could be recovered from the unpacked nan object.
Is this by design?
Here's a quick pytest script that tests over a broad range of nan/inf/-inf cases for each encoding format.
# /// script
# requires-python = ">=3.11"
# dependencies = ["pytest"]
# ///
import struct
import pytest
# Floating Point Encodings Based on IEEE 754 per https://en.wikipedia.org/wiki/IEEE_754#Basic_and_interchange_formats
# binary 16 (half precision) - 1 bit sign, 5 bit exponent, 11 bit significand
# binary 32 (single precision) - 1 bit sign, 8 bit exponent, 23 bit significand
# binary 64 (double precision) - 1 bit sign, 11 bit exponent, 52 bit significand
MAX_TEST_CASES = 100000 # limit number of bit patterns being sampled so we aren't waiting too long
@pytest.mark.parametrize(["precision_format", "precision", "exponent_bits"], [("f", 32, 8), ("d", 64, 11), ("e", 16, 5)])
@pytest.mark.parametrize("sign_bit", [0, 1])
@pytest.mark.parametrize("endianness", ["little", "big"])
def test_struct_floats(precision_format: str, precision: int, exponent_bits: int, sign_bit: int, endianness: str):
    significand_bits = precision - exponent_bits - 1
    n_tests = min(MAX_TEST_CASES, 2**significand_bits)
    # First two patterns: all zeros (inf) and all ones (nan); the rest are sampled values.
    # Note: bin() output is not zero-padded, so sampled patterns may be shorter than significand_bits.
    significand_patterns = [significand_bits * "0", significand_bits * "1"] + [
        bin(i + 1)[2:] for i in range(1, 2**significand_bits, 2**significand_bits // n_tests)
    ]
    for i in range(n_tests):
        # Bit string layout: sign, all-ones exponent, significand pattern
        binary = str(sign_bit) + "1" * exponent_bits + significand_patterns[i]
        if endianness == "big":
            format = ">" + precision_format
        elif endianness == "little":
            format = "<" + precision_format
        else:
            raise NotImplementedError()
        test_bytes = int(binary, base=2).to_bytes(precision // 8, endianness)
        unpacked = struct.unpack(format, test_bytes)
        assert len(unpacked) == 1
        repacked = struct.pack(format, unpacked[0])
        assert (
            repacked == test_bytes
        ), f"struct pack/unpack was not invertible for format {format} with raw value: {test_bytes} -> unpacks to {unpacked[0]}, repacks to {repacked}"


if __name__ == "__main__":
    pytest.main([__file__])
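A side note on how the script builds its bit patterns: bin() does not zero-pad, so a string such as bin(2)[2:] is only two characters long, and the concatenated bit string can come out shorter than the full format width. Below is a sketch of one way to build fixed-width patterns with integer shifts instead. The function nan_bytes and its parameters are hypothetical names, not part of the script above.

```python
def nan_bytes(sign_bit: int, exponent_bits: int, significand_bits: int,
              payload: int, byteorder: str) -> bytes:
    """Build the bytes of an IEEE 754 value with an all-ones exponent.

    Hypothetical helper for illustration. payload fills the significand field:
    0 gives +/-inf, any nonzero value gives a nan.
    """
    if not 0 <= payload < (1 << significand_bits):
        raise ValueError("payload does not fit in the significand field")
    width = 1 + exponent_bits + significand_bits       # total bits: 16, 32 or 64
    exponent_mask = ((1 << exponent_bits) - 1) << significand_bits
    bits = (sign_bit << (width - 1)) | exponent_mask | payload
    return bits.to_bytes(width // 8, byteorder)


# Fixed-width construction: the exponent field stays in place for small payloads.
print(nan_bytes(0, 5, 10, 1, "big"))   # b'\x7c\x01'

# String concatenation without padding, for comparison:
unpadded = int("0" + "1" * 5 + bin(2)[2:], base=2).to_bytes(2, "big")
print(unpadded)                        # b'\x00~', i.e. 0x007e, exponent field is zero
```

With the shift-based construction, every generated pattern keeps the all-ones exponent, so each one is an inf (payload 0) or a nan (nonzero payload) regardless of how small the payload is.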
CPython versions tested on:
3.13, 3.11, 3.12
Operating systems tested on:
Linux, Windows
Linked PRs