np.array(..., dtype=np.float32).astype(np.float16) doesn't handle denorms properly

The smallest positive denorm (subnormal) value that can be held in a np.float16 is 2**-24. The value 2**-25 lies exactly halfway between 0 and 2**-24, so it rounds to 0 when converted to a np.float16 under the round-to-nearest, ties-to-even rule. Any value even slightly above 2**-25, however, is closer to 2**-24 and should be rounded up to it. The following interpreter sequence shows such a case: the float64->float16 conversion handles it properly, but the float32->float16 conversion does not (see the last line below).

$ python3
Python 3.5.2 (default, Nov 12 2018, 13:43:14) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.__version__
'1.14.3'
>>> '%.40f' % np.array([2**-24]).astype(np.float64)[0]
'0.0000000596046447753906250000000000000000'
>>> '%.40f' % np.array([2**-24]).astype(np.float32)[0]
'0.0000000596046447753906250000000000000000'
>>> '%.40f' % np.array([2**-24]).astype(np.float16)[0]
'0.0000000596046447753906250000000000000000'
>>> '%.40f' % np.array([2**-25]).astype(np.float64)[0]
'0.0000000298023223876953125000000000000000'
>>> '%.40f' % np.array([2**-25]).astype(np.float32)[0]
'0.0000000298023223876953125000000000000000'
>>> '%.40f' % np.array([2**-25]).astype(np.float16)[0]    # OK: exact tie, rounds to even (0)
'0.0000000000000000000000000000000000000000'
>>> '%.40f' % np.array([2**-25 + 2**-38]).astype(np.float64)[0]
'0.0000000298059603665024042129516601562500'
>>> '%.40f' % np.array([2**-25 + 2**-38]).astype(np.float32)[0]
'0.0000000298059603665024042129516601562500'
>>> '%.40f' % np.array([2**-25 + 2**-38]).astype(np.float64).astype(np.float16)[0]    # OK: 2**-24
'0.0000000596046447753906250000000000000000'
>>> '%.40f' % np.array([2**-25 + 2**-38]).astype(np.float32).astype(np.float16)[0]    # not OK: should be 2**-24
'0.0000000000000000000000000000000000000000'
>>>
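For reference, the expected results can be cross-checked without NumPy. The sketch below is only an illustration, not NumPy's conversion code: `expected_half_subnormal` and `via_struct_half` are hypothetical helper names, and it assumes Python 3.6+ (the `struct` `'e'` half-precision format is not available on the 3.5.2 interpreter used above). It rounds a value onto the float16 subnormal grid (multiples of 2**-24) with exact rational arithmetic and compares that against a binary64->binary16 conversion done by `struct`.

```python
import struct
from fractions import Fraction

# Spacing of float16 values below 2**-14 (the subnormal range).
HALF_SUBNORMAL_STEP = Fraction(1, 2**24)


def expected_half_subnormal(x):
    """Round x (|x| < 2**-14) to the float16 subnormal grid, ties to even."""
    if abs(x) >= 2.0**-14:
        raise ValueError("only the float16 subnormal range is handled here")
    # round() on a Fraction rounds half to even, matching the IEEE 754 default.
    steps = round(Fraction(x) / HALF_SUBNORMAL_STEP)
    return float(steps * HALF_SUBNORMAL_STEP)


def via_struct_half(x):
    """Convert a Python float (binary64) to binary16 and back via struct."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


if __name__ == "__main__":
    samples = (2.0**-24, 2.0**-25, 2.0**-25 + 2.0**-38, 3 * 2.0**-25)
    for value in samples:
        print("%.40f -> expected %.40f, struct %.40f" % (
            value, expected_half_subnormal(value), via_struct_half(value)))
```

The result of `np.array([x], dtype=np.float32).astype(np.float16)[0]` can be compared against `expected_half_subnormal(x)` for the same inputs; any mismatch in the subnormal range indicates the rounding problem described above.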

np.array(..., dtype=np.float32).astype(np.float16) doesn't handle denorms properly #12721
