Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

STY: scientific notation or PEP 515 underscores in large numeric literals#19597

Draft
Pierre-Sassoulas wants to merge 7 commits into
astropy:mainastropy/astropy:mainfrom
Pierre-Sassoulas:scientific-notationPierre-Sassoulas/astropy:scientific-notationCopy head branch name to clipboard
Draft

STY: scientific notation or PEP 515 underscores in large numeric literals#19597
Pierre-Sassoulas wants to merge 7 commits into
astropy:mainastropy/astropy:mainfrom
Pierre-Sassoulas:scientific-notationPierre-Sassoulas/astropy:scientific-notationCopy head branch name to clipboard

Conversation

@Pierre-Sassoulas

Copy link
Copy Markdown
Contributor

Description

This pull request is to address readability of number larger than 10e6 or smaller than 10e-6 using mostly PEP 515 underscore separators. Scientific or engineering notation wouldn't make sense in most case I think, but I'm also happy to take feedback or close this if the change isn't wanted.

The motivation is the new bad-number-notation checker in pylint (pylint-dev/pylint#10425), We added astropy to our primer especially because it has a lot of floats, so I'd really like to have your feedback about (i.e is this change wanted or obnoxious) :) The underlying pylint check itself is still in review and will probably be in pylint 4.1.0.

Fixes #

  • By checking this box, the PR author has requested that maintainers do NOT use the "Squash and Merge" button. Maintainers should respect this when possible; however, the final decision is at the discretion of the maintainer that merges the PR.

@github-actions

Copy link
Copy Markdown
Contributor

Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see instructions for rebase and squash.
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the "Extra CI" label. Codestyle issues can be fixed by the bot.
  • Is a change log needed? If yes, did the change log check pass? If no, add the "no-changelog-entry-needed" label. If this is a manual backport, use the "skip-changelog-checks" label unless special changelog handling is necessary.
  • Is this a big PR that makes a "What's new?" entry worthwhile and if so, is (1) a "what's new" entry included in this PR and (2) the "whatsnew-needed" label applied?
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate "backport-X.Y.x" label(s) before merge.

@neutrinoceros

Copy link
Copy Markdown
Contributor

Scientific or engineering notation wouldn't make sense in most case I think

These would change the types from int to float, it really doesn't seem safe to do it as a shotgun refactor. PEP 515 underscores on the other hand are both safe and indeed readability, so I'm inclined to merge as is.

@neutrinoceros

Copy link
Copy Markdown
Contributor

I spoke too fast, I didn't realize many of these are floats already. Feel free to rewrite them in scientific notation. Just make sure to preserve exact values. Thanks !

"d": "347.222d",
"hr": "8333.336hr",
"min": "500000.167min",
"s": "30000010.0s",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be reverted

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I used sed and it modified a string 😅 .

Comment on lines +17 to +19
eta0 = -19.9 / 3_600_000.0
xi0 = 9.1 / 3_600_000.0
da0 = -22.9 / 3_600_000.0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this value could be stored in a variable and use scientific noation

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment in the code is an LLM bluff, I know very little about this 4bdea27

T = (equinox.jd - jd1950) / 36525.0
# Eccentricity of the Earth's orbit
ek = math.radians(k) * ((-0.000000126 * T - 0.00004193) * T + 0.01673011)
ek = math.radians(k) * ((-0.000_000_126 * T - 0.00004193) * T + 0.01673011)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are 2 other literals to update on the same line and they could all use scientific notation

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going slow on scientific notation with the assumption that is was "safer" to add PEP515 underscore (less changes), I'm going to be more daring. Thank you for the review, I'm here for the nits ; to get a feeling if the suggestions from the pylint check are good or not !

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the approach in 62a64d2, I selected the "big win" first.

@neutrinoceros neutrinoceros added this to the v8.0.0 milestone Apr 25, 2026
@Pierre-Sassoulas Pierre-Sassoulas changed the title STY: group digits with PEP 515 underscores in large numeric literals STY: scientific notation or PEP 515 underscores in large numeric literals Apr 26, 2026
Replace the bare 3600000.0 conversion factor in _icrs_to_fk5_matrix()
with a module-level _MAS_PER_DEG named constant in scientific notation.
The value (3,600 arcsec/deg * 1,000 mas/arcsec) is documented in a
short side comment.

Same float, more self-documenting at the call sites. Per review
suggestion from neutrinoceros on PR 19597.
…chars

Apply pylint's scientific-notation suggestion to literals where the
scientific form is at least 4 characters shorter than the underscore
alternative. Mostly tiny CODATA / IAU uncertainties (e.g.
0.00000000051e-27 -> 5.1e-37, saving 10 chars) plus a handful of
round-magnitude values where scientific is the obvious win:
4_000_000.0 -> 4e6, 3_396_190.0 -> 3.39619e6, 6378137.0 -> 6.378137e6,
0.000000443 -> 4.43e-7, etc.

For values where scientific saves 0-3 chars vs underscore (Julian
Dates, speed of light, Rydberg constant, day-numbered epochs), the
underscore form preserves more visual structure and will be applied
later instead.

ruff-format collapsed several CODATA(...) calls onto single lines
once the literals were short enough to fit; that's most of the
line-count delta.

All literals parse to identical floats.
Apply pylint bad-number-notation's "unconventional zero literal"
suggestion in two test files: trailing-zero forms (0.00, 0.000) are
rewritten as 0.0. Identical floats; only the textual form changes.
@Pierre-Sassoulas

Copy link
Copy Markdown
Contributor Author

Thanks a lot for the review ! Quick follow-up, I pushed what I think is consensual (big win in length, what you asked for directly, and removal of strange 0s). I have some question for a follow-up PR. (Or more commit in this PR seeing how many persons get pinged when you touch a lot of files.)

Where should the "needs scientific notation" threshold sit? Pylint defaults to 1.0e6. At that level JDs (~2.4e6) and planet radii (~6e6) get flagged. You asked for more scientific notation in #19597 (comment), but it seems it's at ~= 10^5. Happy to lower the threshold. (But this would catch 3600, 86400, etc. which is why it's higher by default).

Would engineering notation make more sense than scientific notation, sometime or all the time ? (299.792458e6 rather than 2.99792458e8). I have a lot of scientific notation changes locally, but I figured I'd better ask to be sure.

Should we check integers ? This PR is floats only; since you can't write an int in scientific notation, the only fix would be PEP 515 underscores (1_000_000). There's a fair number of 1000000-style ints around (test counts, FITS constants…).

There's some precision limitation on some floats with much higher number of significant numbers (16+) than what python can handle. Mostly ERFA reference fixtures. If we want to keep the precision we should switch to decimal.Decimal("...") (changes type, awkward through np.array/Quantity), keep it as is (but it's misleading about the actual precision), or accept the small drift by rewriting and truncating the unhandled precision (3.14159265358979323846 => 3.141592653589793). There's also underflow to zero (1.0e-324) or overflow to math.inf. Is it okay to switch to 0.0 or math.inf in this case ? What would you favor ? For example:

di2, 0.17293718391166087785 * u.rad, atol=1e-12 * u.rad

0.17293718391166087785 has 20 significant digits, more than float can represent exactly, and it could be written as '0.17293718391166088' or 'decimal.Decimal("0.17293718391166087785")' (or left alone).

assert_allclose(read_values, values, rtol=rtol, atol=1.0e-324)

'1.0e-324' underflows to zero could be written 0.0 directly. (The test itself uses string to represent float to pass them to C for precisely this reason).

I checked that the number are the same using this script:
verify_no_drift.py

@neutrinoceros

neutrinoceros commented Apr 26, 2026

Copy link
Copy Markdown
Contributor

Where should the "needs scientific notation" threshold sit? Pylint defaults to 1.0e6. At that level JDs (~2.4e6) and planet radii (~6e6) get flagged. You asked for more scientific notation in #19597 (comment), but it seems it's at ~= 10^5. Happy to lower the threshold.

I don't have a well defined personal threshold but there's a well established precedent you could build from: the 'g' type formatter, which happens to (coincidentally ?) match what you already have in pylint, so I'd say I'm actually very happy with this being the default.

>>> print(f"{100_000:g}")
100000
>>> print(f"{1_000_000:g}")
1e+06

@neutrinoceros

neutrinoceros commented Apr 26, 2026

Copy link
Copy Markdown
Contributor

Should we check integers ?

I thought that was the starting point TBH, and yes I think we should be using PEP 515 for integers whose absolute value exceed $$10^6$$.

accept the small drift by rewriting and truncating the unhandled precision

this seems like the way to go to me.

'1.0e-324' underflows to zero could be written 0.0 directly.

hum, that's odd. Now I wonder why it wasn't written as 0.0 from the get go. I'm not an expert in floating point arithmetics or definitions, but are you positive that this underflows for all possible architectures ? We actively support exotic ones, so that's where I'd look first to check if this means anything.

As you noticed, touching every part of astropy at once indeed pings a lot of people. We usually do these piecemeal, one namespace per PR, though maybe another way to go about it is to start your PRs as draft and only undraft them once I've actually completed the review cycle, so most subcribers would only get pinged once the PR is ready (and, most likely, merged).

@Pierre-Sassoulas Pierre-Sassoulas marked this pull request as draft April 26, 2026 12:38
@Pierre-Sassoulas

Copy link
Copy Markdown
Contributor Author

but are you positive that this underflows for all possible architectures ?

I'm not an expert myself but think there's officially a forced 53 bit precision since 2022 (python3.11) python/cpython#31790 / https://bugs.python.org/issue46917 (For now, platforms with float larger than 64-bit but without HAVE_PY_SET_53BIT_PRECISION are still supported (but don't get "short float repr").)

Following pylint's bad-number-notation (C0329) check with the same
$\geq$ 1e6 threshold as Python's 'g' format specifier, normalize
literals such as 2456293.25 to 2.45629325e6 and 299792458.0 to
2.99792458e8. Mantissas are kept at the shortest length that
round-trips to the same float64 value, so no numerical drift.
@neutrinoceros

Copy link
Copy Markdown
Contributor

Excellent, we actually don't support older versions.

Following pylint's bad-float-precision (C0330) check, normalize float
literals carrying more digits than IEEE 754 binary64 can represent
(e.g. ERFA reference fixtures with 17-19 significant digits) to their
shortest round-tripping repr — Python's default for repr(float). The
parsed float64 value is unchanged on every architecture astropy runs
on, so test results are bit-identical.
Following pylint's bad-number-notation check with --suggest-int-underscore=y,
add visual digit grouping to integer literals $\geq$ 1e6: groups of three
for decimal (e.g. 2_446_354), groups of four for hex (e.g. 0xFFFF_FFFF).
PEP 515 underscores are purely cosmetic — int values are byte-identical.
@neutrinoceros neutrinoceros added the Extra CI Run cron CI as part of PR label Apr 26, 2026
Comment thread astropy/coordinates/builtin_frames/icrs_fk5_transforms.py
Comment thread astropy/coordinates/earth.py Outdated


OMEGA_EARTH = (1.002_737_811_911_354_48 * u.cycle / u.day).to(
OMEGA_EARTH = (1.0027378119113546 * u.cycle / u.day).to(

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to fix this in pylint

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
OMEGA_EARTH = (1.0027378119113546 * u.cycle / u.day).to(
OMEGA_EARTH = (1.002_737_811_911_354_6 * u.cycle / u.day).to(

Comment on lines 437 to +467
@@ -460,7 +462,9 @@ def test_gcrs_hadec():
gcrs = GCRS(usph, obstime="J2000") # broadcast with times below

# check array times sure N-d arrays work
times = Time(np.linspace(2456293.25, 2456657.25, 51) * u.day, format="jd")[:, None]
times = Time(np.linspace(2.45629325e6, 2.45665725e6, 51) * u.day, format="jd")[
:, None
]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The formatting change makes me want to revert this.

values = np.array(
[1.01e200, 3.14e307, 1.777e308, -np.inf, 0.0, 4.94e-324, 1.024e308]
)
values = np.array([1.01e200, 3.14e307, 1.777e308, -np.inf, 0.0, 5e-324, 1.024e308])

@Pierre-Sassoulas Pierre-Sassoulas Apr 26, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5e-324 is the smallest possible absolute value representable with IEEE754 float 64, so maybe it should be 0.0 too here, but we're already checking zero before.

@taldcroft

Copy link
Copy Markdown
Member

Overall I think this PR needs to be done with more nuance and respecting the perspectives of the original authors.

    obs_p = CartesianRepresentation(
-        5724535.74068625, -1311071.58985697, -2492738.93017009, u.m
+        5.72453574068625e6, -1.31107158985697e6, -2.49273893017009e6, u.m
    )

In particular, there are many places like above where I find the original easier to read and I'm not keen on nearly all the changes in the time subpackage. People wrote the original code because that was what they found the most readable. This is not set in stone but a bulk refactor is not necessarily a good approach.

@neutrinoceros

neutrinoceros commented Apr 27, 2026

Copy link
Copy Markdown
Contributor

People wrote the original code because that was what they found the most readable.

PEP 515 wasn't an option until Python 2.7 and 3.5 were both dropped (astropy 4.2, November 2020), and even then, it has probably be underused for lack of awareness. Scientific notation on the other hand was always available, and our average contributor is very likely to know about it, of course.

@Pierre-Sassoulas

Copy link
Copy Markdown
Contributor Author

Some commits here are more consensual :

  • 4bdea27 (asked by neutrinoceros)
  • 62a64d2 (big tersity win)
  • 16d229b (Shorten the floats that were too big to be taken into account as is by a python float 64, the alternative to change to Decimal is a lot of work)

Those commits are opinionated or not very refined, I can drop them or open other PR to discuss them in particular (added them here to reduce noise):

  • 89e54d2. (normalize the zero, they were probably this way to denote the significant numbers though)
  • ff0dd07 (I used scientific notation everywhere for this, but there's 3 possibility, scientific, engineering, underscore or a mix of both, I started with underscore which are more consensual)
  • 48f448a (Used underscore in int bigger than 10e6, only underscore are possible here)

@taldcroft

Copy link
Copy Markdown
Member

I'm happy with the underscores, but I'm 👎 on the blanket changes to scientific notation.

@taldcroft

Copy link
Copy Markdown
Member

With regards to 62a64d2, I'm not sure that is uncontroversial. The reason it was written in the original verbose way is so that the uncertainty has the same exponent as the value. That way the reader can get an immediate visual idea of the relative precision of the constant. I'm not a maintainer of constants, but IMO the original is preferable.

"k_B", "Boltzmann constant", 1.3806488e-23, "J / (K)", 0.0000013e-23, system="si"

@pllim

pllim commented Apr 29, 2026

Copy link
Copy Markdown
Member

tl;dr -- here are my drive-by comments:

  • If we were to proceed, I would prefer we break this into smaller PRs by subpackage.
  • I think we copied constant values from the publlishers. To change the style would just make future updates have more churn.
  • There is no need to touch extern. It is bundled.

Thanks, all!

@astrofrog astrofrog modified the milestones: v8.0.0, v8.1.0 May 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

Morty Proxy This is a proxified and sanitized view of the page, visit original site.