DOC: Clarify (potentially misleading) nbytes docstring #28943

zvun · May 11, 2025

The documentation for numpy.ndarray.nbytes describes it as the "total bytes consumed by the elements of the array", which is potentially misleading: for a view, nbytes does not reflect the memory actually consumed by its elements, but rather what that consumption would have been if the view were a copy. This was mentioned before in #22925, but that issue was closed before the point was clarified. I have included an additional example in the docstring that demonstrates this.
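For readers of this thread, the behaviour described above can be reproduced with a minimal sketch. It is not the example from the PR diff; the names `base` and `view` are chosen here purely for illustration.

```python
import numpy as np

# An array that owns its buffer: 1000 float64 elements, 8 bytes each.
base = np.zeros(1000, dtype=np.float64)

# Slicing returns a view onto the same buffer; no new element memory is allocated.
view = base[::2]

print(base.nbytes)                   # size * itemsize of the owning array
print(view.nbytes)                   # size * itemsize of the view, as if it were a copy
print(view.base is base)             # the view refers back to the owning array
print(np.shares_memory(base, view))  # both arrays use the same underlying buffer
```

The view reports a non-zero nbytes even though it allocates no element storage of its own, which is the mismatch with the "bytes consumed" wording.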

seberg

Thanks, adding a note seems good, but this is much too complicated for the extra information it conveys.

seberg · May 12, 2025

numpy/_core/_add_newdocs.py


    Notes
    -----
+    If the array is a view, this shows how much memory it *would* use
+    if it were copied into a separate array.
    Does not include memory consumed by non-element attributes of the
    array object.


Maybe we can add that it also doesn't include memory indirectly held by the elements.
(I.e. if you store Python objects or the new StringDType)

seberg · May 12, 2025

numpy/_core/_add_newdocs.py

+    >>> arr_1.nbytes
+    800000
+    >>> arr_2.nbytes
+    2400


This introduces way too much complexity for very little gain. If anything at all, just do some slicing like arr[::2] or so.

seberg · May 12, 2025

numpy/_core/_add_newdocs.py

@@ -2698,6 +2701,17 @@
    >>> np.prod(x.shape) * x.itemsize
    480

+    >>> import numpy as np


No need for this, I think, but a sentence explaining why the next example follows would help.

ngoldbaum · May 12, 2025

Another wrinkle is that it's only the memory used by the array buffer. For object arrays or StringDType arrays, it's an underestimate.

zvun · May 12, 2025

Thank you, I have edited the docstring based on the suggestions.

mattip · May 12, 2025

Maybe it is enough to use qualifiers like "approximately" and "at least"/"at most" rather than try to describe all the ways the number is wrong. Then point to a documentation page like https://numpy.org/devdocs/dev/internals.code-explanations.html#memory-model, and maybe add nuanced qualifications there instead of in the docstring.

zvun · May 14, 2025

I guess one future-proof way could be to describe how nbytes is calculated, and then mention some examples for different dtypes. With a quick search, it seems to be the product of the array's dimensions multiplied by the item size. For object elements the latter is probably the size of the pointers, not sure how it would be for StringDType though. What do you think? @mattip @seberg
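The calculation described above can be checked with a short sketch. It assumes only the documented `size`, `itemsize`, and `nbytes` attributes; the shape and the dtype list are arbitrary choices for illustration, and StringDType is left out because its item size is not established in this thread.

```python
import struct

import numpy as np

shape = (3, 4, 5)  # arbitrary shape used only for this illustration

for dtype in (np.int8, np.float64, np.complex128, object):
    arr = np.empty(shape, dtype=dtype)
    # Product of the dimensions multiplied by the per-element size.
    computed = int(np.prod(arr.shape)) * arr.itemsize
    assert arr.nbytes == computed
    print(np.dtype(dtype).name, arr.itemsize, arr.nbytes)

# For object arrays, each element slot holds a pointer to a Python object,
# so the item size matches the platform pointer size.
pointer_size = struct.calcsize("P")
object_itemsize = np.dtype(object).itemsize
print(object_itemsize == pointer_size)
```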

ngoldbaum · May 20, 2025

What do you think?

That sounds good. Something like: "This is the memory used by the main array buffer and does not account for any memory used for array metadata or for data stored outside of the array buffer. For example, nbytes is a lower limit for the object and StringDType types because these types can store data outside the main array buffer."
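A rough sketch of the "lower limit" point for object arrays follows. Using `sys.getsizeof` to estimate the referenced memory is an assumption made here for illustration: it measures each Python object shallowly and ignores sharing between elements, so treat the second number as an approximation only.

```python
import sys

import numpy as np

# Ten distinct 1000-character Python strings stored in an object array.
strings = [str(i) * 1000 for i in range(10)]
arr = np.array(strings, dtype=object)

# nbytes counts only the array buffer, i.e. one pointer per element.
buffer_bytes = arr.nbytes

# Approximate size of the string objects that the pointers refer to.
referenced_bytes = sum(sys.getsizeof(s) for s in arr)

print(buffer_bytes, referenced_bytes)
```

On a typical build the referenced strings take far more memory than the pointer buffer that nbytes reports.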

DOC: Clarify nbytes docstring

114a38f

github-actions bot added the 04 - Documentation label May 11, 2025

InessaPawson added this to NumPy first-time contributor PRs May 11, 2025

InessaPawson moved this to Awaiting a code review in NumPy first-time contributor PRs May 11, 2025

Remove whitespace

3ffc6b5

seberg reviewed May 12, 2025

zvun added 3 commits May 12, 2025 16:17

Apply PR suggestions

f828aa7

Typo

7ebe43b

Rephrase

06b23da

melissawm moved this from Awaiting a code review to Pending authors' response in NumPy first-time contributor PRs May 23, 2025
