-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
BUG,ENH: Fix internal __array_wrap__
for direct calls
#27807
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Since adding `return_scalar` as am argument, the array-wrap implementations were slightly wrong when that argument was actually passed and the function called directly. NumPy itself rarely (or never) did so for our builtin types now so that was not a problem within NumPy. Further, the scalar version was completely broken, converting to scalar even when such a conversion was impossible. As explained in the code. For array subclasses we NEVER want to convert to scalar by default. The subclass must make that choice explicitly. (There are plenty of tests for this behavior.)
2a172da
to
af1fcaa
Compare
@mhvk maybe you can have a quick glance over. I think it should be straight forward. This should be back-portable, but not sure I think it is important enough to worry about it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, and my comments are mostly about the phrasing of comments.
Only slightly larger item is whether there are already explicit tests for the case that self
is a subclass and the item to be wrapped is the same subclass.
numpy/_core/src/multiarray/methods.c
Outdated
@@ -892,24 +892,24 @@ static PyObject * | ||
array_wraparray(PyArrayObject *self, PyObject *args) | ||
{ | ||
PyArrayObject *arr; | ||
PyObject *obj; | ||
PyObject *context_ignored = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had to do a slight double take of whether this was a flag for whether the context would be ignored. How about just
PyObject *UNUSED = NULL; # for the context argument
numpy/_core/src/multiarray/methods.c
Outdated
|
||
/* | ||
* Subclasses must implement `__array_wrap__` with all the arguments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this a bit hard to parse. How about
* If the array does not have our own type, we always wrap, even
* if `return_scalar` is set, so that we preserve the (presumably
* important) subclass information.
* If the type matches, we honor `return_scalar`, passing through
* PyArray_Return to convert any array with ndim=0 to scalar.
Though perhaps even better to just move the comment up, replacing the new comment on l.897?
@@ -2035,29 +2036,33 @@ gentype_getarray(PyObject *scalar, PyObject *args) | ||
return ret; | ||
} | ||
|
||
static char doc_sc_wraparray[] = "sc.__array_wrap__(obj) return scalar from array"; | ||
static char doc_sc_wraparray[] = "Array wrap to implementation for scalar types"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete "to" - and maybe still spell out the dunder method for easier grepping, i.e., "__array_wrap__ implementation for scalar types"
|
||
/* | ||
* Array wrap for scalars, returning a scalar again preferentially. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe
* __array_wrap__ for scalars, returning a scalar if possible.
PyArrayObject *arr; | ||
PyObject *context_ignored = NULL; | ||
/* return_scalar should be passed, but it was a scalar so prefer scalar */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/* return_scalar should be passed, but we're scalar, so we prefer scalars by default */
numpy/_core/src/multiarray/methods.c
Outdated
PyArray_FLAGS(arr), (PyObject *)self, obj); | ||
PyArray_FLAGS(arr), (PyObject *)self, (PyObject *)arr); | ||
} | ||
else if (return_scalar) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolute nitpick, but for similarity with the scalar case, perhaps
Py_INCREF(arr)
if (return_scalar) {
...
Since the previous clause had a return, we don't need an else if
strictly.
Though arguably less clear, so feel free to ignore (really, just because I thought I saw a difference in ref counting when reviewing).
# * NumPy returns scalars, if `return_scalar` is passed as True to allow | ||
# manual calls to `arr.__array_wrap__` to do the right thing. | ||
|
||
class MyArr(np.ndarray): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should one also test a subclass that does not override __array_wrap__
? Or is that done elsewhere?
arr = arr.view(MyArr) | ||
|
||
arr0d = np.array(3, dtype=np.int8) | ||
# Third argument not passed, None, or True "decays" to scalar. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copy & paste error from the scalar test? For an array, only True
causes decay to scalar. Maybe just
# With third argument True, ndarray allows "decay" to scalar.
if subclass: | ||
arr = arr.view(MyArr) | ||
|
||
arr0d = np.array(3, dtype=np.int8) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if arr0d
is a subclass too? As written, I think that causes decay to a scalar (if arr
is a subclass).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but I don't think it should ever be a subclass! I.e. that would defeat the purpose of wrapping (converting everything to plain arrays)?
(Technically, subclasses are accepted here. But in practice, I think we must assume that the original subclass is meaningless, because we must return our subclass.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry, I misunderstood. While it is probably true that this shouldn't happen, I agree that I should anticipate it and fix the branch (return of scalar must only happens if self
is a base-class array)
Thanks for the good spot, pushed a fix and hopefully made the comments/docs a bit clearer. It is slightly annoying because we have to very strictly check for the 0-d case. (I am ignoring that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One very small comment left, but feel free to ignore.
"can only be called with ndarray object"); | ||
return NULL; | ||
|
||
if (return_scalar && Py_TYPE(self) == &PyArray_Type && PyArray_NDIM(arr) == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the NDIM
part can in principle be removed - PyArray_Return
does that already. Maybe slightly better? (If going that route, adjust the comment to # Return scalar if NDIM==0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that is what I had... But then my test fails, because I somewhat prefer returning an ndarray
rather than a subclass, and PyArray_Return
won't do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that is what I had... But then my test fails, because I somewhat prefer returning an
ndarray
rather than a subclass, andPyArray_Return
won't do that.
OK, makes sense now!
I think I'll put this in then, thanks for the review @mhvk and spotting the subclass issue!. |
Since adding
return_scalar
as am argument, the array-wrap implementations were slightly wrong when that argument was actually passed and the function called directly.NumPy itself rarely (or never) did so for our builtin types now so that was not a problem within NumPy.
Further, the scalar version was completely broken, converting to scalar even when such a conversion was impossible.
As explained in the code. For array subclasses we NEVER want to convert to scalar by default. The subclass must make that choice explicitly.
(There are plenty of tests for this behavior.)
Closes gh-27783