-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
BUG: safer bincount casting #28355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: safer bincount casting #28355
Conversation
The CI failures for the new test case I added are real; |
Briefly discussed at triage meeting this morning--Sebastian suggested forcing a cast to |
dca9fb4
to
8530c81
Compare
* Fixes numpy#28354 * Force usage of `npy_intp` type in `np.bincount()` and avoid unsafe casting errors with i.e., `npy_uint64`. This is similar to our behavior with indexing.
* `arr_bincount()` now only uses unsafe casting for integer input types, and the number of casting operations has been reduced for the code path used in above PR. * a test for non-crashing behavior with non-contiguous `bincount()` input has been added.
8530c81
to
f7046ed
Compare
Ok, I added a guard for that.
I couldn't replicate the need for this because what I did originally was two casting operations instead of one--the
I couldn't find any evidence of a hard crash with that even when I intentionally made things worse with regards to C contiguity, though it may require i.e., |
* The actual C function signature is `PyArray_Size(PyObject *op)`, and the compiler will issue a warning if you follow the current C API docs that specify `PyArrayObject*`, so clean this up. I noticed this while working on numpygh-28355, but it was also mentioned as an aside 7 years ago in numpygh-10977. [skip azp] [skip cirrus] [skip actions]
* The actual C function signature is `PyArray_Size(PyObject *op)`, and the compiler will issue a warning if you follow the current C API docs that specify `PyArrayObject*`, so clean this up. I noticed this while working on gh-28355, but it was also mentioned as an aside 7 years ago in gh-10977. [skip azp] [skip cirrus] [skip actions]
flags = NPY_ARRAY_WRITEABLE | NPY_ARRAY_ALIGNED | NPY_ARRAY_C_CONTIGUOUS; | ||
if (PyArray_Size((PyObject *)list) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The deprecation makes this pretty messy, almost wondering if we should just finalize it...
In either case, I think the PyArray_Size()
here ends up as an imprecise PyArray_Check()
here (as it misses an empty uint8
array for example)?
I am not quite sure that not setting force-cast in the non-array case won't finalize the deprecation (in small parts).
Since both paths use PyArray_ContiguousFromAny
, I think that is fine. Although, it may be that we are not relaxing allowing uint64
for an array-like here i.e. something like:
class a:
def __array__(self):
return np.ones(10, dtype=np.uint64)
@@ -119,6 +119,7 @@ arr_bincount(PyObject *NPY_UNUSED(self), PyObject *const *args, | ||
npy_intp *numbers, *ians, len, mx, mn, ans_size; | ||
npy_intp minlength = 0; | ||
npy_intp i; | ||
int flags; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would move this down into the if, but OK.
I think we can put this in, if we change things to a PyArray_Check
below, unless I am missing something. It seems fine, even if it might not be 100% for array-likes.
I would lean towards not backporting. It has been around forever, and the reason it came up was just cupy testing against NumPy as far as I am aware.
* Based on reviewer feedback, narrow the scope of the `flags` variable in `arr_bincount()`. * Based on reviewer feedback, add an array-like test for the `uint64` casting issue, which indeed fails before and passes after adding a similar shim to the array-like code path in `arr_bincount()`. * Based on reviewer feedback, switch the patching from `PyArray_Size` to `PyArray_Check` in a few places.
@seberg I think I've addressed most of the comments in the latest commit (see commit message). Covering the array-like case does add a bit of code duplication now I suppose. I didn't follow the empty Note that the test for array-like did fail before/pass after the additional coercion shim just added (I also noticed that our error message for passing in a raw array-like class, rather than instance, to I don't think I followed the deprecation-related concern--I thought I was just coercing for integer type inputs and it was the non-integer cases that were deprecated. It is a bit confusing though, because the deprecation message itself seems simple while the commented discussion a bit higher up seems to obfuscate things... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, duplicating this seems indeed like the easiest path.
FWIW, I think this has been around forever and the issue was just a cupy test, I think. So I don't care much for backporting, but I also don't think there is a problem with that.. |
Thanks Tyler and Sebastian! 🙏 |
* BUG: safer bincount casting * Fixes numpy#28354 * Force usage of `npy_intp` type in `np.bincount()` and avoid unsafe casting errors with i.e., `npy_uint64`. This is similar to our behavior with indexing. * MAINT, BUG: PR 28355 revisions * `arr_bincount()` now only uses unsafe casting for integer input types, and the number of casting operations has been reduced for the code path used in above PR. * a test for non-crashing behavior with non-contiguous `bincount()` input has been added. * MAINT, BUG: PR 28355 revisions * Based on reviewer feedback, narrow the scope of the `flags` variable in `arr_bincount()`. * Based on reviewer feedback, add an array-like test for the `uint64` casting issue, which indeed fails before and passes after adding a similar shim to the array-like code path in `arr_bincount()`. * Based on reviewer feedback, switch the patching from `PyArray_Size` to `PyArray_Check` in a few places. * Update numpy/_core/src/multiarray/compiled_base.c --------- Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
As It looks like versions older than that are not seeing much activity. So didn't go further. Please let me know whether that would be worthwhile |
Fixes BUG:
np.bincount
castinguint64
toint64
#28354Allow some uint types to be cast a bit more gracefully for
np.bincount()
.