Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

BUG: fancy indexing copy #26558

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 3, 2024
Merged

BUG: fancy indexing copy #26558

merged 3 commits into from
Jul 3, 2024

Conversation

tylerjereddy
Copy link
Contributor

  • Fixes Fancy indexing fails on a[idx] = a #26542

  • For these very simple advanced indexing cases, if the result and self arrays share the same data pointer, use a copy of result to assign values to self, otherwise the outcome of the fancy indexing suffers from mutation of result before the assignments are complete.

@@ -1460,6 +1460,9 @@ mapiter_trivial_@name@(
char *base_ptr, *ind_ptr, *result_ptr;
npy_intp self_stride, ind_stride, result_stride;
npy_intp fancy_dim = PyArray_DIM(self, 0);
if (PyArray_DATA(self) == PyArray_DATA(result)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use the may share memory functionality here. Also would be good to do this at the call site (mapping.c), I think and make sure to decref the array again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the "may share memory" stuff carry the risk of a more substantial slow down? I think that's what I was asking about in the issue, at some point there's a tradeoff I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, may share memory does check memory bounds first and gives up at some point if things get complicated.
I.e. the simple cases are fast, some slow cases may end up with a defensive copy.

E.g. ufuncs end up using solve_may_share_memory with max_work=1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I revised the code to use solve_may_share_memory one level up in the control flow, in mapping.c.

make sure to decref the array again

I couldn't replicate the need for that, which array am I increasing the refcount for? Don't we proceed through a deep copy here?

I think this only covers a very well-behaved/specialized fancy indexing codepath based on the code comments, though that's also true of the regression test, 1-D contiguous integer data, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I think it nicer, although I also think it should be even further up maybe.

Some notes:

  • We ignore overlap in the indices here. We actually have a path index_has_memory_overlap which is used indirectly by the code from ufunc.at and does all the checks necessary including for index arrays.
  • I see no reason why you cannot move the check to before the fast-path if? In fact tmp_arr gets prepared just above, which seems like a perfect opportunity to do that check.
    (I suppose in theory, overlap is possible even if a fresh array is created with FromAny, so could do it after the if rather than before.)

One thing that might be nice, is that I am pretty sure we already guard for the other copy paths against this, e.g.:

arr = np.arange(100)
arr[10:] = arr[:-10]

would be nice to check we have tests and add one together with the others if not (so we can proof we always make a defensive copy in indexing paths).

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I think we should move it up even further and since it seems simple (the utility exists) also guard against index overlap.

@@ -1999,6 +1999,10 @@ array_assign_subscript(PyArrayObject *self, PyObject *ind, PyObject *op)
goto fail;
}

if (solve_may_share_memory(self, tmp_arr, 1) == 1) {
tmp_arr = (PyArrayObject *)PyArray_NewCopy(tmp_arr, NPY_ANYORDER);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tmp_arr = (PyArrayObject *)PyArray_NewCopy(tmp_arr, NPY_ANYORDER);
Py_SETREF(tmp_arr, PyArray_NewCopy(tmp_arr, NPY_ANYORDER));

You would leak the old tmp_arr

@@ -1460,6 +1460,9 @@ mapiter_trivial_@name@(
char *base_ptr, *ind_ptr, *result_ptr;
npy_intp self_stride, ind_stride, result_stride;
npy_intp fancy_dim = PyArray_DIM(self, 0);
if (PyArray_DATA(self) == PyArray_DATA(result)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I think it nicer, although I also think it should be even further up maybe.

Some notes:

  • We ignore overlap in the indices here. We actually have a path index_has_memory_overlap which is used indirectly by the code from ufunc.at and does all the checks necessary including for index arrays.
  • I see no reason why you cannot move the check to before the fast-path if? In fact tmp_arr gets prepared just above, which seems like a perfect opportunity to do that check.
    (I suppose in theory, overlap is possible even if a fresh array is created with FromAny, so could do it after the if rather than before.)

One thing that might be nice, is that I am pretty sure we already guard for the other copy paths against this, e.g.:

arr = np.arange(100)
arr[10:] = arr[:-10]

would be nice to check we have tests and add one together with the others if not (so we can proof we always make a defensive copy in indexing paths).

* Fixes numpygh-26542

* For these very simple advanced indexing cases,
if the `result` and `self` arrays share the same
data pointers, use a copy of `result` to assign
values to `self`, otherwise the outcome of the fancy
indexing suffers from mutation of `result` before
the assignments are complete.
* Avoid leaking the old `tmp_arr`, based on reviewer feedback.

* Add a test for a similar 2D case that fails, then hoist
`solve_may_share_memory()` check farther up in the control
flow such that the test passes.

* Add a reviewer-requested test for index overlap.
@tylerjereddy
Copy link
Contributor Author

@seberg I added a test for a more "general" case (2d version of same problem basically) to justify hoisting the memory overlap check farther up in the control flow as you requested (it did indeed fail before/pass after). Hoisting did require an additional null check.

I also added the test case you mentioned, which seems to pass just fine. I'm not sure I fully grasped your comment on overlapping indices, if that's more sophisticated than the test case you cited. Perhaps you can provide/describe a sample test case I need to think about if I'm still missing something. Your additional test does have overlapping indices IIUC, but presumably I won't get off the hook that easily.

@mattip mattip requested a review from seberg July 3, 2024 06:36
@@ -1960,6 +1960,10 @@ array_assign_subscript(PyArrayObject *self, PyObject *ind, PyObject *op)
tmp_arr = (PyArrayObject *)op;
}

if (tmp_arr && solve_may_share_memory(self, tmp_arr, 1) == 1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (tmp_arr && solve_may_share_memory(self, tmp_arr, 1) == 1) {
if (tmp_arr && solve_may_share_memory(self, tmp_arr, 1) != 0) {

Since < 0 means "failed to solve", which would mean we should do a defensive copy.

We will be missing if the index has overlap, such as something awkward like:

a = np.arange(10)
a[2:][a[:-2]] = 3

but let's just accept that here and make a new issue for that.

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with the suggestion applied, if you are happy @tylerjereddy, thanks.

* The usage of `solve_may_share_memory` in the above
PR wasn't quite right since it ignored the case of failing
to solve the overlap problem. This has been revised according
to reviewer feedback.

Co-authored-by: Sebastian Berg <sebastianb@nvidia.com>
@tylerjereddy
Copy link
Contributor Author

@seberg Ok, I revised accordingly and added you as co-author. The cross-listed issue explains the additional case that needs looking at in a follow-up.

@seberg
Copy link
Member

seberg commented Jul 3, 2024

Thanks for being so thorough, could probably just have done it myself... let's get it in :).

@seberg seberg merged commit 0acdad6 into numpy:main Jul 3, 2024
65 of 66 checks passed
@charris charris added the 09 - Backport-Candidate PRs tagged should be backported label Jul 3, 2024
@tylerjereddy tylerjereddy deleted the treddy_issue_26542 branch July 3, 2024 18:06
DimitriPapadopoulos pushed a commit to DimitriPapadopoulos/numpy that referenced this pull request Jul 4, 2024
* BUG: fancy indexing copy

* Fixes numpygh-26542

* For these very simple advanced indexing cases,
if the `result` and `self` arrays share the same
data pointers, use a copy of `result` to assign
values to `self`, otherwise the outcome of the fancy
indexing suffers from mutation of `result` before
the assignments are complete.

* MAINT, BUG: PR 26558 revisions

* Avoid leaking the old `tmp_arr`, based on reviewer feedback.

* Add a test for a similar 2D case that fails, then hoist
`solve_may_share_memory()` check farther up in the control
flow such that the test passes.

* Add a reviewer-requested test for index overlap.

* BUG: PR 26558 revisions

* The usage of `solve_may_share_memory` in the above
PR wasn't quite right since it ignored the case of failing
to solve the overlap problem. This has been revised according
to reviewer feedback.

Co-authored-by: Sebastian Berg <sebastianb@nvidia.com>

---------

Co-authored-by: Sebastian Berg <sebastianb@nvidia.com>
charris pushed a commit to charris/numpy that referenced this pull request Jul 5, 2024
* BUG: fancy indexing copy

* Fixes numpygh-26542

* For these very simple advanced indexing cases,
if the `result` and `self` arrays share the same
data pointers, use a copy of `result` to assign
values to `self`, otherwise the outcome of the fancy
indexing suffers from mutation of `result` before
the assignments are complete.

* MAINT, BUG: PR 26558 revisions

* Avoid leaking the old `tmp_arr`, based on reviewer feedback.

* Add a test for a similar 2D case that fails, then hoist
`solve_may_share_memory()` check farther up in the control
flow such that the test passes.

* Add a reviewer-requested test for index overlap.

* BUG: PR 26558 revisions

* The usage of `solve_may_share_memory` in the above
PR wasn't quite right since it ignored the case of failing
to solve the overlap problem. This has been revised according
to reviewer feedback.

Co-authored-by: Sebastian Berg <sebastianb@nvidia.com>

---------

Co-authored-by: Sebastian Berg <sebastianb@nvidia.com>
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Jul 5, 2024
eagunn pushed a commit to eagunn/numpy that referenced this pull request Jul 9, 2024
* BUG: fancy indexing copy

* Fixes numpygh-26542

* For these very simple advanced indexing cases,
if the `result` and `self` arrays share the same
data pointers, use a copy of `result` to assign
values to `self`, otherwise the outcome of the fancy
indexing suffers from mutation of `result` before
the assignments are complete.

* MAINT, BUG: PR 26558 revisions

* Avoid leaking the old `tmp_arr`, based on reviewer feedback.

* Add a test for a similar 2D case that fails, then hoist
`solve_may_share_memory()` check farther up in the control
flow such that the test passes.

* Add a reviewer-requested test for index overlap.

* BUG: PR 26558 revisions

* The usage of `solve_may_share_memory` in the above
PR wasn't quite right since it ignored the case of failing
to solve the overlap problem. This has been revised according
to reviewer feedback.

Co-authored-by: Sebastian Berg <sebastianb@nvidia.com>

---------

Co-authored-by: Sebastian Berg <sebastianb@nvidia.com>
ArvidJB pushed a commit to ArvidJB/numpy that referenced this pull request Nov 1, 2024
* BUG: fancy indexing copy

* Fixes numpygh-26542

* For these very simple advanced indexing cases,
if the `result` and `self` arrays share the same
data pointers, use a copy of `result` to assign
values to `self`, otherwise the outcome of the fancy
indexing suffers from mutation of `result` before
the assignments are complete.

* MAINT, BUG: PR 26558 revisions

* Avoid leaking the old `tmp_arr`, based on reviewer feedback.

* Add a test for a similar 2D case that fails, then hoist
`solve_may_share_memory()` check farther up in the control
flow such that the test passes.

* Add a reviewer-requested test for index overlap.

* BUG: PR 26558 revisions

* The usage of `solve_may_share_memory` in the above
PR wasn't quite right since it ignored the case of failing
to solve the overlap problem. This has been revised according
to reviewer feedback.

Co-authored-by: Sebastian Berg <sebastianb@nvidia.com>

---------

Co-authored-by: Sebastian Berg <sebastianb@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fancy indexing fails on a[idx] = a
3 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.