Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

BUG: index overlap copy #26958

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 15, 2025
Merged

BUG: index overlap copy #26958

merged 3 commits into from
May 15, 2025

Conversation

tylerjereddy
Copy link
Contributor

It does feel a bit complicated, though the tests passed locally. I was hoping to avoid usage of both solve_may_share_memory and index_has_memory_overlap in succession, since the latter contains the former, but then I think I'd need to perform both defensive copies. I wonder if it might be useful for index_has_memory_overlap to have a different return code for index vs. "other" overlap so that the defensive copy might be performed selectively to the type of overlap. I'm guessing my ref counting is probably missing some stuff too.

@vxst
Copy link
Contributor

vxst commented Jul 19, 2024

I'm not sure whether I understand the source correctly, but shouldn't we use PyArray_MapIterArrayCopyIfOverlap? Is it possible to call this API at array_assign_item stage as it seems solving the same problem or is the API reserved for ufunc only?

@tylerjereddy
Copy link
Contributor Author

Sebastian asked me to use index_has_memory_overlapin the original PR related to this: #26558 (comment)

But I don't really have strong views on it, I just tend to do what Sebastian suggests :)

@vxst
Copy link
Contributor

vxst commented Jul 19, 2024

I'd follow Seb's suggestion. Just copy.deepcopy module seems to be a little overkill for me.

@seberg
Copy link
Member

seberg commented Jul 19, 2024

I'm not sure whether I understand the source correctly, but shouldn't we use PyArray_MapIterArrayCopyIfOverlap

That function isn't used in this file at all (or its no-copy version). It does some additional setup, and is used by ufunc.at only (used to be public, but it's bad/slow enough that I don't think that it made much sense, IMO. Just hope PyMC will handle that.)

I just tend to do what Sebastian suggests :)

Of course I prefer if you don't ;), but I guess it works. FWIW, it may make a lot of sense to move these checks into PyArray_MapIterNew! It helps that the second version of PyArray_MapIterArrayCopyIfOverlap is deleted. (I haven't squinted at the code long enough to be sure that works.)

Just copy.deepcopy module seems to be a little overkill for me.

True, we know that the inputs are arrays, so we can copy them as arrays.

@mattip mattip requested a review from seberg August 26, 2024 14:09
@tylerjereddy
Copy link
Contributor Author

@seberg I revised to do an array copy instead of the deepcopy shenanigans; things seemed "ok" locally at least.

I didn't do the extra step of trying to migrate to PyArray_MapIterNew--let me know if that's a strong preference and I'll try that next.

@tylerjereddy
Copy link
Contributor Author

@seberg do you want me to make the extra changes here or is this "ok?"

@@ -1963,6 +1963,9 @@ array_assign_subscript(PyArrayObject *self, PyObject *ind, PyObject *op)
if (tmp_arr && solve_may_share_memory(self, tmp_arr, 1) != 0) {
Py_SETREF(tmp_arr, (PyArrayObject *)PyArray_NewCopy(tmp_arr, NPY_ANYORDER));
}
if (index_has_memory_overlap(self, index_type, indices, index_num, (PyObject *)tmp_arr) == 1) {
Py_SETREF(indices->object, PyArray_NewCopy((PyArrayObject *)indices->object, NPY_KEEPORDER));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't know how I missed both pings. This doesn't work, indices-> is a C array of index information.

If you add a test like:

a[2:][None, a[:-2]] = 3

that should just segfault. The other place where the index_has_memoverlap is used actually copies the result (self here and then uses the Writebackifcopy mechanism to make that work).

Now, it may be however, that this is sufficient, because it could be that this only is relevant to the fast path (because otherwise the iterator setup can protect us).


Either way, not sure that you want to look at it again, which is fine... Just realized that I never responded.

* Fixes numpy#26844

* Perform a defensive copy of array indices when the indices themselves
may have memory overlap with `self`.
* Revise the fix for index memory overlap to use
a regular array copy on the indices rather than
a deepcopy, which was a bit more heavyweight than
needed.
* Add regression test for segfault from:
https://github.com/numpy/numpy/pull/26958/files#r1854589178

* Add a fix for the above bug after discussing in person
with Sebastian.
@tylerjereddy
Copy link
Contributor Author

I added an extra regression test and bug fix for the scenario described at #26958 (comment), after discussing this PR with @seberg in person at the summit today.

@seberg
Copy link
Member

seberg commented May 15, 2025

Thanks, Tyler! Let's give this a shot. I could imagine that it could be better to copy self sometimes (with writeback if copy), but that depends on the inputs -- and this shouldn't actually happen much anyway.

@seberg seberg merged commit b3bad31 into numpy:main May 15, 2025
72 checks passed
@tylerjereddy tylerjereddy deleted the treddy_issue_26844 branch May 15, 2025 19:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: memory overlap in some tricky indexing scenarios
3 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.