-
-
Notifications
You must be signed in to change notification settings - Fork 11k
ENH: Improve performance for np.result_type #28710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There are 2 failing tests due to a |
It seems the |
It was not a false positive after all, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from the change to LookupSpecial
, this looks good. If that one was important for the speed-up, I would suggest to put the test where the call is made rather than change it for everything.
@@ -59,6 +60,11 @@ PyArray_LookupSpecial( | ||
return 0; | ||
} | ||
|
||
/* We do not need to check for special attributes on array descriptor types */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is a good idea - PyArray_LookupSpecial
is used also to look for __array_ufunc__
and __array_function__
and I don't think those paths should be made slower for result_type
.
p.s. Separately, is this actually used with result_type
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove this one, the speed gain is not major. This path is being used if result_type
as called with dtype
arguments (not with exact ndarray arguments), I think because the np._result_type
is wrapped by array_function_from_c_func_and_dispatcher
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be OK to exclude dtypes in the dispatcher, but probably better in a separate PR.
PyArray_Descr **all_descriptors = (PyArray_Descr **)(&all_DTypes[narrs+ndtypes]); | ||
|
||
/* Copy all dtypes into a single array defining non-value-based behaviour */ | ||
for (npy_intp i=0; i < ndtypes; i++) { | ||
all_DTypes[i] = NPY_DTYPE(descrs[i]); | ||
Py_INCREF(all_DTypes[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems safe, since we are holding on to descrs
, but best to check with @seberg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Borrowed references can be unsafe on the free-threaded build. That said, these are references to immortal (?) DType objects. The question mark there is that in principle someone could delete a DType in one thread and then another thread could hold the last reference to it here.
I have no idea if, in practice, we actually allow deleting DTypes or if that's a true goal for us to support.
Another option is to only drop the reference counting on the GIL-enabled build, since borrowed references should always be safe on the GIL-enabled build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reasoning: here descrs
is passed as an argument to PyArray_ResultType
which means the elements of descrs
cannot (or should not!?) be deleted by another thread. Since descrs
objects are not deleted, the type of the descr
should be safe to use.
PyArray_Descr **all_descriptors = (PyArray_Descr **)(&all_DTypes[narrs+ndtypes]); | ||
|
||
/* Copy all dtypes into a single array defining non-value-based behaviour */ | ||
for (npy_intp i=0; i < ndtypes; i++) { | ||
all_DTypes[i] = NPY_DTYPE(descrs[i]); | ||
Py_INCREF(all_DTypes[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reasoning: here descrs
is passed as an argument to PyArray_ResultType
which means the elements of descrs
cannot (or should not!?) be deleted by another thread. Since descrs
objects are not deleted, the type of the descr
should be safe to use.
@@ -1628,14 +1627,10 @@ PyArray_ResultType( | ||
all_descriptors[i_all] = PyArray_DTYPE(arrs[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is not touched by the PR, but is this safe in the free-threading build? The arrs
is passed so is safe, but PyArray_DTYPE
gets a borrowed reference and another thread can replace the array dtype.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a reproducer:
import unittest
from threading import Thread, Barrier
from test.support import threading_helper
import numpy as np
threading_helper.requires_working_threading(module=True)
class ItertoolsThreading(unittest.TestCase):
@threading_helper.reap_threads
def test_result_type(self):
number_of_threads = 6
number_of_iterations = 25
number_of_cycles = 800
barrier = Barrier(number_of_threads)
x = np.array([1., 2., 3., 4.], np.dtype([('f1', [('f1', '<i2')])]))
y = np.array([1, 2,], np.dtype([('f1', [('f1', '<i2')])]))
def work(it):
barrier.wait()
for _ in range(number_of_cycles):
#print(f'{it=} {_}')
x.dtype = np.dtype([('f1', [('f1', '<i2')])]) # single reference
y.dtype = np.dtype([('f1', [('f1', '<i2')])]) # single reference
np.result_type(x, y)
for it in range(number_of_iterations):
print(f'iteration {it}')
worker_threads = []
for ii in range(number_of_threads):
worker_threads.append(
Thread(target=work, args=[ii]))
with threading_helper.start_threads(worker_threads):
pass
barrier.reset()
print('done')
if __name__ == "__main__":
unittest.main()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you file an issue for this? This is kind of a weird thing to do so I’m not very worried but maybe there’s a simple workaround.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM and previous reviewers seemed happy, so merging, thanks.
(IIRC one of the worst performance parts result_type
was that it is often called on objects where we have to do a __array_function__
lookup that fails. Hopefully, future Python versions will allow us to speed this up.)
Improve performance of
np.result_type
NPY_ALLOC_WORKSPACE
inarray_result_type
PyArray_ResultType
Fast path for array method lookup on dtypesnp.result_type
Benchmark: