Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

ENH: Improve performance for np.result_type #28710

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 23, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 1 addition & 6 deletions 7 numpy/_core/src/multiarray/convert_datatype.c
Original file line number Diff line number Diff line change
Expand Up @@ -1597,13 +1597,12 @@ PyArray_ResultType(
return NULL;
}

PyArray_DTypeMeta **all_DTypes = (PyArray_DTypeMeta **)workspace;
PyArray_DTypeMeta **all_DTypes = (PyArray_DTypeMeta **)workspace; // borrowed references
PyArray_Descr **all_descriptors = (PyArray_Descr **)(&all_DTypes[narrs+ndtypes]);

/* Copy all dtypes into a single array defining non-value-based behaviour */
for (npy_intp i=0; i < ndtypes; i++) {
all_DTypes[i] = NPY_DTYPE(descrs[i]);
Py_INCREF(all_DTypes[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems safe, since we are holding on to descrs, but best to check with @seberg

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Borrowed references can be unsafe on the free-threaded build. That said, these are references to immortal (?) DType objects. The question mark there is that in principle someone could delete a DType in one thread and then another thread could hold the last reference to it here.

I have no idea if, in practice, we actually allow deleting DTypes or if that's a true goal for us to support.

Another option is to only drop the reference counting on the GIL-enabled build, since borrowed references should always be safe on the GIL-enabled build.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reasoning: here descrs is passed as an argument to PyArray_ResultType which means the elements of descrs cannot (or should not!?) be deleted by another thread. Since descrs objects are not deleted, the type of the descr should be safe to use.

all_descriptors[i] = descrs[i];
}

Expand All @@ -1628,14 +1627,10 @@ PyArray_ResultType(
all_descriptors[i_all] = PyArray_DTYPE(arrs[i]);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is not touched by the PR, but is this safe in the free-threading build? The arrs is passed so is safe, but PyArray_DTYPE gets a borrowed reference and another thread can replace the array dtype.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a reproducer:

import unittest
from threading import Thread, Barrier
from test.support import threading_helper
import numpy as np

threading_helper.requires_working_threading(module=True)

class ItertoolsThreading(unittest.TestCase):

    
    @threading_helper.reap_threads
    def test_result_type(self):
        number_of_threads = 6
        number_of_iterations = 25
        number_of_cycles = 800

        barrier = Barrier(number_of_threads)
        
        x = np.array([1., 2., 3., 4.], np.dtype([('f1', [('f1', '<i2')])]))
        y = np.array([1, 2,], np.dtype([('f1', [('f1', '<i2')])]))
        
        def work(it):
            barrier.wait()
            for _ in range(number_of_cycles):
                #print(f'{it=} {_}')
                x.dtype = np.dtype([('f1', [('f1', '<i2')])]) # single reference
                y.dtype = np.dtype([('f1', [('f1', '<i2')])]) # single reference
                np.result_type(x, y)
               
               

        for it in range(number_of_iterations):
            print(f'iteration {it}')
            worker_threads = []
            for ii in range(number_of_threads):
                worker_threads.append(
                    Thread(target=work, args=[ii]))

            with threading_helper.start_threads(worker_threads):
                pass

            barrier.reset()

        print('done')

if __name__ == "__main__":
    unittest.main()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you file an issue for this? This is kind of a weird thing to do so I’m not very worried but maybe there’s a simple workaround.

all_DTypes[i_all] = NPY_DTYPE(all_descriptors[i_all]);
}
Py_INCREF(all_DTypes[i_all]);
}

PyArray_DTypeMeta *common_dtype = PyArray_PromoteDTypeSequence(
narrs+ndtypes, all_DTypes);
for (npy_intp i=0; i < narrs+ndtypes; i++) {
Py_DECREF(all_DTypes[i]);
}
if (common_dtype == NULL) {
goto error;
}
Expand Down
18 changes: 11 additions & 7 deletions 18 numpy/_core/src/multiarray/multiarraymodule.c
Original file line number Diff line number Diff line change
Expand Up @@ -3583,24 +3583,28 @@ static PyObject *
array_result_type(PyObject *NPY_UNUSED(dummy), PyObject *const *args, Py_ssize_t len)
{
npy_intp i, narr = 0, ndtypes = 0;
PyArrayObject **arr = NULL;
PyArray_Descr **dtypes = NULL;
PyObject *ret = NULL;

if (len == 0) {
PyErr_SetString(PyExc_ValueError,
"at least one array or dtype is required");
goto finish;
return NULL;
}

arr = PyArray_malloc(2 * len * sizeof(void *));
NPY_ALLOC_WORKSPACE(arr, PyArrayObject *, 2 * 3, 2 * len);
if (arr == NULL) {
return PyErr_NoMemory();
return NULL;
}
dtypes = (PyArray_Descr**)&arr[len];
PyArray_Descr **dtypes = (PyArray_Descr**)&arr[len];

PyObject *previous_obj = NULL;

for (i = 0; i < len; ++i) {
PyObject *obj = args[i];
if (obj == previous_obj) {
continue;
eendebakpt marked this conversation as resolved.
Show resolved Hide resolved
}

if (PyArray_Check(obj)) {
Py_INCREF(obj);
arr[narr] = (PyArrayObject *)obj;
Expand Down Expand Up @@ -3636,7 +3640,7 @@ array_result_type(PyObject *NPY_UNUSED(dummy), PyObject *const *args, Py_ssize_t
for (i = 0; i < ndtypes; ++i) {
Py_DECREF(dtypes[i]);
}
PyArray_free(arr);
npy_free_workspace(arr);
return ret;
}

Expand Down
Loading
Morty Proxy This is a proxified and sanitized view of the page, visit original site.