Description
Feature or enhancement
Proposal:
Compiled-code modules that implement time-consuming operations that don’t require manipulating Python objects are supposed to call PyErr_CheckSignals frequently throughout each such operation, so that if the user interrupts the operation with Control-C, it is cancelled promptly.
In the normal case where no signals are pending, PyErr_CheckSignals is cheap; however, callers must hold the GIL, and compiled-code modules that implement time-consuming operations are also supposed to release the GIL during each such operation. The overhead of reclaiming the GIL in order to call PyErr_CheckSignals, and then releasing it again, sufficiently often for reasonable user responsiveness, can be substantial.
If my understanding of the thread-state rules is correct, PyErr_CheckSignals only needs the GIL if it has work to do. Checking whether there is a pending signal, or a pending request to run the cycle collector, requires only a couple of atomic loads. Therefore, I propose that we restructure PyErr_CheckSignals and its close relatives (_PyErr_CheckSignals and _PyErr_CheckSignalsTstate) so that all the “do we have anything to do” checks are done in a batch before anything that needs the GIL. If any of them is true, the function should acquire the GIL itself, repeat the check (because another thread could have stolen the event while we were waiting for the GIL), and then actually do the work. Callers would then no longer need to hold the GIL.
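The shape of the proposed restructuring can be sketched in standalone C. This is only an illustration of the check, lock, re-check pattern described above, not the actual CPython code: a pthread mutex stands in for the GIL, a single atomic flag stands in for "a signal is pending", and every name in it (fake_gil, signal_pending, check_signals_nolock, crunch, and so on) is hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for interpreter state; none of these are CPython names. */
static pthread_mutex_t fake_gil = PTHREAD_MUTEX_INITIALIZER; /* models the GIL */
static atomic_bool signal_pending = false;  /* models "a signal has arrived" */
static int handlers_run = 0;                /* only touched while holding fake_gil */
static int lock_acquisitions = 0;           /* only touched while holding fake_gil */

/* Slow path. The caller must hold fake_gil. */
static int run_pending_handlers_locked(void)
{
    /* Repeat the check under the lock: another thread may have consumed the
       event while we were waiting. atomic_exchange tests and clears in one step. */
    if (!atomic_exchange(&signal_pending, false)) {
        return 0;
    }
    handlers_run++;
    return -1;  /* models a handler that raised, e.g. KeyboardInterrupt */
}

/* Proposed shape: may be called WITHOUT holding fake_gil. */
static int check_signals_nolock(void)
{
    /* Fast path: one atomic load, no lock traffic when nothing is pending. */
    if (!atomic_load(&signal_pending)) {
        return 0;
    }
    /* Something may need doing: take the lock ourselves, then re-check. */
    pthread_mutex_lock(&fake_gil);
    lock_acquisitions++;
    int result = run_pending_handlers_locked();
    pthread_mutex_unlock(&fake_gil);
    return result;
}

/* A long-running operation that polls for interruption on every iteration
   and never takes the lock itself. Returns the number of iterations completed. */
static long crunch(long iterations)
{
    long done = 0;
    for (long i = 0; i < iterations; i++) {
        if (check_signals_nolock() != 0) {
            break;  /* interrupted: stop promptly */
        }
        done++;
    }
    return done;
}
```

In this model, an uninterrupted call to crunch never touches the mutex, which is the saving the proposal is after. The lock is taken only once a flag has been observed, and the second check under the lock handles the case where another thread got there first.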
I have already prepared and tested an implementation of this change and will submit a PR shortly.
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response