gh-120321: Make gi_frame_state transitions atomic in FT build#142599
gh-120321: Make gi_frame_state transitions atomic in FT build#142599colesbury merged 18 commits intopython:mainpython/cpython:mainfrom colesbury:gh-120321-gen-atomiccolesbury/cpython:gh-120321-gen-atomicCopy head branch name to clipboard
Conversation
This makes generator frame state transitions atomic in the free threading build, which avoids segfaults when trying to execute a generator from multiple threads concurrently. There are still a few operations that aren't thread-safe and may crash if performed concurrently on the same generator/coroutine: * Accessing gi_yieldfrom/cr_await/ag_await * Accessing gi_frame/cr_frame/ag_frame * Async generator operations
This comment was marked as outdated.
This comment was marked as outdated.
Python/bytecodes.c
Outdated
| FT_ATOMIC_STORE_INT8_RELEASE(gen->gi_frame_state, FRAME_SUSPENDED + oparg); | ||
| #ifdef Py_GIL_DISABLED | ||
| ((_PyThreadStateImpl *)tstate)->gen_last_frame_state = FRAME_SUSPENDED + oparg; | ||
| #endif |
There was a problem hiding this comment.
Would it make sense to factor this out into a helper function? It's duplicated in ceval.c. Also, having helper to update both would reduce the chance that we update one without the other.
There was a problem hiding this comment.
I've refactored it out. I also moved the functions to ceval_macros.h from pycore_genobject.h because:
- they are only used in ceval.c and bytecodes.c
- I want to avoid adding additional includes to
pycore_genobject.hbecause I think it's pulled in by internal headers that some third-party projects tend to include.
Objects/genobject.c
Outdated
|
|
||
| gen->gi_frame_state = FRAME_EXECUTING; | ||
| #if defined(Py_GIL_DISABLED) && defined(Py_DEBUG) | ||
| ((_PyThreadStateImpl *)tstate)->gen_last_frame_state = FRAME_EXECUTING; |
There was a problem hiding this comment.
why is this set only in debug builds?
There was a problem hiding this comment.
I changed this to #if defined(Py_GIL_DISABLED). It's not really needed at all, but it'll help catch bugs in case some ceval code forgets to set gen_last_frame_state before returning.
In other words, when _PyEval_EvalFrame returns, gen_last_frame_state must be either FRAME_SUSPENDED[_YIELD_FROM] or FRAME_CLEARED and this helps differentiate this call to _PyEval_EvalFrame from earlier calls in the same thread.
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
Objects/genobject.c
Outdated
| // Grab the last frame state from the thread state instead of the | ||
| // generator, as it may have changed if another thread resumed this | ||
| // generator. | ||
| int8_t frame_state = ((_PyThreadStateImpl *)tstate)->gen_last_frame_state; |
There was a problem hiding this comment.
How can another thread resume this generator? This thread is executing it, so no other thread can run it.
There was a problem hiding this comment.
This thread is no longer executing the generator once it sets the frame state to suspended in YIELD_VALUE. Between that line in bytecodes.c and here, another thread can immediately start executing the same generator. That isn't true with the GIL because there are no opportunities for reentrancy or GIL release between YIELD_VALUE and the return from _PyEval_Frame. That lack of opportunities for reentrancy is also why it's safe to store it on the thread state.
There was a problem hiding this comment.
OK, I see what this issue is.
What you want is an additional return value, to distinguish between a yield and a return.
I think the code would be clearer, if gen_last_frame_state field were renamed to something like generator_return_kind. Set it to RETURN before the call, and set it to YIELD in YIELD_VALUE.
Instead of testing for FRAME_STATE_SUSPENDED(frame_state) test for generator_return_kind != RETURN
Also add a comment about why it is needed.
There was a problem hiding this comment.
I've renamed the field and adjusted the condition.
However, we need to set gen_last_frame_state in gen_clear_frame (or elsewhere in RETURN_VALUE). It's not enough to set it to RETURN before the _PyEval_EvalFrame call because generators can send to other generator. The "inner" generator will overwrite the gen_last_frame_state. (See the test failures on 4b82b8c)
Objects/genobject.c
Outdated
| FRAME_STATE_SUSPENDED(frame_state)); | ||
|
|
||
| if (!_Py_GEN_SET_FRAME_STATE(gen, frame_state, FRAME_EXECUTING)) { | ||
| goto retry; |
There was a problem hiding this comment.
No need for a goto. You can use do ... while (!_Py_GEN_SET_FRAME_STATE(gen, frame_state, FRAME_EXECUTING));
There was a problem hiding this comment.
I've switched from gotos to do/while loops
Also use it in both the default and FT builds.
Generator calls may be nested so it's not sufficient to set the field before the call to PyEval_Frame.
|
The changes look good. I'm concerned about the complexity of this for such an obscure use case, but I can't suggest a better way of doing it. |
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
| if (!_Py_GEN_TRY_SET_FRAME_STATE(gen, frame_state, FRAME_CLEARED)) { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
This check seems to trigger a new compiler warning during build.
Objects/genobject.c:426:17: warning: code will never be executed [-Wunreachable-code]
continue;
Tested on MacOS with gcc.
There was a problem hiding this comment.
A few questions:
- Are you sure it's actually gcc? Which version?
- Are you adding extra compiler flags? I don't think we compile with
-Wunreachable-codeand I don't think it's enabled by-Walland-Wextra.
There was a problem hiding this comment.
Ah, I figured out why I wasn't seeing it.
- We enable
-Wunreachable-codeon Clang only and choose to do so as part of the./configure. It's only enabled in non-debug builds - I think
./configure -Ccaches the value, so I didn't see it when switching from debug to non-debug build.
There was a problem hiding this comment.
Yes, it was clang. I should have checked it more carefully. Sorry about that.
There was a problem hiding this comment.
Thanks for the catching this
…ythongh-142599) This makes generator frame state transitions atomic in the free threading build, which avoids segfaults when trying to execute a generator from multiple threads concurrently. There are still a few operations that aren't thread-safe and may crash if performed concurrently on the same generator/coroutine: * Accessing gi_yieldfrom/cr_await/ag_await * Accessing gi_frame/cr_frame/ag_frame * Async generator operations
This makes generator frame state transitions atomic in the free threading build, which avoids segfaults when trying to execute a generator from multiple threads concurrently.
There are still a few operations that aren't thread-safe and may crash if performed concurrently on the same generator/coroutine: