gh-115832: Fix instrumentation version mismatch during interpreter shutdown #115856
colesbury merged 5 commits into python:main
…ter shutdown

In python/cpython@0749244d13412d, I introduced a bug to `interpreter_clear()`: it sets `interp->ceval.instrumentation_version` to 0 without making the corresponding change to `tstate->eval_breaker` (which holds a thread-local copy of the version). After this happens, Python code can still run because object finalizers are invoked during a GC, and [this version check in bytecodes.c](https://github.com/python/cpython/blob/4ee6bdfbaa792a3aa93c65c2022a89bd2d1e0894/Python/bytecodes.c#L147-L152) will see a different result than [this one in instrumentation.c](https://github.com/python/cpython/blob/4ee6bdfbaa792a3aa93c65c2022a89bd2d1e0894/Python/instrumentation.c#L894-L895), causing an infinite loop.

The fix itself is straightforward, and is what I should've done in `interpreter_clear()` in the first place: also clear `tstate->eval_breaker` when clearing `interp->ceval.instrumentation_version`. I also restored a comment that I'm not sure why I deleted in the original commit.

To make bugs of this type less likely in the future, I changed `instrumentation.c:global_version()` to read the version from a `PyThreadState*` rather than a `PyInterpreterState*`, so it reads the version from the same location as the interpreter loop. This had some fan-out effects on its transitive callers, although most of them already had the current tstate available.

- Issue: gh-115832
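To illustrate the failure mode described above, here is a minimal toy model in C. It is a sketch under stated assumptions, not CPython's code: `ToyInterp`, `ToyThread`, `toy_clear_buggy()`, `toy_clear_fixed()`, and `toy_resume()` are hypothetical names, the event bits stored in the real `eval_breaker` are ignored, and the retry loop is capped so the "infinite loop" shows up as a `-1` return value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real structures; only the two
   version fields involved in the bug are modeled. */
typedef struct {
    uint32_t instrumentation_version;   /* interpreter-wide version */
} ToyInterp;

typedef struct {
    ToyInterp *interp;
    uintptr_t eval_breaker;             /* thread-local copy of the version */
} ToyThread;

/* Buggy shutdown: only the interpreter-wide version is reset. */
static void toy_clear_buggy(ToyThread *ts) {
    ts->interp->instrumentation_version = 0;
}

/* Fixed shutdown: both copies are reset together. */
static void toy_clear_fixed(ToyThread *ts) {
    ts->interp->instrumentation_version = 0;
    ts->eval_breaker = 0;
}

/* Toy version check. The interpreter loop compares the code object's
   version with the thread-local copy, while "instrumenting" stamps the
   code object with the interpreter-wide version. Returns the number of
   retries needed before the check passes, or -1 if max_tries is
   exhausted (the stand-in for an infinite loop). */
static int toy_resume(const ToyThread *ts, uint32_t *code_version,
                      int max_tries) {
    for (int tries = 0; tries < max_tries; tries++) {
        if (*code_version == ts->eval_breaker) {
            return tries;               /* versions agree: proceed */
        }
        *code_version = ts->interp->instrumentation_version;
    }
    return -1;
}
```

In this model, a finalizer that runs code after `toy_clear_buggy()` never gets past the check, because the two sides consult different copies of the version. After `toy_clear_fixed()` both copies are 0 and the check passes immediately.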
I think this one can skip news (since it should've been part of the original commit), but let me know if anyone disagrees.

The test failures look real and related to this change. I'm investigating.
…nce it doesn't have an up-to-date monitoring version
colesbury left a comment
This generally looks good to me, but there are some merge conflicts that need to be resolved.
I am a bit unsure about the switch of global_version to use the PyThreadState's eval_breaker instead of the interpreter's instrumentation_version. The use of instrumentation_version seemed a bit more natural as the authoritative source, and a smaller bug fix change seems preferable too.
Would it be possible to add an assertion in global_version() that the two are the same? With the GIL, the setting of the thread and interpreter values should appear atomic. With the GIL disabled, I think we'd want a stop-the-world pause when we enable instrumentation.
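The assertion suggested here could look roughly like the following sketch. It is not the code that was merged: `VerInterp`, `VerThread`, `ver_global_version()`, and `VER_EVENTS_MASK` are hypothetical names, and the mask width is an assumption made only so the example has something to strip from the thread-local word before comparing.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the low bits of the thread-local word carry event flags
   and must be masked off before comparing versions. */
#define VER_EVENTS_MASK ((uintptr_t)0xFF)

typedef struct {
    uintptr_t instrumentation_version;  /* treated as authoritative */
} VerInterp;

typedef struct {
    VerInterp *interp;
    uintptr_t eval_breaker;             /* version plus low event bits */
} VerThread;

/* Returns the interpreter-wide version and, in debug builds, checks
   that the thread-local copy has not drifted away from it. */
static uintptr_t ver_global_version(const VerThread *ts) {
    uintptr_t thread_version = ts->eval_breaker & ~VER_EVENTS_MASK;
    assert(thread_version == ts->interp->instrumentation_version);
    return ts->interp->instrumentation_version;
}
```

Keeping the interpreter field as the source of truth while asserting agreement keeps the fix small and still catches a stale thread-local copy in debug builds.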
Makes sense, and yeah that should be a straightforward change.

The latest round of merge conflicts was from #116013, but this should be good to go again.

🤖 New build scheduled with the buildbot fleet by @colesbury for commit 7a9c81a 🤖 If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.
…ter shutdown (python#115856) A previous commit introduced a bug to `interpreter_clear()`: it set `interp->ceval.instrumentation_version` to 0 without making the corresponding change to `tstate->eval_breaker` (which holds a thread-local copy of the version). After this happens, Python code can still run due to object finalizers during a GC, and the version check in bytecodes.c will see a different result than the one in instrumentation.c, causing an infinite loop. The fix itself is straightforward: clear `tstate->eval_breaker` when clearing `interp->ceval.instrumentation_version`.