gh-115832: Fix instrumentation version mismatch during interpreter shutdown by swtaarrs · Pull Request #115856 · python/cpython

swtaarrs · 2024-02-23T15:50:03Z

In 0749244d13412d, I introduced a bug to interpreter_clear(): it sets interp->ceval.instrumentation_version to 0, without making the corresponding change to tstate->eval_breaker (which holds a thread-local copy of the version). After this happens, Python code can still run due to object finalizers during a GC, and this version check in bytecodes.c will see a different result than this one in instrumentation.c, causing an infinite loop.
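To make the failure mode concrete, here is a toy model of the two disagreeing checks. It is a sketch under stated assumptions, not the CPython source: every `Toy*`/`toy_*` name is hypothetical, the structs keep only the one field that matters, any flag bits stored alongside the version are ignored, and the retry loop is capped so the example terminates instead of spinning forever.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real structures; only the relevant field. */
typedef struct { uintptr_t instrumentation_version; } ToyInterp;
typedef struct { uintptr_t eval_breaker; ToyInterp *interp; } ToyThread;
typedef struct { uintptr_t instrumentation_version; } ToyCode;

/* Models the instrumentation.c side: compares against the interpreter's copy. */
static void
toy_instrument(ToyCode *code, const ToyInterp *interp)
{
    if (code->instrumentation_version == interp->instrumentation_version) {
        return;  /* considered up to date, so nothing changes */
    }
    code->instrumentation_version = interp->instrumentation_version;
}

/* Models the bytecodes.c side: compares against the thread-local copy and
 * retries after re-instrumenting. Returns the number of retries needed,
 * or -1 if the cap was reached (the real loop has no cap). */
static int
toy_resume(ToyCode *code, const ToyThread *ts, int max_retries)
{
    assert(max_retries > 0);
    for (int retries = 0; retries < max_retries; retries++) {
        if (code->instrumentation_version == ts->eval_breaker) {
            return retries;
        }
        toy_instrument(code, ts->interp);
    }
    return -1;
}

/* Reproduces the shutdown scenario: only the interpreter's copy is cleared. */
static int
toy_shutdown_mismatch(void)
{
    ToyInterp interp = { .instrumentation_version = 256 };
    ToyThread ts = { .eval_breaker = 256, .interp = &interp };
    ToyCode code = { .instrumentation_version = 0 };

    interp.instrumentation_version = 0;  /* thread-local copy left stale */
    return toy_resume(&code, &ts, 1000);
}
```

In this model `toy_shutdown_mismatch()` returns -1: the thread-side check keeps seeing a mismatch, while the interpreter-side check keeps reporting that the code object is already up to date.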

The fix itself is straightforward, and is what I should've done in interpreter_clear() in the first place: also clear tstate->eval_breaker when clearing interp->ceval.instrumentation_version. I also restored a comment that I'm not sure why I deleted in the original commit.
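The shape of the fix can be sketched with the same kind of toy model. The struct and function names below are hypothetical stand-ins, and the real `interpreter_clear()` does far more than this; the only point illustrated is that both copies of the version are reset together.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real structures; only the relevant field. */
typedef struct { uintptr_t instrumentation_version; } ToyInterp;
typedef struct { uintptr_t eval_breaker; } ToyThread;

/* Clear the interpreter's version and the thread-local copy in one place,
 * so the two can never disagree after shutdown begins. */
static void
toy_interpreter_clear(ToyInterp *interp, ToyThread *tstate)
{
    interp->instrumentation_version = 0;
    tstate->eval_breaker = 0;
    assert(tstate->eval_breaker == interp->instrumentation_version);
}

/* Small self-check: returns 1 if both copies are zero after the clear. */
static int
toy_versions_match_after_clear(uintptr_t version)
{
    ToyInterp interp = { .instrumentation_version = version };
    ToyThread tstate = { .eval_breaker = version };
    toy_interpreter_clear(&interp, &tstate);
    return interp.instrumentation_version == 0 && tstate.eval_breaker == 0;
}
```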

To make bugs of this type less likely in the future, I changed instrumentation.c:global_version() to read the version from a PyThreadState* rather than a PyInterpreterState*, so it's reading the version from the same location as the interpreter loop. This had some fan-out effects on its transitive callers, although most of them already had the current tstate available.

Issue: Coverage.py test suite failure since commit 0749244d134 #115832

…ter shutdown

In python/cpython@0749244d13412d, I introduced a bug to `interpreter_clear()`: it sets `interp->ceval.instrumentation_version` to 0, without making the corresponding change to `tstate->eval_breaker` (which holds a thread-local copy of the version). After this happens, Python code can still run due to object finalizers during a GC, and [this version check in bytecodes.c](https://github.com/python/cpython/blob/4ee6bdfbaa792a3aa93c65c2022a89bd2d1e0894/Python/bytecodes.c#L147-L152) will see a different result than [this one in instrumentation.c](https://github.com/python/cpython/blob/4ee6bdfbaa792a3aa93c65c2022a89bd2d1e0894/Python/instrumentation.c#L894-L895), causing an infinite loop.

The fix itself is straightforward, and is what I should've done in `interpreter_clear()` in the first place: also clear `tstate->eval_breaker` when clearing `interp->ceval.instrumentation_version`. I also restored a comment that I'm not sure why I deleted in the original commit.

To make bugs of this type less likely in the future, I changed `instrumentation.c:global_version()` to read the version from a `PyThreadState*` rather than a `PyInterpreterState*`, so it's reading the version from the same location as the interpreter loop. This had some fan-out effects on its transitive callers, although most of them already had the current tstate available.

- Issue: pythongh-115832

swtaarrs · 2024-02-23T15:51:09Z

I think this one can skip news (since it should've been part of the original commit), but let me know if anyone disagrees.

swtaarrs · 2024-02-23T18:11:05Z

The test failures look real and related to this change. I'm investigating.

colesbury

This generally looks good to me, but there are some merge conflicts that need to be resolved.

I am a bit unsure about the switch of global_version to use the PyThreadState's eval_breaker instead of the interpreter's instrumentation_version. The use of instrumentation_version seemed a bit more natural as the authoritative source, and a smaller bug fix change seems preferable too.

Would it be possible to add an assertion in global_version() that the two are the same? With the GIL, the setting of the thread and interpreter values should appear atomic. With the GIL disabled, I think we'd want a stop-the-world pause when we enable instrumentation.

swtaarrs · 2024-02-28T20:22:18Z

> I am a bit unsure about the switch of global_version to use the PyThreadState's eval_breaker instead of the interpreter's instrumentation_version. The use of instrumentation_version seemed a bit more natural as the authoritative source, and a smaller bug fix change seems preferable too.
>
> Would it be possible to add an assertion in global_version() that the two are the same?

Makes sense, and yeah that should be a straightforward change.
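The assertion discussed above might look roughly like the following. This is a toy sketch, not the merged code: the `Toy*`/`toy_*` names are hypothetical, and it assumes the low 8 bits of `eval_breaker` carry unrelated flags that must be masked off before comparing (the mask width is an assumption made for illustration).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the low 8 bits of eval_breaker hold flags, not version bits. */
#define TOY_EVENTS_MASK ((uintptr_t)0xFF)

typedef struct { uintptr_t instrumentation_version; } ToyInterp;
typedef struct { uintptr_t eval_breaker; } ToyThread;

/* The interpreter stays the authoritative source; the thread-local copy is
 * only cross-checked, so a divergence trips an assertion in debug builds. */
static uintptr_t
toy_global_version(const ToyInterp *interp, const ToyThread *current)
{
    uintptr_t version = interp->instrumentation_version;
    assert((current->eval_breaker & ~TOY_EVENTS_MASK) == version);
    return version;
}

/* Update both copies together, preserving the thread's flag bits. */
static void
toy_set_version(ToyInterp *interp, ToyThread *ts, uintptr_t version)
{
    assert((version & TOY_EVENTS_MASK) == 0);
    interp->instrumentation_version = version;
    ts->eval_breaker = (ts->eval_breaker & TOY_EVENTS_MASK) | version;
}

/* Sets a version while flag bits are pending, then reads it back. */
static uintptr_t
toy_demo(void)
{
    ToyInterp interp = { .instrumentation_version = 0 };
    ToyThread ts = { .eval_breaker = 0x3 };  /* two flag bits already set */
    toy_set_version(&interp, &ts, 0x300);
    return toy_global_version(&interp, &ts);
}
```

Keeping the interpreter as the source of truth keeps the bug fix small, while the assertion catches any path that updates one copy without the other.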

swtaarrs · 2024-03-01T18:00:11Z

The latest round of merge conflicts was from #116013, but this should be good to go again.

bedevere-bot · 2024-03-01T18:12:12Z

🤖 New build scheduled with the buildbot fleet by @colesbury for commit 7a9c81a 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

…ter shutdown (python#115856)

A previous commit introduced a bug to `interpreter_clear()`: it set `interp->ceval.instrumentation_version` to 0, without making the corresponding change to `tstate->eval_breaker` (which holds a thread-local copy of the version). After this happens, Python code can still run due to object finalizers during a GC, and the version check in bytecodes.c will see a different result than the one in instrumentation.c, causing an infinite loop.

The fix itself is straightforward: clear `tstate->eval_breaker` when clearing `interp->ceval.instrumentation_version`.

swtaarrs requested review from ericsnowcurrently, gvanrossum and markshannon as code owners February 23, 2024 15:50

bedevere-app bot added the awaiting review label Feb 23, 2024

bedevere-app bot mentioned this pull request Feb 23, 2024

Coverage.py test suite failure since commit 0749244d134 #115832

Closed

AlexWaygood added the skip news label Feb 23, 2024

swtaarrs marked this pull request as draft February 23, 2024 18:50

bedevere-app bot removed the awaiting review label Feb 23, 2024

Fix failing tests: don't pass inactive tstate to _Py_Instrument(), si…

8552fe0

…nce it doesn't have an up-to-date monitoring version

swtaarrs marked this pull request as ready for review February 23, 2024 19:37

bedevere-app bot added the awaiting review label Feb 23, 2024

swtaarrs linked an issue Feb 27, 2024 that may be closed by this pull request

Coverage.py test suite failure since commit 0749244d134 #115832

Closed

swtaarrs requested a review from colesbury February 27, 2024 22:42

colesbury reviewed Feb 28, 2024

swtaarrs added 3 commits February 28, 2024 12:33

Move everything back to operate on interp instead of tstate, assert t…

2d261f5

…he versions match in global_version()

Merge remote-tracking branch 'upstream/main' into cpython-coverage.py

a1b2f61

Merge remote-tracking branch 'upstream/main' into cpython-coverage.py

7a9c81a

colesbury added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 1, 2024

bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 1, 2024

colesbury approved these changes Mar 1, 2024

bedevere-app bot added awaiting merge and removed awaiting review labels Mar 1, 2024

colesbury merged commit 0adfa84 into python:main Mar 4, 2024

bedevere-app bot removed the awaiting merge label Mar 4, 2024

colesbury merged 5 commits into python:main from swtaarrs:cpython-coverage.py

4 participants
