Skip to content

fix(revert): "fix: add missing context switches for repeated state processing"#4552

Merged
aglinxinyuan merged 1 commit into
mainfrom
revert-4424-xinyuan-fix-state-processing
Apr 29, 2026
Merged

fix(revert): "fix: add missing context switches for repeated state processing"#4552
aglinxinyuan merged 1 commit into
mainfrom
revert-4424-xinyuan-fix-state-processing

Conversation

@aglinxinyuan
Copy link
Copy Markdown
Contributor

@aglinxinyuan aglinxinyuan commented Apr 29, 2026

Reverts #4424 since it might change the lifecycle of the Python engine.

Might be related to issue #4545

We don't know if #4424 really causes the issue, but reverting it for testing purposes.

@aglinxinyuan aglinxinyuan self-assigned this Apr 29, 2026
@aglinxinyuan aglinxinyuan changed the title Revert "fix: add missing context switches for repeated state processing" fix: revert "fix: add missing context switches for repeated state processing" Apr 29, 2026
@Yicong-Huang Yicong-Huang changed the title fix: revert "fix: add missing context switches for repeated state processing" revert:"fix: add missing context switches for repeated state processing" Apr 29, 2026
@Yicong-Huang Yicong-Huang changed the title revert:"fix: add missing context switches for repeated state processing" fix(revert):"fix: add missing context switches for repeated state processing" Apr 29, 2026
@Yicong-Huang Yicong-Huang changed the title fix(revert):"fix: add missing context switches for repeated state processing" fix(revert): "fix: add missing context switches for repeated state processing" Apr 29, 2026
@aglinxinyuan aglinxinyuan merged commit ae68fc9 into main Apr 29, 2026
15 of 20 checks passed
@aglinxinyuan aglinxinyuan deleted the revert-4424-xinyuan-fix-state-processing branch April 29, 2026 01:11
aglinxinyuan added a commit that referenced this pull request Apr 30, 2026
### What changes were proposed in this PR?

Restores reliable state-output emission for Python operators after the
#4552 revert. After this PR, both per-input-state outputs
(`Operator.process_state(...)`) and the end-of-input-port output
(`Operator.produce_state_on_finish(...)`) reach downstream channels.

`MainLoop.process_input_state` previously did two `_switch_context()`
calls with the read of `current_output_state` in between. The executor
only writes that field during the *second* switch — so `MainLoop` always
captured the previous cycle's value, and the finish-state set on
`EndChannel` ended up in `current_output_state` after `MainLoop` had
returned, never to be read again. This PR collapses the read to a single
switch + read-after, drops the duplicate post-init and end-of-body
switches in `DataProcessor.run`, and makes the run-loop's input dispatch
peek-then-consume so `current_internal_marker` keeps the atomic
single-consume semantics whose absence was the root cause of #4545.

<details>
<summary>History — third attempt at this fix</summary>

- #4421 reported that a Python operator could process its first state
input but not its second.
- PR #4424 added three `_switch_context()` calls to keep `MainLoop` and
`DataProcessor` in sync, closed #4421, but changed
`current_internal_marker` lifetime and broke the source-propagation case
in `ReconfigurationSpec` (#4545).
- PR #4547 tried to restore atomic marker consumption on top of #4424
and re-enabled the source-propagation case in `ReconfigurationSpec`. CI
continued to fail.
- PR #4552 reverted #4424 outright as a stop-gap. State-processing is
back to its pre-#4424 broken state — see #4559.

</details>

### Any related issues, documentation, discussions?

Fixes #4559. Follow-up to #4421 / #4424 / #4545 / #4547 / #4552.

### How was this PR tested?

Existing `core/runnables/test_main_loop.py` tests pass unchanged. Added
three new tests:

- `test_process_state_can_emit_multiple_states` — stub-level coverage of
the #4421 "second state not processed" scenario.
- `test_main_loop_thread_can_process_state` — full real-thread coverage
of state DataElements and `produce_state_on_finish` on `EndChannel`.
Times out on plain `main` (#4559); passes on this branch.
- `test_main_loop_thread_can_process_state_after_tuple` — coverage for
the mixed `tuple → state` input sequence.

`ReconfigurationSpec`'s source-propagation case (re-enabled in #4547)
should be re-run on this branch to confirm the new handshake does not
re-introduce #4545.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Anthropic Claude Opus 4.7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants