Skip to content

fix(pyamber): make ExecutorManager module names process-globally unique#4717

Merged
Yicong-Huang merged 7 commits into
apache:mainfrom
Yicong-Huang:fix/executor-manager-module-name-singleton
May 3, 2026
Merged

fix(pyamber): make ExecutorManager module names process-globally unique#4717
Yicong-Huang merged 7 commits into
apache:mainfrom
Yicong-Huang:fix/executor-manager-module-name-singleton

Conversation

@Yicong-Huang
Copy link
Copy Markdown
Contributor

@Yicong-Huang Yicong-Huang commented May 3, 2026

What changes were proposed in this PR?

Lift ExecutorManager's tmp module-name counter from a per-instance attribute to a process-wide itertools.count(1). Generated module names (udf-vN) are now unique across every instance in the same Python process, so the if module_name in sys.modules: clear()+reload() recovery branch (which silently returned a stale class on 3.11) is unreachable and removed.

Production path is unchanged — a worker process holds exactly one ExecutorManager and its counter was already monotonic across reconfigures.

Any related issues, documentation, discussions?

Closes #4705.

How was this PR tested?

sbt 'WorkflowExecutionService / Test / testOnly test_executor_manager test_main_loop' — 23/23 pass. The combined run that previously leaked TestOperator from test_executor_manager.py into test_main_loop.py's fixtures is now clean.

test_executor_manager.py updated for the new semantics: assertions on executor_version == 0/1 and operator_module_name == "udf-v1" change to relative-ordering checks since the counter's absolute starting value depends on prior tests in the session.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7, 1M context)

@Yicong-Huang Yicong-Huang changed the title fix(amber): make ExecutorManager module names process-globally unique fix(pyamber): make ExecutorManager module names process-globally unique May 3, 2026
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 43.66%. Comparing base (e551e1b) to head (3d2af92).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #4717      +/-   ##
============================================
- Coverage     43.68%   43.66%   -0.02%     
+ Complexity     2087     2086       -1     
============================================
  Files           957      957              
  Lines         34077    34072       -5     
  Branches       3753     3753              
============================================
- Hits          14887    14879       -8     
- Misses        18395    18397       +2     
- Partials        795      796       +1     
Flag Coverage Δ
amber 41.96% <ø> (-0.02%) ⬇️
python 86.46% <100.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes cross-test contamination in PyAmber by making ExecutorManager’s temporary module names process-globally unique, preventing sys.modules name collisions that could cause stale operator classes to be reused (notably on Python 3.11).

Changes:

  • Replace per-ExecutorManager module-name/version counter with a class-level itertools.count(1) counter.
  • Remove the sys.modules “module already loaded” clear+reload branch since module-name collisions should no longer occur.
  • Update test_executor_manager.py assertions to validate module-name shape and monotonic ordering rather than fixed udf-v1 / version values.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
amber/src/main/python/core/architecture/managers/executor_manager.py Makes tmp module names process-unique via a class-level counter and simplifies import logic by removing the reload path.
amber/src/main/python/core/architecture/managers/test_executor_manager.py Adjusts tests to match process-wide counter semantics and checks monotonic module naming.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Each ExecutorManager instance previously started its own version
counter at 0, so the first module any instance loaded was always
udf-v1. With multiple instances in the same Python process, the
second instance's import_module hits the "module already loaded"
branch, the clear()+reload() recovery path then falls through a
3.11-specific importlib corner case that returns a stale class
(see apache#4705) — typically TestOperator from test_executor_manager.py
leaking into test_main_loop.py.

Lift the counter to a process-wide itertools.count, so module names
never collide and the import path always lands in the "fresh module"
branch. The clear()+reload() limb is now unreachable in practice
and is removed to avoid future readers thinking it's load-bearing.

Production behavior is unchanged: a single Python worker process
already has exactly one ExecutorManager and reconfigure (which
calls update_executor) already incremented the per-instance counter
so it never collided in production. The if-branch was effectively
test-only dead code — but its existence destabilized the 3.11 leg
of every PR.

Closes apache#4705.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Yicong-Huang Yicong-Huang force-pushed the fix/executor-manager-module-name-singleton branch from feaf7ae to e530573 Compare May 3, 2026 05:03
@Yicong-Huang Yicong-Huang disabled auto-merge May 3, 2026 06:45
aglinxinyuan and others added 2 commits May 3, 2026 01:21
…teExecutor

The previous commit removed the per-instance executor_version
attribute but left two TestUpdateExecutor cases still reading it:

- test_update_increments_executor_version_and_module_name asserted
  on the absolute counter value before/after update; rewrite as
  test_update_advances_module_name_monotonically and check that
  the udf-vN suffix increases by one (absolute starting value
  depends on prior tests in the same pytest session).
- test_repeated_updates_keep_carrying_the_running_state's final
  executor_version == 3 assertion is incidental to the running-
  state contract, drop it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Yicong-Huang Yicong-Huang enabled auto-merge (squash) May 3, 2026 16:30
@Yicong-Huang Yicong-Huang disabled auto-merge May 3, 2026 16:30
@Yicong-Huang Yicong-Huang enabled auto-merge (squash) May 3, 2026 16:31
@Yicong-Huang Yicong-Huang added the emergency Pull requests that need to be merged ASAP label May 3, 2026
@Yicong-Huang Yicong-Huang merged commit 6ae0c46 into apache:main May 3, 2026
15 checks passed
@Yicong-Huang Yicong-Huang deleted the fix/executor-manager-module-name-singleton branch May 3, 2026 16:45
Yicong-Huang added a commit that referenced this pull request May 3, 2026
… unique (#4868)

### What changes were proposed in this PR?

Backport [#4717](#4717) (commit
`6ae0c46312ca744b4f761c88f1ec172ba0d41d13` on `main`) onto
`release/v1.1.0-incubating`.

`ExecutorManager` previously used a per-instance `executor_version`
counter for the `udf-vN` tmp module name, so a fresh `ExecutorManager`
always produced `udf-v1` and collided with whatever `udf-v1` was already
cached in `sys.modules` from an earlier instance. The post-collision
`clear()` + `importlib.reload()` recovery silently returned a stale
class on Python 3.11. Lift the counter to a class-level
`itertools.count(1)` so module names are unique across every instance in
the same Python process; the recovery branch becomes unreachable and is
removed.

This unblocks #4636 (and any future `release/*`-labelled PR): the
auto-backport leg's
`core/runnables/test_main_loop.py::test_batch_dp_thread_can_process_batch`
was failing on the 3.11 matrix entry against this branch with
`AttributeError: 'TestOperator' object has no attribute 'count'` (the
stale-class symptom), and the failed test left a non-daemon
`main_loop_thread` alive that prevented pytest from exiting — surfacing
as a 30+ minute hang.

### Any related issues, documentation, discussions?

Backports #4717. Original issue: #4705.

### How was this PR tested?

Cherry-pick of an already-reviewed and merged commit. One conflict in
`test_executor_manager.py` (release branch lacked the trailing
`TestUpdateExecutor` test class that #4717 introduced); resolved by
taking the incoming version verbatim. Local syntax + `ruff format
--check` pass on both modified files. CI on this PR will exercise the
change against the release branch's full Python matrix.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (claude-opus-4-7)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

emergency Pull requests that need to be merged ASAP engine fix python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ExecutorManager re-uses fixed module names; sys.modules pollution leaks TestOperator into other specs on Python 3.11

4 participants