fix(ci): use uv for python deps, keep pip on 3.12 license-check leg #4636
Merged
Yicong-Huang merged 8 commits intoMay 3, 2026
Contributor
Author
@bobbai00 can you help me with the license?
Yicong-Huang
added a commit
that referenced
this pull request
May 2, 2026
…lure (#4638)

### What changes were proposed in this PR?

Tighten the scala job in `build.yml`:

- Drop `Compile with sbt: sbt clean package`: its `package` output was unused and it re-cleaned a tree the dist step had just compiled.
- Drop the leading `clean;` from the dist step so it can reuse the lint compile.
- Merge `scalafmt`, `scalafix`, and all per-module `dist` commands into a single `sbt` invocation with each as its own argument, so the whole chain runs in one JVM and sbt exits at the first failing command.
- Move `Create Databases` ahead of any sbt step (the JOOQ source generators connect to `texera_db` during compile).
- Move `Install dependencies` (pip) just before `Run backend tests`, since only the test step needs the python deps.

New step order:

```
Create Databases
Setup sbt launcher / coursier cache
sbt scalafmtCheckAll "scalafixAll --check" <Service>/dist ...   # one JVM, fail-fast
Unzip / license check / audit
Install dependencies (pip)
Create texera_db_for_test_cases
Set docker-java API version
Run backend tests
```

### Any related issues, documentation, discussions?

Closes #4637.

### How was this PR tested?

Exercised by this PR's own scala matrix. Each individual command (scalafmt, scalafix, dist, license check, audit, tests) is unchanged; only ordering, the merged sbt invocation, and the removal of redundant `sbt clean package` differ.

Timing comparison on the scala job, sbt-touching steps only (run [25239784635](https://github.com/apache/texera/actions/runs/25239784635) before, run [25241165819](https://github.com/apache/texera/actions/runs/25241165819) after):

| step | before | after |
|---|---|---|
| Lint with scalafmt | 45 s | (merged) |
| Build distributable bundles (`sbt 'clean; X/dist; ...'`) | 3 m 4 s | (merged) |
| Compile with sbt (`sbt clean package`) | 1 m 26 s | removed |
| Lint with scalafix | 47 s | (merged) |
| **Combined `sbt scalafmtCheckAll "scalafixAll --check" X/dist ...`** | n/a | **4 m 31 s** |
| sbt subtotal | **6 m 2 s** | **4 m 31 s** |

Net savings on the sbt portion ~1 m 30 s (matches the dropped redundant compile plus one fewer sbt JVM cold-start). uv pip migration is independent (#4636) and would shave another ~45 s off the python `Install dependencies` step.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
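The subtotal and net-savings figures in the timing comparison above can be re-derived from the per-step durations. The following is only a quick arithmetic check; the helper `to_seconds` and the dictionary layout are illustrative names, not anything from the PR.

```python
# Re-derive the sbt subtotal and the net savings from the per-step
# timings quoted in the comparison. All names here are illustrative.

def to_seconds(minutes: int, seconds: int) -> int:
    """Convert an 'M m S s' duration into whole seconds."""
    return minutes * 60 + seconds

# "before" column: four separate sbt-touching steps.
before_steps = {
    "Lint with scalafmt": to_seconds(0, 45),
    "Build distributable bundles": to_seconds(3, 4),
    "Compile with sbt": to_seconds(1, 26),
    "Lint with scalafix": to_seconds(0, 47),
}

# "after" column: the single combined sbt invocation.
after_combined = to_seconds(4, 31)

before_subtotal = sum(before_steps.values())
net_savings = before_subtotal - after_combined

print(divmod(before_subtotal, 60))  # (6, 2)  -> 6 m 2 s
print(divmod(net_savings, 60))      # (1, 31) -> 1 m 31 s
```

The result, 1 m 31 s, agrees with the "~1 m 30 s" quoted in the commit message.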
github-actions Bot
pushed a commit
that referenced
this pull request
May 2, 2026
…lure (#4638) (backported from commit 8a4f2dd)
apache#4521 had the python dep install in the scala and python matrix jobs on `uv pip install --system` for speed. apache#4597 unintentionally rewrote those lines back to stock pip while inlining the binary license checks; the regression has been carried forward by every subsequent rebase. Restore uv for speed.

The python job's 3.12 leg is the only one that drives the binary license check (`pip-licenses` -> `check_binary_deps.py python`). Keep stock pip on that leg so the resolved versions match `amber/LICENSE-binary-python`, which is generated with pip and tracks what the production image installs. uv and pip can resolve unpinned transitives differently; without this carve-out the check would false-positive on resolver drift, and we'd be forced to update LICENSE-binary-python to chase the CI side (production still uses pip).

Other python legs (3.10, 3.11, 3.13) use uv. The scala job's binary license check is jar-only, so it uses uv too. Dev deps install runs post-snapshot so it can use uv on all legs.

Closes apache#4635
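To make the resolver-drift concern concrete, here is a minimal sketch of how two installs of the same requirements can disagree on an unpinned transitive dependency. The package names, the versions, and the `find_drift` helper are all hypothetical. This is not how `check_binary_deps.py` is implemented; it only illustrates why a manifest recorded from a pip install could flag a tree produced by a different resolver.

```python
from typing import Dict, Tuple


def find_drift(
    expected: Dict[str, str], installed: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Return {package: (expected_version, installed_version)} for each mismatch.

    Packages present on only one side are reported with '<absent>'.
    """
    drift = {}
    for name in sorted(set(expected) | set(installed)):
        want = expected.get(name, "<absent>")
        got = installed.get(name, "<absent>")
        if want != got:
            drift[name] = (want, got)
    return drift


# Hypothetical manifest, standing in for versions recorded from a pip install.
manifest = {"direct-dep": "2.0.0", "transitive-dep": "1.4.0"}

# Hypothetical install trees: same pinned direct dependency, but the
# unpinned transitive one resolved to a different version in the second.
same_resolver_tree = {"direct-dep": "2.0.0", "transitive-dep": "1.4.0"}
other_resolver_tree = {"direct-dep": "2.0.0", "transitive-dep": "1.5.0"}

print(find_drift(manifest, same_resolver_tree))   # {}
print(find_drift(manifest, other_resolver_tree))  # {'transitive-dep': ('1.4.0', '1.5.0')}
```

In this toy setup the second tree is a perfectly valid install, yet it no longer matches the recorded manifest, which is the kind of false positive the carve-out is meant to avoid.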
fa8b81d to
874d03d
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #4636 +/- ##
============================================
+ Coverage 43.94% 43.98% +0.03%
- Complexity 2126 2196 +70
============================================
Files 957 957
Lines 34072 34941 +869
Branches 3753 3893 +140
============================================
+ Hits 14974 15369 +395
- Misses 18309 18769 +460
- Partials 789 803 +14
Flags with carried forward coverage won't be shown.
Yicong-Huang
added a commit
that referenced
this pull request
May 3, 2026
… unique (#4868)

### What changes were proposed in this PR?

Backport [#4717](#4717) (commit `6ae0c46312ca744b4f761c88f1ec172ba0d41d13` on `main`) onto `release/v1.1.0-incubating`.

`ExecutorManager` previously used a per-instance `executor_version` counter for the `udf-vN` tmp module name, so a fresh `ExecutorManager` always produced `udf-v1` and collided with whatever `udf-v1` was already cached in `sys.modules` from an earlier instance. The post-collision `clear()` + `importlib.reload()` recovery silently returned a stale class on Python 3.11. Lift the counter to a class-level `itertools.count(1)` so module names are unique across every instance in the same Python process; the recovery branch becomes unreachable and is removed.

This unblocks #4636 (and any future `release/*`-labelled PR): the auto-backport leg's `core/runnables/test_main_loop.py::test_batch_dp_thread_can_process_batch` was failing on the 3.11 matrix entry against this branch with `AttributeError: 'TestOperator' object has no attribute 'count'` (the stale-class symptom), and the failed test left a non-daemon `main_loop_thread` alive that prevented pytest from exiting, surfacing as a 30+ minute hang.

### Any related issues, documentation, discussions?

Backports #4717. Original issue: #4705.

### How was this PR tested?

Cherry-pick of an already-reviewed and merged commit. One conflict in `test_executor_manager.py` (release branch lacked the trailing `TestUpdateExecutor` test class that #4717 introduced); resolved by taking the incoming version verbatim. Local syntax + `ruff format --check` pass on both modified files. CI on this PR will exercise the change against the release branch's full Python matrix.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (claude-opus-4-7)
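The difference between a per-instance counter and a class-level `itertools.count(1)` can be sketched as follows. The classes `PerInstanceNamer` and `SharedNamer` and the helper `names_from_fresh_instances` are hypothetical stand-ins written for this illustration; they are not the actual `ExecutorManager` code, and only the `udf-vN` naming scheme is taken from the commit message.

```python
import itertools
from typing import List


class PerInstanceNamer:
    """Old shape (sketch): every new instance restarts its counter at 1."""

    def __init__(self) -> None:
        self.executor_version = 0

    def next_module_name(self) -> str:
        self.executor_version += 1
        return f"udf-v{self.executor_version}"


class SharedNamer:
    """New shape (sketch): one counter shared by all instances in the process."""

    _versions = itertools.count(1)  # class attribute, created once

    def next_module_name(self) -> str:
        return f"udf-v{next(SharedNamer._versions)}"


def names_from_fresh_instances(namer_cls, n: int) -> List[str]:
    """Create n fresh instances and ask each one for its first module name."""
    return [namer_cls().next_module_name() for _ in range(n)]


colliding = names_from_fresh_instances(PerInstanceNamer, 3)
unique = names_from_fresh_instances(SharedNamer, 3)

print(colliding)                        # ['udf-v1', 'udf-v1', 'udf-v1']
print(len(set(unique)) == len(unique))  # True
```

With the per-instance counter, every fresh instance hands out the same first name, so a later instance would reuse a name that an earlier one may have left in `sys.modules`. With the shared counter, names never repeat within one process, which is why the collision-recovery branch described above is no longer needed.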
Yicong-Huang
added a commit
that referenced
this pull request
May 3, 2026
…4636)

## What changes were proposed in this PR?

#4521 had the python dep install in the scala and python matrix jobs on `uv pip install --system` for install-speed. #4597 unintentionally rewrote those lines back to stock `pip install` while inlining the binary license checks, and the regression has been carried forward by every subsequent rebase. Restore uv, but with a targeted carve-out for the leg that drives the binary license check.

### Why the carve-out

The python job's `3.12` matrix entry is the only leg that runs `pip-licenses` and feeds the result into `bin/licensing/check_binary_deps.py python`. That tool compares the installed Python tree against `amber/LICENSE-binary-python`, which is generated **with pip** and tracks what the production image installs. uv and pip resolvers can land on different versions of unpinned transitives. If the 3.12 leg installs with uv, `check_binary_deps.py` would false-positive on resolver drift, forcing us to chase those drifts in `LICENSE-binary-python` (and diverge from production).

So: stock `pip install` on the 3.12 leg only; uv everywhere else.

### Per-step shape

- **scala job → Install dependencies**: `uv pip install --system`. Its license check is jar-only, so Python resolver differences don't matter here.
- **python job → Install dependencies**: branches on `matrix.python-version`. `3.12` keeps `pip install`; `3.10`, `3.11`, `3.13` use `uv pip install --system`.
- **python job → Install dev dependencies**: `uv pip install --system`. Runs post-snapshot, so uv is safe on all legs.

No behaviour change for the license check itself. Other legs gain install speed.

## Any related issues, documentation, discussions?

Closes #4635. Restores #4521. Regression introduced by #4597.

## How was this PR tested?

Will be exercised by this PR's own scala and python matrices. The expected signal:

- [x] scala job: install step uses uv, tests still run.
- [x] python 3.10 / 3.11 / 3.13 legs: install step uses uv.
- [x] python 3.12 leg: install step uses pip; pip-licenses manifest unchanged; `check_binary_deps.py python` passes.

## Was this PR authored or co-authored using generative AI tooling?

(backported from commit a3d43db)

Generated-by: Claude Opus 4.7 (Claude Code)
Contributor
|
Backport to |
SarahAsad23
pushed a commit
to SarahAsad23/texera
that referenced
this pull request
May 4, 2026
…lure (apache#4638)
What changes were proposed in this PR?
#4521 had the python dep install in the scala and python matrix jobs on `uv pip install --system` for install-speed. #4597 unintentionally rewrote those lines back to stock `pip install` while inlining the binary license checks, and the regression has been carried forward by every subsequent rebase. Restore uv, but with a targeted carve-out for the leg that drives the binary license check.

Why the carve-out
The python job's `3.12` matrix entry is the only leg that runs `pip-licenses` and feeds the result into `bin/licensing/check_binary_deps.py python`. That tool compares the installed Python tree against `amber/LICENSE-binary-python`, which is generated with pip and tracks what the production image installs. uv and pip resolvers can land on different versions of unpinned transitives. If the 3.12 leg installs with uv, `check_binary_deps.py` would false-positive on resolver drift, forcing us to chase those drifts in `LICENSE-binary-python` (and diverge from production).

So: stock `pip install` on the 3.12 leg only; uv everywhere else.

Per-step shape
- scala job → Install dependencies: `uv pip install --system`. Its license check is jar-only, so Python resolver differences don't matter here.
- python job → Install dependencies: branches on `matrix.python-version`. `3.12` keeps `pip install`; `3.10`, `3.11`, `3.13` use `uv pip install --system`.
- python job → Install dev dependencies: `uv pip install --system`. Runs post-snapshot, so uv is safe on all legs.

No behaviour change for the license check itself. Other legs gain install speed.
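The branching on `matrix.python-version` described above boils down to a small decision rule. In the real workflow this lives in the GitHub Actions YAML; the Python sketch below only mirrors the logic. The function name `install_command`, the constant `LICENSE_CHECK_VERSION`, and the `requirements.txt` path are assumptions made for illustration, not names taken from `build.yml`.

```python
from typing import List

# The matrix entry that runs pip-licenses, according to the PR description.
LICENSE_CHECK_VERSION = "3.12"


def install_command(python_version: str, requirements: str) -> List[str]:
    """Pick the installer argv for one leg of the python job's
    'Install dependencies' step (illustrative sketch only)."""
    if python_version == LICENSE_CHECK_VERSION:
        # Stock pip, so resolved versions line up with the pip-generated
        # amber/LICENSE-binary-python manifest.
        return ["pip", "install", "-r", requirements]
    # All other legs take the faster uv path.
    return ["uv", "pip", "install", "--system", "-r", requirements]


# 'requirements.txt' is a placeholder path, not necessarily the file CI uses.
for version in ("3.10", "3.11", "3.12", "3.13"):
    print(version, " ".join(install_command(version, "requirements.txt")))
```

Keeping the exception keyed on a single version string makes the carve-out easy to move if the license-check leg ever changes to a different Python version.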
Any related issues, documentation, discussions?
Closes #4635. Restores #4521. Regression introduced by #4597.
How was this PR tested?
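Will be exercised by this PR's own scala and python matrices. The expected signal:
- [x] scala job: install step uses uv, tests still run.
- [x] python 3.10 / 3.11 / 3.13 legs: install step uses uv.
- [x] python 3.12 leg: install step uses pip; pip-licenses manifest unchanged; `check_binary_deps.py python` passes.

Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.7 (Claude Code)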