fix(ci): use uv for python deps, keep pip on 3.12 license-check leg#4636

Merged: Yicong-Huang merged 8 commits into apache:main from Yicong-Huang:fix/restore-uv-pip-in-build on May 3, 2026
Conversation

@Yicong-Huang Yicong-Huang commented May 2, 2026

What changes were proposed in this PR?

#4521 moved the Python dependency install in the scala and python matrix jobs to uv pip install --system for faster installs. #4597 unintentionally rewrote those lines back to stock pip install while inlining the binary license checks, and every subsequent rebase has carried the regression forward. This PR restores uv, with a targeted carve-out for the leg that drives the binary license check.

Why the carve-out

The python job's 3.12 matrix entry is the only leg that runs pip-licenses and feeds the result into bin/licensing/check_binary_deps.py python. That tool compares the installed Python tree against amber/LICENSE-binary-python, which is generated with pip and tracks what the production image installs. The uv and pip resolvers can land on different versions of unpinned transitive dependencies. If the 3.12 leg installed with uv, check_binary_deps.py would report false positives caused by resolver drift, forcing us to chase that drift in LICENSE-binary-python (and diverge from production).

So: stock pip install on the 3.12 leg only; uv everywhere else.
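
To make the resolver-drift concern concrete, here is a minimal sketch of a manifest comparison. It is not the real check_binary_deps.py: the function name, the name-to-version mapping, and the package names and versions are all made up for illustration, and the actual tool may compare different fields.

```python
from typing import Dict, Optional, Tuple

Versions = Dict[str, str]


def find_drift(
    manifest: Versions, installed: Versions
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each mismatching package to (manifest_version, installed_version)."""
    mismatches = {}
    for name in sorted(set(manifest) | set(installed)):
        expected = manifest.get(name)
        actual = installed.get(name)
        if expected != actual:
            mismatches[name] = (expected, actual)
    return mismatches


# Hypothetical data: the manifest was generated from a pip install.
manifest = {"app-dep": "2.0.0", "transitive-dep": "1.4.0"}
# A pip-installed tree matches the manifest it was generated from.
pip_tree = {"app-dep": "2.0.0", "transitive-dep": "1.4.0"}
# A different resolver may pick another version of an unpinned transitive.
uv_tree = {"app-dep": "2.0.0", "transitive-dep": "1.5.0"}

print(find_drift(manifest, pip_tree))  # {}
print(find_drift(manifest, uv_tree))   # {'transitive-dep': ('1.4.0', '1.5.0')}
```

In this toy setup the second result is the kind of mismatch the carve-out avoids: nothing changed in the project, yet the check would flag a difference.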

Per-step shape

  • scala job → Install dependencies: uv pip install --system. Its license check is jar-only, so Python resolver differences don't matter here.
  • python job → Install dependencies: branches on matrix.python-version. 3.12 keeps pip install; 3.10, 3.11, 3.13 use uv pip install --system.
  • python job → Install dev dependencies: uv pip install --system. Runs post-snapshot, so uv is safe on all legs.

No behaviour change for the license check itself. Other legs gain install speed.
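
The per-step rules above can be summarised as a small decision function. This is only a sketch of the selection logic, assuming the job and step names listed in this description; the real workflow expresses it in build.yml, and the function names here are hypothetical. The package or requirements arguments that follow the command are deliberately left out.

```python
from typing import List, Optional

# The matrix entry that drives the binary license check, per the description.
LICENSE_CHECK_PYTHON = "3.12"


def choose_installer(job: str, step: str, python_version: Optional[str] = None) -> str:
    """Return "pip" or "uv" for a given job, step, and matrix entry."""
    if (
        job == "python"
        and step == "Install dependencies"
        and python_version == LICENSE_CHECK_PYTHON
    ):
        # Only this leg feeds pip-licenses, so it keeps stock pip.
        return "pip"
    return "uv"


def install_command(installer: str) -> List[str]:
    """Command prefix for the chosen installer (arguments omitted)."""
    if installer == "pip":
        return ["pip", "install"]
    return ["uv", "pip", "install", "--system"]


for version in ("3.10", "3.11", "3.12", "3.13"):
    installer = choose_installer("python", "Install dependencies", version)
    print(version, " ".join(install_command(installer)))
```

The dev-dependency step and the scala job fall through to uv regardless of Python version, which mirrors the "uv everywhere else" rule.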

Any related issues, documentation, discussions?

Closes #4635. Restores #4521. Regression introduced by #4597.

How was this PR tested?

Will be exercised by this PR's own scala and python matrices. The expected signal:

  • scala job: install step uses uv, tests still run.
  • python 3.10 / 3.11 / 3.13 legs: install step uses uv.
  • python 3.12 leg: install step uses pip; pip-licenses manifest unchanged; check_binary_deps.py python passes.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7 (Claude Code)

@github-actions github-actions Bot added fix ci changes related to CI labels May 2, 2026
@Yicong-Huang Yicong-Huang added the release/v1.1.0-incubating back porting to release/v1.1.0-incubating label May 2, 2026
@Yicong-Huang (Contributor, Author)

@bobbai00 can you help me with the license?

@aglinxinyuan aglinxinyuan left a comment

LGTM!

Yicong-Huang added a commit that referenced this pull request May 2, 2026
…lure (#4638)

### What changes were proposed in this PR?

Tighten the scala job in `build.yml`:

- Drop `Compile with sbt: sbt clean package` — its `package` output was
unused and it re-cleaned a tree the dist step had just compiled.
- Drop the leading `clean;` from the dist step so it can reuse the lint
compile.
- Merge `scalafmt`, `scalafix`, and all per-module `dist` commands into
a single `sbt` invocation with each as its own argument, so the whole
chain runs in one JVM and sbt exits at the first failing command.
- Move `Create Databases` ahead of any sbt step (the JOOQ source
generators connect to `texera_db` during compile).
- Move `Install dependencies` (pip) just before `Run backend tests`,
since only the test step needs the python deps.

New step order:

```
Create Databases
Setup sbt launcher / coursier cache
sbt scalafmtCheckAll "scalafixAll --check" <Service>/dist ...   # one JVM, fail-fast
Unzip / license check / audit
Install dependencies (pip)
Create texera_db_for_test_cases
Set docker-java API version
Run backend tests
```
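
As a rough illustration of "each as its own argument", the sketch below assembles the argument vector for the single sbt invocation. The helper name and the service names `ServiceA` and `ServiceB` are placeholders, not modules from the repository; only the three command shapes come from the step order above.

```python
import shlex
from typing import List, Sequence


def build_sbt_argv(services: Sequence[str]) -> List[str]:
    """Build one sbt invocation: lint commands first, then one dist per service."""
    argv = ["sbt", "scalafmtCheckAll", "scalafixAll --check"]
    # Each dist task is a separate argument, so sbt runs them in order.
    argv += [f"{service}/dist" for service in services]
    return argv


argv = build_sbt_argv(["ServiceA", "ServiceB"])
print(shlex.join(argv))
# sbt scalafmtCheckAll 'scalafixAll --check' ServiceA/dist ServiceB/dist
```

Keeping `scalafixAll --check` as a single list element is what the quoting in the shell form achieves: sbt receives it as one command with an option rather than two commands.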

### Any related issues, documentation, discussions?

Closes #4637.

### How was this PR tested?

Exercised by this PR's own scala matrix. Each individual command
(scalafmt, scalafix, dist, license check, audit, tests) is unchanged;
only ordering, the merged sbt invocation, and the removal of redundant
`sbt clean package` differ.

Timing comparison on the scala job, sbt-touching steps only (run
[25239784635](https://github.com/apache/texera/actions/runs/25239784635)
before, run
[25241165819](https://github.com/apache/texera/actions/runs/25241165819)
after):

| step | before | after |
|---|---|---|
| Lint with scalafmt | 45 s | (merged) |
| Build distributable bundles (`sbt 'clean; X/dist; ...'`) | 3 m 4 s | (merged) |
| Compile with sbt (`sbt clean package`) | 1 m 26 s | removed |
| Lint with scalafix | 47 s | (merged) |
| **Combined `sbt scalafmtCheckAll "scalafixAll --check" X/dist ...`** | — | **4 m 31 s** |
| sbt subtotal | **6 m 2 s** | **4 m 31 s** |

Net savings on the sbt portion ~1 m 30 s (matches the dropped redundant
compile plus one fewer sbt JVM cold-start). uv pip migration is
independent (#4636) and would shave another ~45 s off the python
`Install dependencies` step.
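
The subtotal and the net saving can be rechecked from the per-step timings in the table. The snippet below only redoes that arithmetic; the helper names are ad hoc.

```python
def seconds(minutes: int = 0, secs: int = 0) -> int:
    """Convert a 'M m S s' duration to seconds."""
    return minutes * 60 + secs


def fmt(total: int) -> str:
    """Format seconds back into the table's 'M m S s' style."""
    m, s = divmod(total, 60)
    return f"{m} m {s} s"


# "before" column of the timing table.
before = {
    "Lint with scalafmt": seconds(0, 45),
    "Build distributable bundles": seconds(3, 4),
    "Compile with sbt": seconds(1, 26),
    "Lint with scalafix": seconds(0, 47),
}
after_total = seconds(4, 31)  # the combined sbt invocation

before_total = sum(before.values())
print(fmt(before_total))                # 6 m 2 s
print(fmt(before_total - after_total))  # 1 m 31 s
```

The difference of 1 m 31 s is consistent with the quoted saving of roughly 1 m 30 s.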

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
github-actions Bot pushed a commit that referenced this pull request May 2, 2026
…lure (#4638)

(backported from commit 8a4f2dd)
apache#4521 had the python dep install in the scala and python matrix jobs
on `uv pip install --system` for speed. apache#4597 unintentionally rewrote
those lines back to stock pip while inlining the binary license
checks; the regression has been carried forward by every subsequent
rebase. Restore uv for speed.

The python job's 3.12 leg is the only one that drives the binary
license check (`pip-licenses` -> `check_binary_deps.py python`). Keep
stock pip on that leg so the resolved versions match
`amber/LICENSE-binary-python`, which is generated with pip and tracks
what the production image installs. uv and pip can resolve unpinned
transitives differently; without this carve-out the check would false-
positive on resolver drift, and we'd be forced to update LICENSE-
binary-python to chase the CI side (production still uses pip).

Other python legs (3.10, 3.11, 3.13) use uv. The scala job's binary
license check is jar-only, so it uses uv too. Dev deps install runs
post-snapshot so it can use uv on all legs.

Closes apache#4635
@Yicong-Huang Yicong-Huang force-pushed the fix/restore-uv-pip-in-build branch from fa8b81d to 874d03d Compare May 2, 2026 19:33
@Yicong-Huang Yicong-Huang changed the title fix(ci): restore uv pip in scala and python jobs fix(ci): use uv for python deps, keep pip on 3.12 license-check leg May 2, 2026
codecov-commenter commented May 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 43.98%. Comparing base (f0e17c2) to head (8cf2ab8).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #4636      +/-   ##
============================================
+ Coverage     43.94%   43.98%   +0.03%     
- Complexity     2126     2196      +70     
============================================
  Files           957      957              
  Lines         34072    34941     +869     
  Branches       3753     3893     +140     
============================================
+ Hits          14974    15369     +395     
- Misses        18309    18769     +460     
- Partials        789      803      +14     
| Flag | Coverage Δ |
|---|---|
| access-control-service | 28.12% <ø> (ø) |
| agent-service | 33.49% <ø> (-0.24%) ⬇️ |
| amber | 42.98% <ø> (+0.40%) ⬆️ |
| computing-unit-managing-service | 0.00% <ø> (ø) |
| config-service | 0.00% <ø> (ø) |
| file-service | 32.40% <ø> (-0.85%) ⬇️ |
| frontend | 34.97% <ø> (-0.31%) ⬇️ |
| python | 86.30% <ø> (-0.17%) ⬇️ |
| workflow-compiling-service | 47.72% <ø> (ø) |

@Yicong-Huang Yicong-Huang enabled auto-merge (squash) May 2, 2026 19:55
@Yicong-Huang Yicong-Huang added the emergency Pull requests that need to be merged ASAP label May 3, 2026
@Yicong-Huang Yicong-Huang enabled auto-merge (squash) May 3, 2026 16:48
@Yicong-Huang Yicong-Huang disabled auto-merge May 3, 2026 17:35
Yicong-Huang added a commit that referenced this pull request May 3, 2026
… unique (#4868)

### What changes were proposed in this PR?

Backport [#4717](#4717) (commit
`6ae0c46312ca744b4f761c88f1ec172ba0d41d13` on `main`) onto
`release/v1.1.0-incubating`.

`ExecutorManager` previously used a per-instance `executor_version`
counter for the `udf-vN` tmp module name, so a fresh `ExecutorManager`
always produced `udf-v1` and collided with whatever `udf-v1` was already
cached in `sys.modules` from an earlier instance. The post-collision
`clear()` + `importlib.reload()` recovery silently returned a stale
class on Python 3.11. Lift the counter to a class-level
`itertools.count(1)` so module names are unique across every instance in
the same Python process; the recovery branch becomes unreachable and is
removed.
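
The naming collision and its fix can be reproduced in isolation. The two classes below are simplified stand-ins written for this sketch, not the actual `ExecutorManager`; only the `udf-vN` naming scheme and the class-level `itertools.count(1)` idea come from the description above.

```python
import itertools


class PerInstanceNaming:
    """Old scheme: every new instance restarts its counter."""

    def __init__(self) -> None:
        self.executor_version = 0

    def next_module_name(self) -> str:
        self.executor_version += 1
        return f"udf-v{self.executor_version}"


class ProcessWideNaming:
    """New scheme: one counter shared by all instances in the process."""

    _versions = itertools.count(1)  # class attribute, created once

    def next_module_name(self) -> str:
        return f"udf-v{next(ProcessWideNaming._versions)}"


a, b = PerInstanceNaming(), PerInstanceNaming()
print(a.next_module_name(), b.next_module_name())  # udf-v1 udf-v1

c, d = ProcessWideNaming(), ProcessWideNaming()
print(c.next_module_name(), d.next_module_name())  # udf-v1 udf-v2
```

With the shared counter, two managers in the same process never hand out the same module name, so nothing already cached in `sys.modules` under that name can be picked up by mistake.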

This unblocks #4636 (and any future `release/*`-labelled PR): the
auto-backport leg's
`core/runnables/test_main_loop.py::test_batch_dp_thread_can_process_batch`
was failing on the 3.11 matrix entry against this branch with
`AttributeError: 'TestOperator' object has no attribute 'count'` (the
stale-class symptom), and the failed test left a non-daemon
`main_loop_thread` alive that prevented pytest from exiting — surfacing
as a 30+ minute hang.
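
The hang follows from how Python shuts down: the interpreter waits for every live non-daemon thread before exiting. A minimal sketch, using a stand-in thread rather than the real main loop, shows how such threads can be listed; the helper name is hypothetical.

```python
import threading
from typing import List


def threads_blocking_exit() -> List[str]:
    """Names of live non-daemon threads, other than the main thread."""
    main = threading.main_thread()
    return [
        t.name
        for t in threading.enumerate()
        if t is not main and t.is_alive() and not t.daemon
    ]


# Stand-in for a loop thread left behind by a failed test: it waits forever
# unless someone signals it to stop.
stop = threading.Event()
worker = threading.Thread(target=stop.wait, name="main_loop_thread")
worker.start()

print(threads_blocking_exit())  # includes 'main_loop_thread' while it is alive

# Signalling and joining the thread lets the process exit normally.
stop.set()
worker.join()
print(threads_blocking_exit())
```

If `stop.set()` were never called, the process running this script would not exit on its own, which is the same shape as the pytest hang described above.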

### Any related issues, documentation, discussions?

Backports #4717. Original issue: #4705.

### How was this PR tested?

Cherry-pick of an already-reviewed and merged commit. One conflict in
`test_executor_manager.py` (release branch lacked the trailing
`TestUpdateExecutor` test class that #4717 introduced); resolved by
taking the incoming version verbatim. Local syntax + `ruff format
--check` pass on both modified files. CI on this PR will exercise the
change against the release branch's full Python matrix.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (claude-opus-4-7)
@Yicong-Huang Yicong-Huang enabled auto-merge (squash) May 3, 2026 18:15
@Yicong-Huang Yicong-Huang merged commit a3d43db into apache:main May 3, 2026
67 of 69 checks passed
@Yicong-Huang Yicong-Huang deleted the fix/restore-uv-pip-in-build branch May 3, 2026 18:35
Yicong-Huang added a commit that referenced this pull request May 3, 2026
…4636)

(backported from commit a3d43db)

Generated-by: Claude Opus 4.7 (Claude Code)
github-actions Bot commented May 3, 2026

Backport to release/v1.1.0-incubating succeeded as dad6380.

SarahAsad23 pushed a commit to SarahAsad23/texera that referenced this pull request May 4, 2026
…lure (apache#4638)

Labels

ci changes related to CI emergency Pull requests that need to be merged ASAP fix release/v1.1.0-incubating back porting to release/v1.1.0-incubating

Development

Successfully merging this pull request may close these issues.

Restore uv pip install in scala and python CI jobs

3 participants