perf(ci): use uv pip for faster Python dep install #4521
Merged
Yicong-Huang merged 2 commits on Apr 26, 2026
The `Install dependencies` step in both the `scala` and `python` jobs of `.github/workflows/github-action-build.yml` ran `pip install -r requirements.txt`, which spent ~1m 21s in the `scala` job and 1m 12s – 4m 41s across the `python` matrix. Adding the built-in pip wheel cache saved only ~14s because the bottleneck was install (resolve, extract, write site-packages for ~230 packages), not download.

Switch both jobs to `uv pip install --system`. uv is a Rust reimplementation of pip with no transitive deps, so installing it via `python -m pip install uv` adds only ~3s, and the same wheel set then installs in ~10s instead of ~70s.

No third-party action is added (uv itself is fetched as a regular pip package), so this stays inside the ASF Infra GitHub Actions allowlist already used by this repo.

Closes apache#4519

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
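As a rough sketch of what the switched `Install dependencies` step runs (not the exact workflow contents): the function name `install_python_deps` and passing the requirements path as an argument are assumptions made for illustration; the two commands themselves are the ones named in the commit message.

```shell
# Install Python dependencies with uv instead of stock pip.
# $1: path to the requirements file (the exact path is an assumption here).
install_python_deps() {
  # uv is a regular PyPI package, so stock pip can bootstrap it
  # without adding any extra GitHub Action.
  python -m pip install uv &&
    # --system installs into the active interpreter's environment
    # instead of requiring a virtualenv.
    uv pip install --system -r "$1"
}
```

Inside a workflow step this would be called from the `run:` block, for example `install_python_deps requirements.txt`.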
Yicong-Huang added a commit that referenced this pull request on Apr 26, 2026
### What changes were proposed in this PR?

Removes two PyPI backport packages from `amber/requirements.txt`:

- `typing==3.7.4.3`: backport of the `typing` stdlib module for Python <3.5. Has been part of the stdlib since 3.5.
- `dataclasses==0.6`: backport of the `dataclasses` stdlib module for Python <3.7. Has been part of the stdlib since 3.7.

This repo's CI matrix is Python 3.10–3.13, so both are obsolete. Installing them on supported versions is wasted CI time, and the PyPI `typing` package can shadow the stdlib version in subtle ways because its API is frozen at 3.7-era.

`typing_extensions==4.14.1` (line 34, kept) is a different, still-maintained package that backports *new* typing features to older Pythons; it's correctly retained.

### Any related issues, documentation, discussions?

Closes #4522. Surfaced during investigation in #4519/#4521.

### How was this PR tested?

CI

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
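A quick way to check a requirements file for these two backports, as a sketch only: the helper name `find_stdlib_backports` is hypothetical, and the pattern assumes exact `==` pins like the ones listed above. The anchored pattern does not match `typing_extensions`, which should stay.

```shell
# Print (with line numbers) any requirement lines that pin the obsolete
# `typing` or `dataclasses` backports.
# $1: path to a requirements file.
# Follows grep's exit convention: 0 if a backport is found, 1 if none.
find_stdlib_backports() {
  # '^typing==' requires '==' right after the name, so
  # 'typing_extensions==...' is not reported.
  grep -nE '^(typing|dataclasses)==' "$1"
}
```

For example, `find_stdlib_backports amber/requirements.txt` should print nothing and return 1 once the two pins are removed.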
This was referenced Apr 28, 2026
Yicong-Huang pushed a commit to Yicong-Huang/texera that referenced this pull request on May 2, 2026
apache#4521 had the python dep install in the scala and python matrix jobs on `uv pip install --system` for speed. apache#4597 unintentionally rewrote those lines back to stock pip while inlining the binary license checks; the regression has been carried forward by every subsequent rebase. Restore uv for speed.

The python job's 3.12 leg is the only one that drives the binary license check (`pip-licenses` -> `check_binary_deps.py python`). Keep stock pip on that leg so the resolved versions match `amber/LICENSE-binary-python`, which is generated with pip and tracks what the production image installs. uv and pip can resolve unpinned transitives differently; without this carve-out the check would false-positive on resolver drift, and we'd be forced to update LICENSE-binary-python to chase the CI side (production still uses pip).

Other python legs (3.10, 3.11, 3.13) use uv. The scala job's binary license check is jar-only, so it uses uv too. Dev deps install runs post-snapshot so it can use uv on all legs.

Closes apache#4635
Yicong-Huang added a commit that referenced this pull request on May 3, 2026
…4636)

## What changes were proposed in this PR?

#4521 had the python dep install in the scala and python matrix jobs on `uv pip install --system` for install-speed. #4597 unintentionally rewrote those lines back to stock `pip install` while inlining the binary license checks, and the regression has been carried forward by every subsequent rebase. Restore uv, but with a targeted carve-out for the leg that drives the binary license check.

### Why the carve-out

The python job's `3.12` matrix entry is the only leg that runs `pip-licenses` and feeds the result into `bin/licensing/check_binary_deps.py python`. That tool compares the installed Python tree against `amber/LICENSE-binary-python`, which is generated **with pip** and tracks what the production image installs. uv and pip resolvers can land on different versions of unpinned transitives; if the 3.12 leg installs with uv, `check_binary_deps.py` would false-positive on resolver drift, forcing us to chase those drifts in `LICENSE-binary-python` (and diverge from production).

So: stock `pip install` on the 3.12 leg only; uv everywhere else.

### Per-step shape

- **scala job → Install dependencies**: `uv pip install --system`. Its license check is jar-only, so Python resolver differences don't matter here.
- **python job → Install dependencies**: branches on `matrix.python-version`. `3.12` keeps `pip install`; `3.10`, `3.11`, `3.13` use `uv pip install --system`.
- **python job → Install dev dependencies**: `uv pip install --system`. Runs post-snapshot, so uv is safe on all legs.

No behaviour change for the license check itself. Other legs gain install speed.

## Any related issues, documentation, discussions?

Closes #4635. Restores #4521. Regression introduced by #4597.

## How was this PR tested?

Will be exercised by this PR's own scala and python matrices. The expected signal:

- [x] scala job: install step uses uv, tests still run.
- [x] python 3.10 / 3.11 / 3.13 legs: install step uses uv.
- [x] python 3.12 leg: install step uses pip; pip-licenses manifest unchanged; `check_binary_deps.py python` passes.

## Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7 (Claude Code)
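The per-leg branching in the python job's `Install dependencies` step could look roughly like the sketch below. This is an illustration under stated assumptions, not the workflow's actual contents: the function name `install_matrix_deps` is hypothetical, the Python version is assumed to be passed in from `matrix.python-version`, and the requirements path is an argument rather than the real path.

```shell
# Pick the installer for one leg of the python matrix.
# $1: Python version of the leg (e.g. the value of matrix.python-version).
# $2: path to the requirements file (assumed; passed in by the caller).
install_matrix_deps() {
  case "$1" in
    3.12)
      # License-check leg: stay on stock pip so the resolved versions
      # match the pip-generated amber/LICENSE-binary-python.
      pip install -r "$2"
      ;;
    *)
      # All other legs: bootstrap uv with pip, then install with uv.
      python -m pip install uv &&
        uv pip install --system -r "$2"
      ;;
  esac
}
```

Keeping the decision in a single `case` makes the carve-out explicit: only the `3.12` pattern needs to change if the license check ever moves to a different leg.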
Yicong-Huang added a commit that referenced this pull request on May 3, 2026
…4636)

Same commit message as the commit above, with the added trailer `(backported from commit a3d43db)`.

Generated-by: Claude Opus 4.7 (Claude Code)
### What changes were proposed in this PR?

Both the `scala` and `python` jobs in `.github/workflows/github-action-build.yml` install Python dependencies via `pip install -r requirements.txt`. That step alone takes:

- `scala` job: ~1m 21s
- `python` matrix: 1m 12s – 4m 41s (3.13 is slowest because some packages lack prebuilt wheels for it)

An earlier attempt on this PR turned on the built-in pip wheel cache (`actions/setup-python` + `cache: 'pip'`) and saw only ~14s saved; the bottleneck is install (resolve + extract + write site-packages for ~230 packages), not download.

This PR switches both jobs to `uv pip install --system`. uv is a Rust reimplementation of pip with no transitive deps, so installing it via `python -m pip install uv` adds only ~3s, and the same wheel set then installs in ~10s instead of ~70s on the same runner.

No new third-party GitHub Action is added (uv is fetched as a regular pip package), so this stays within the ASF Infra GitHub Actions allowlist already used by this repo (`sbt/setup-sbt`, `coursier/cache-action`, `docker/*`, `amannn/*`, `apache/*`).

### Any related issues, documentation, discussions?

Closes #4519. Companion to #4508 (which combined the two lint sbt invocations).

### How was this PR tested?

CI on this PR, comparing `Install dependencies` step time before vs after on both `scala` and `python (3.10|3.11|3.12|3.13)` jobs. Earlier exploratory commits on this branch tried caching `target/scala-2.13/{classes,zinc,src_managed}` directly; that broke `scalafix` because zinc skipped the compile that produces the SemanticDB files scalafix needs, so the sbt-target cache was dropped.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7)
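The before/after comparison in this PR was read off the CI step durations. To reproduce a similar comparison outside CI, a small timing wrapper along these lines could be used; this is a hypothetical helper (`time_step` is not part of the repo) and it only has whole-second resolution.

```shell
# Run a command, report its wall-clock duration on stderr,
# and return the command's own exit status.
time_step() {
  start=$(date +%s)
  status=0
  "$@" || status=$?
  end=$(date +%s)
  echo "elapsed: $((end - start))s" >&2
  return "$status"
}
```

For example, `time_step pip install -r requirements.txt` and `time_step uv pip install --system -r requirements.txt` could be run in two fresh environments to compare install time on the same machine.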