perf(ci): combine sbt lint steps; keep test step separate #4508

Merged
Yicong-Huang merged 2 commits into apache:main from Yicong-Huang:perf/ci-combine-sbt-steps on Apr 26, 2026


@Yicong-Huang Yicong-Huang commented Apr 26, 2026

What changes were proposed in this PR?

The scala job in .github/workflows/github-action-build.yml ran four separate sbt invocations:

  • sbt scalafmtCheckAll
  • sbt clean package
  • sbt "scalafixAll --check"
  • sbt test

Each one paid the same cold-start cost (launcher boot, ~19,160 settings, plugin and dependency resolution), about 12–15s on the CI runner. Across four invocations that is ~48–60s of startup, of which ~36–45s (three of the four) is pure repetition.

This PR combines the two lint steps into sbt 'scalafmtCheckAll; scalafixAll --check' and drops the standalone clean package: test already triggers compile via the sbt task graph, and a fresh CI runner has no stale target/ to clean. sbt test stays as its own step so the GitHub Actions UI still surfaces test runtime distinctly.

Non-sbt setup steps (Create Databases, Set docker-java API version) are moved above the sbt steps so they run before either invocation needs them.
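
As a rough illustration, the two sbt invocations that remain after this change can be sketched as shell functions. The function names `run_lint` and `run_tests` are hypothetical and only used here; the actual workflow YAML is not reproduced, and the setup steps (Create Databases, Set docker-java API version) are not modeled.

```shell
# Sketch only: run_lint and run_tests are illustrative names, not part of the workflow.

run_lint() {
  # A single sbt start runs both checks; ';' separates commands within one sbt session.
  sbt 'scalafmtCheckAll; scalafixAll --check'
}

run_tests() {
  # Kept as its own invocation so the test runtime is reported as a distinct step.
  # Compilation happens here as a dependency of the test task.
  sbt test
}
```

Calling `run_lint && run_tests` would mirror the order of the two sbt steps described above.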

Any related issues, documentation, discussions?

Closes #4507

How was this PR tested?

Local measurement with sbt-launcher and target/ warm:

| step | time |
|---|---|
| sbt scalafmtCheckAll alone | 6.0s |
| sbt 'scalafixAll --check' alone | 13.8s |
| sum of separate | 19.8s |
| sbt 'scalafmtCheckAll; scalafixAll --check' combined | 10.7s |
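
A comparison like the one above could be reproduced with a small timing helper. This is a minimal sketch, not the script used for the numbers in this PR: `elapsed` is a hypothetical helper, and it only has whole-second resolution because it relies on `date +%s`.

```shell
# Sketch only: elapsed is a hypothetical helper with whole-second resolution.
# Usage: elapsed CMD [ARGS...]  -> prints wall-clock seconds taken by CMD.
elapsed() {
  start=$(date +%s)
  # Discard the command's own output so only the duration is printed.
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $((end - start))
}

# Example comparison (commented out because it needs sbt and the project checkout):
# separate=$(( $(elapsed sbt scalafmtCheckAll) + $(elapsed sbt 'scalafixAll --check') ))
# combined=$(elapsed sbt 'scalafmtCheckAll; scalafixAll --check')
# echo "separate=${separate}s combined=${combined}s"
```

Running each variant a few times with a warm `target/` would help separate startup cost from one-off compilation.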

Combining the pair saved ~9s locally (19.8s vs. 10.7s), roughly one sbt cold start. On CI, where startup is slower, this PR collapses two invocations into one (lint) and drops a third (clean package), for an estimated saving of ~24–36s on the scala job. CI on this PR will confirm.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7)

@github-actions github-actions Bot added the ci (changes related to CI) label Apr 26, 2026
@Yicong-Huang Yicong-Huang self-assigned this Apr 26, 2026
The scala job ran four separate sbt invocations: scalafmtCheckAll,
clean package, scalafixAll --check, and test. Each one paid the same
cold-start cost (launcher boot, ~19,160 settings, plugin and
dependency resolution) — about 12-15s on the CI runner.

Combine the two lint steps into `sbt 'scalafmtCheckAll; scalafixAll
--check'` and drop the standalone `clean package` step (test triggers
compile via the standard sbt task graph; a fresh CI runner has no
stale target/ to clean). Keep `sbt test` as its own step so the
GitHub Actions UI still surfaces test runtime distinctly.

Move the non-sbt setup steps (Create Databases, Set docker-java)
above the sbt steps so they run before either invocation needs them.

Closes apache#4507
@Yicong-Huang Yicong-Huang force-pushed the perf/ci-combine-sbt-steps branch from a7289d2 to 691dc08 Compare April 26, 2026 01:15
@Yicong-Huang Yicong-Huang changed the title from "perf(ci): combine sbt CI steps into a single invocation" to "perf(ci): combine sbt lint steps; keep test step separate" Apr 26, 2026
@Yicong-Huang Yicong-Huang requested a review from zuozhiw April 26, 2026 01:20
@Yicong-Huang Yicong-Huang enabled auto-merge (squash) April 26, 2026 01:27

@zuozhiw zuozhiw left a comment


lgtm

@Yicong-Huang Yicong-Huang merged commit 04cf85a into apache:main Apr 26, 2026
12 checks passed
Yicong-Huang added a commit that referenced this pull request Apr 26, 2026
### What changes were proposed in this PR?

Both the `scala` and `python` jobs in
`.github/workflows/github-action-build.yml` install Python dependencies
via `pip install -r requirements.txt`. That step alone takes:

- `scala` job: **~1m 21s**
- `python` matrix: **1m 12s – 4m 41s** (3.13 is slowest because some
packages lack prebuilt wheels for it)

An earlier attempt on this PR turned on the built-in pip wheel cache
(`actions/setup-python` + `cache: 'pip'`) and saw only ~14s saved — the
bottleneck is install (resolve + extract + write site-packages for ~230
packages), not download.

This PR switches both jobs to `uv pip install --system`. uv is a Rust
reimplementation of pip with no transitive deps, so installing it via
`python -m pip install uv` adds only ~3s, and the same wheel set then
installs in ~10s instead of ~70s on the same runner.
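
A minimal sketch of the install step this describes, assuming the requirements file sits at `requirements.txt` relative to the working directory. The wrapper name `install_python_deps` is hypothetical, and the exact step in the workflow may differ.

```shell
# Sketch only: install_python_deps is an illustrative wrapper, not the workflow step itself.
# Usage: install_python_deps [REQUIREMENTS_FILE]   (defaults to requirements.txt)
install_python_deps() {
  # Bootstrap uv as an ordinary pip package, then let uv install into the
  # system interpreter (no virtualenv), stopping if the bootstrap fails.
  python -m pip install uv &&
    uv pip install --system -r "${1:-requirements.txt}"
}
```

Fetching uv through pip keeps the step free of any additional GitHub Action.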

No new third-party GitHub Action is added — uv is fetched as a regular
pip package — so this stays within the ASF Infra GitHub Actions
allowlist already used by this repo (`sbt/setup-sbt`,
`coursier/cache-action`, `docker/*`, `amannn/*`, `apache/*`).

### Any related issues, documentation, discussions?

Closes #4519. Companion to #4508 (which combined the two lint sbt
invocations).

### How was this PR tested?

CI on this PR — comparing `Install dependencies` step time before vs
after on both `scala` and `python (3.10|3.11|3.12|3.13)` jobs. Earlier
exploratory commits on this branch tried caching
`target/scala-2.13/{classes,zinc,src_managed}` directly; that broke
`scalafix` because zinc skipped the compile that produces the SemanticDB
files scalafix needs, so the sbt-target cache was dropped.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Combine sbt CI steps to reduce JVM startup overhead