Add CI build caching and improve benchmark workflow #1148
Merged
Changes from 2 commits
15 commits
3738e14 Add CI build caching for GitHub-hosted and self-hosted HPC runners (sbryngelson)
c3a2469 Fix race conditions and cleanup in build cache (sbryngelson)
79ce4e7 Fix stale retry log messages (sbryngelson)
bb7f705 Disable git clean on self-hosted runners to preserve build cache (sbryngelson)
08490be Skip build cache for benchmarks and fix benchmark trigger logic (sbryngelson)
4172553 Fix cross-runner cache by updating install/ config paths (sbryngelson)
233aa11 Delete install/ on workspace path change to fix stale binaries (sbryngelson)
2406c64 Simplify build cache to per-runner directories (sbryngelson)
d4e7c8f Remove restore-keys prefix fallback from GH-hosted build cache (sbryngelson)
84835be Make benchmark pipeline robust to transient GPU failures (sbryngelson)
09920e7 Suppress pylint too-many-nested-blocks for bench() (sbryngelson)
1adc121 Fix rm -rf following build symlink into shared cache (sbryngelson)
f7d0164 Detect stale cached binaries and include install/ in retry cleanup (sbryngelson)
5c97194 Fix benchmark PR detection for cross-fork workflow_run events (sbryngelson)
27f897b Move build cache from scratch to coda1 project storage (sbryngelson)
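Several commits in the list above concern a `build` directory that is a symlink into a shared cache (for example 1adc121, "Fix rm -rf following build symlink into shared cache"). The sketch below is not taken from the PR; it only illustrates the general shell behaviour such a fix has to account for. All paths are temporary and hypothetical. It shows that removing a path that goes through the link reaches into the link's target, while removing the link name itself (no trailing slash) leaves the target alone.

```shell
#!/bin/bash
# Sketch: how rm -rf treats a symlinked build directory.
# demo_dir and everything under it are hypothetical, temporary paths.

demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/cache/staging" "$demo_dir/cache/install"
touch "$demo_dir/cache/staging/CMakeCache.txt" "$demo_dir/cache/install/bin"
ln -s "$demo_dir/cache" "$demo_dir/build"

# A path that goes *through* the link is resolved into the target, so this
# deletes the real staging directory inside the cache.
rm -rf "$demo_dir/build/staging"
if [ -e "$demo_dir/cache/staging" ]; then
    staging_state="present"
else
    staging_state="deleted"
fi

# The link name itself (no trailing slash) is removed without touching
# whatever the link points at.
rm -rf "$demo_dir/build"
if [ -L "$demo_dir/build" ]; then
    link_state="present"
else
    link_state="removed"
fi
if [ -e "$demo_dir/cache/install/bin" ]; then
    install_state="present"
else
    install_state="deleted"
fi

echo "cache/staging: $staging_state"
echo "build link:    $link_state"
echo "cache/install: $install_state"

rm -rf "$demo_dir"
```

In practice this means cleanup steps that name `build/<subdir>` operate on the shared cache, so they need the same care (and the same lock) as the build itself.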
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,89 @@ | ||
| #!/bin/bash | ||
| # Sets up a persistent build cache for self-hosted CI runners. | ||
| # Creates a symlink: ./build -> <resolved scratch path>/.mfc-ci-cache/<key>/build | ||
| # | ||
| # This ensures that every run of the same config (cluster/device/interface) finds | ||
| # cached build artifacts regardless of which runner instance picks up the job. | ||
| # | ||
| # Concurrent safety: uses flock to serialize access per cache directory. If | ||
| # multiple PRs trigger the same config simultaneously, the second job waits | ||
| # for the first to finish (up to 1 hour), then gets a warm cache. If the lock | ||
| # times out, falls back to a local build (same as no caching). | ||
| # | ||
| # Usage: source .github/scripts/setup-build-cache.sh <cluster> <device> <interface> | ||
| | ||
| _cache_cluster="${1:?Usage: setup-build-cache.sh <cluster> <device> <interface>}" | ||
| _cache_device="${2:?}" | ||
| _cache_interface="${3:-none}" | ||
| | ||
| _cache_key="${_cache_cluster}-${_cache_device}-${_cache_interface}" | ||
| _cache_base="$HOME/scratch/.mfc-ci-cache/${_cache_key}/build" | ||
| | ||
| # Create the cache dir, then resolve to a physical path (no symlinks). | ||
| # $HOME/scratch is typically a symlink to a scratch filesystem — resolving | ||
| # it ensures the build symlink target remains valid even if intermediate | ||
| # symlinks change. | ||
| mkdir -p "$_cache_base" | ||
| _cache_dir="$(cd "$_cache_base" && pwd -P)" | ||
| | ||
| echo "=== Build Cache Setup ===" | ||
| echo " Cache key: $_cache_key" | ||
| echo " Cache dir: $_cache_dir" | ||
| | ||
| # Acquire an exclusive lock on the cache directory to prevent concurrent | ||
| # builds from corrupting it. The lock is fd-based (flock on fd 9), so it | ||
| # auto-releases when the calling process exits — no stale locks. | ||
| # | ||
| # Timeout: 1 hour. If another build holds the lock, we wait. This is fine | ||
| # because the waiting job will get a warm cache when it finally acquires. | ||
| # If the lock can't be acquired after 1 hour, something is wrong — fall | ||
| # back to a local build in the workspace. | ||
| _cache_locked=false | ||
| _lock_file="$_cache_dir/.cache.lock" | ||
| exec 9>"$_lock_file" | ||
| echo " Acquiring cache lock..." | ||
| if flock --timeout 3600 9; then | ||
| _cache_locked=true | ||
| echo " Cache lock acquired" | ||
| else | ||
| echo " WARNING: Cache lock timeout (1h), building locally without cache" | ||
| exec 9>&- | ||
| # Remove any existing symlink to the shared cache so we don't write | ||
| # into it without the lock. Then create a real local directory. | ||
| if [ -L "build" ]; then | ||
| rm -f "build" | ||
| fi | ||
| mkdir -p "build" | ||
| echo "=========================" | ||
| return 0 2>/dev/null || exit 0 | ||
| fi | ||
| | ||
| # If build/ exists (real dir or stale symlink), remove it. | ||
| # rm -rf on a symlink removes the symlink, not the target — cache is safe. | ||
| if [ -e "build" ] || [ -L "build" ]; then | ||
| rm -rf "build" | ||
| fi | ||
| | ||
| ln -s "$_cache_dir" "build" | ||
| | ||
| # Handle cross-runner workspace path changes. | ||
| # CMakeCache.txt stores absolute paths from whichever runner instance | ||
| # originally configured the build. If we're on a different runner, sed-replace | ||
| # the old workspace path with the current one so CMake can do incremental builds. | ||
| _workspace_marker="$_cache_dir/.workspace_path" | ||
| if [ -f "$_workspace_marker" ]; then | ||
| _old_workspace=$(cat "$_workspace_marker") | ||
| if [ "$_old_workspace" != "$(pwd)" ]; then | ||
| echo " Workspace path changed: $_old_workspace -> $(pwd)" | ||
| echo " Updating cached CMake paths..." | ||
| find "$_cache_dir/staging" -type f \ | ||
| \( -name "CMakeCache.txt" -o -name "*.cmake" \ | ||
| -o -name "*.make" -o -name "Makefile" \ | ||
| -o -name "build.ninja" \) \ | ||
| -exec sed -i "s|${_old_workspace}|$(pwd)|g" {} + 2>/dev/null || true | ||
| fi | ||
| fi | ||
| echo "$(pwd)" > "$_workspace_marker" | ||
| | ||
| echo " Symlink: build -> $_cache_dir" | ||
| echo "=========================" | ||
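
The locking scheme described in the script's comments (an fd-based `flock` with a timeout, released automatically when the holder exits) can be tried in isolation. The following is a minimal sketch under stated assumptions, not the PR's implementation: `acquire_cache_lock`, the temporary directory, and the short timeouts are hypothetical, and a background subshell stands in for a second CI job. It assumes the util-linux `flock` command, which is what the script above calls.

```shell
#!/bin/bash
# Sketch: serialize access to a shared directory with an fd-based flock.
# acquire_cache_lock and the paths below are hypothetical illustration names.

acquire_cache_lock() {
    local lock_file="$1" timeout_s="$2"
    # Open (or create) the lock file on fd 9; the lock is tied to this descriptor.
    exec 9>"$lock_file"
    if flock --timeout "$timeout_s" 9; then
        return 0
    fi
    # Timed out: close the descriptor so this shell holds nothing.
    exec 9>&-
    return 1
}

demo_dir="$(mktemp -d)"
lock_file="$demo_dir/.cache.lock"

# A background job takes the lock and keeps it for 3 seconds,
# standing in for another CI job that is still building.
( exec 8>"$lock_file"; flock 8; sleep 3 ) &
holder_pid=$!
sleep 1   # give the holder time to take the lock

# First attempt: the holder is still running, so a 1-second wait times out.
if acquire_cache_lock "$lock_file" 1; then
    first_attempt="acquired"
else
    first_attempt="timed out"
fi

# Once the holder exits, its descriptor is closed and the lock is released
# without any explicit unlock or stale-lock cleanup.
wait "$holder_pid"

if acquire_cache_lock "$lock_file" 1; then
    second_attempt="acquired"
else
    second_attempt="timed out"
fi

echo "first attempt:  $first_attempt"
echo "second attempt: $second_attempt"

rm -rf "$demo_dir"
```

The timed-out branch is where the script above falls back to a local `build` directory. The sketch only records the outcome, so the fallback policy stays with the caller.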