Summary
The GH-AW compiler should never emit --filter=blob:none (blobless/partial clone) in the agent job. Blobless clones create a "promisor remote" that requires network credentials for any operation touching non-local blobs. Since agent jobs always use persist-credentials: false, the agent has no credentials — any git operation that needs a blob not already fetched will fail with a username prompt or "unable to read tree" error.
This is independent of whether fetch: refs are specified. The fundamental issue is that --filter=blob:none + persist-credentials: false is a broken combination.
Background: Clone Modes and Their Interactions
Git supports three orthogonal mechanisms for reducing clone size:
1. Shallow clone (--depth=N)
Downloads only the last N commits of history (commit + tree + blob objects for those commits). The repo is marked as "shallow" in .git/shallow.
- Effect on checkout: Checking out any commit within the shallow window works fully offline — all objects are local.
- Effect on git log: Limited to the shallow depth.
- Network needed after clone: Only if you need history beyond the shallow window (e.g. git fetch --deepen=<N>).
2. Sparse checkout (sparse-checkout)
Controls which paths appear in the working tree. All git objects are still fetched (or not, depending on other settings) — sparse-checkout only filters what gets written to disk as files.
- Effect on checkout: No impact on object availability. git checkout <branch> works fully offline regardless of sparse-checkout settings. Sparse-checkout is purely a working-tree filter.
- Effect on blob availability: Without blobless, all blobs in the fetched commits are downloaded even for paths outside the sparse cone. git show HEAD:path/outside/sparse works.
- Network needed after clone: Never (for git object operations).
3. Partial/blobless clone (--filter=blob:none)
Downloads only commit and tree objects. Blob objects (file contents) are omitted entirely. The remote is configured as a "promisor remote" in .git/config (remote.origin.partialclonefilter=blob:none). Git will lazily fetch missing blobs on demand from the promisor remote.
- Effect on checkout: git checkout <branch> requires network access to fetch blobs for the working tree. If credentials are not available, checkout fails with a username prompt or "unable to read tree" error.
- Effect on other operations: git show, git diff, git log -p, and git cat-file blob all trigger lazy fetches and fail without credentials.
- Network needed after clone: Yes — for any operation that materializes file contents.
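The three mechanisms above each leave a distinct trace in the repository, so it is possible to check which ones are active. The following is a minimal sketch, not part of the compiler: the function name clone_modes is hypothetical, and it assumes the remote is named origin, as it is after actions/checkout.

```shell
#!/usr/bin/env bash
# Report which size-reduction modes are active in a repository.
# Usage: clone_modes [repo-dir]   (hypothetical helper, assumes remote "origin")
clone_modes() {
  local repo="${1:-.}"
  local shallow sparse filter
  # Prints "true" or "false" depending on whether .git/shallow is in effect.
  shallow="$(git -C "$repo" rev-parse --is-shallow-repository)"
  # core.sparseCheckout is unset when sparse-checkout is off.
  sparse="$(git -C "$repo" config --get core.sparseCheckout || echo false)"
  # Set only when the remote was cloned or fetched with a --filter.
  filter="$(git -C "$repo" config --get remote.origin.partialclonefilter || echo none)"
  echo "shallow=${shallow} sparse=${sparse} filter=${filter}"
}
```

Running clone_modes in the agent's workspace shows whether filter=blob:none is in effect, which is the only one of the three values that matters for offline operation.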
Combinations and Offline Checkout Capability
| Shallow | Sparse | Blobless | Agent can work offline? | Notes |
| ✓ | ✗ | ✗ | ✅ Yes | All blobs present for commits in window |
| ✓ | ✓ | ✗ | ✅ Yes | Sparse only filters working tree; all blobs still downloaded |
| ✓ | ✗ | ✓ | ❌ No | No blobs; any checkout/show/diff triggers lazy fetch |
| ✓ | ✓ | ✓ | ❌ No | Current behavior: blobs missing, agent cannot checkout or read files |
| ✗ | ✓ | ✗ | ✅ Yes | Full clone with sparse working tree |
| ✗ | ✗ | ✓ | ❌ No | Blobless always needs network for file content |
Key insight: --filter=blob:none is the only setting that prevents offline git operations. Shallow and sparse are both safe for agents. The deciding factor is not whether fetch: refs are specified — it is whether the agent will ever need to access blob content without credentials, which is always true since persist-credentials: false is always set.
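Whether a given revision is readable offline can be checked directly by asking git which reachable objects are absent locally. The sketch below is an illustration only; count_missing_objects is a hypothetical helper name. It relies on git rev-list --missing=print, which lists absent objects with a leading "?" instead of lazily fetching them.

```shell
#!/usr/bin/env bash
# Count objects reachable from a revision that are absent from the local
# object store. A count of 0 means the revision can be read without
# contacting the remote.
# Usage: count_missing_objects [repo-dir] [rev]   (hypothetical helper)
count_missing_objects() {
  local repo="${1:-.}" rev="${2:-HEAD}"
  # GIT_TERMINAL_PROMPT=0 makes git fail instead of prompting for a username
  # if anything does try to reach the network.
  # grep -c exits non-zero when the count is 0, hence the "|| true".
  GIT_TERMINAL_PROMPT=0 git -C "$repo" rev-list --objects --missing=print "$rev" \
    | grep -c '^?' || true
}
```

In a blobless clone the count is expected to be non-zero for any commit whose blobs were never materialized, which corresponds to the "No" rows in the table.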
How actions/checkout@v6 Introduces Blobless
When sparse-checkout is specified with a non-zero fetch-depth, actions/checkout@v6 automatically adds --filter=blob:none as a bandwidth optimization. This is sensible for CI jobs that only need to build/test the checked-out code. But for agentic workflows where the agent performs arbitrary git operations, it creates a broken state.
The Problem
Two places in the compiled workflow emit --filter=blob:none:
- actions/checkout step (implicit): When sparse-checkout + fetch-depth > 0, the action adds --filter=blob:none automatically.
- "Fetch additional refs" step (explicit, safe_outputs job only): compiler_safe_outputs_steps.go line 312 explicitly adds --filter=blob:none with the correct comment "never checked out in the safe_outputs job, so their blob objects are unnecessary."
The agent job's "Fetch additional refs" step in checkout_step_generator.go does NOT add --filter=blob:none itself — but the damage is already done by actions/checkout, which configured the repo as a promisor remote. Even without the filter flag on the fetch command, the repo's config means git knows it can (and will try to) lazy-fetch missing objects.
Observed Impact (run 26687250061)
The agent spent 4+ minutes (15:18:27 → 15:22:06) trying 12+ different checkout approaches before the job timed out. It tried:
- git fetch origin <branch>: prompted for username
- git checkout --track origin/<branch>: needed blobs, prompted
- git checkout -b <branch> FETCH_HEAD: failed
- git checkout -b <branch> <SHA>: needed blobs, prompted
- Various path explorations (secondary issue: agent also didn't use the documented path)
- Eventually used git worktree add --no-checkout + git commit --allow-empty as workaround
The "Fetch additional refs" step itself worked perfectly — the branch ref was available as origin/dsyme/ci-perf/... with the correct SHA. The ONLY problem was missing blob objects due to the promisor/partial-clone configuration.
Proposed Fix
In the agent job, never use blobless clones. The criterion is simple: since agent jobs always set persist-credentials: false, and blobless requires credentials for lazy fetches, blobless should never be used in agent jobs.
Implementation options:
Option A (explicit override in actions/checkout): Pass filter: "" (empty string) in the actions/checkout with: block for agent-job checkouts. This prevents actions/checkout from adding --filter=blob:none regardless of sparse-checkout settings.
# In agent job checkout
- uses: actions/checkout@v6
  with:
    sparse-checkout: |
      test/
      app/
      ...
    fetch-depth: 1
    filter: "" # <-- prevents blobless; agent needs offline blob access
    persist-credentials: false
Option B (unset after checkout): After actions/checkout, add a step to disable the promisor remote config:
git config --unset remote.origin.partialclonefilter 2>/dev/null || true
git config --unset-all remote.origin.promisor 2>/dev/null || true
This is less clean: the initial checkout is still blobless, so blobs for the default branch's sparse paths are fetched on demand during checkout. That works because actions/checkout still has credentials at that point. After the config is unset, subsequent fetches download full objects, but blobs that the initial clone omitted stay missing.
Option A is preferred — it ensures the repo is never configured as a partial clone, so all operations work offline from the start.
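Whichever option is chosen, a guard step after checkout could confirm that no partial-clone configuration remains before the agent starts. The following is a sketch under that assumption; assert_not_partial_clone is a hypothetical helper, not something the compiler emits today.

```shell
#!/usr/bin/env bash
# Return non-zero if any remote is still configured as a promisor remote
# or carries a partial-clone filter.
# Usage: assert_not_partial_clone [repo-dir]   (hypothetical helper)
assert_not_partial_clone() {
  local repo="${1:-.}" hits
  # --get-regexp exits non-zero when nothing matches, hence the "|| true".
  hits="$(git -C "$repo" config --get-regexp \
    '^remote\..*\.(promisor|partialclonefilter)$' || true)"
  if [ -n "$hits" ]; then
    echo "partial-clone config still present:" >&2
    echo "$hits" >&2
    return 1
  fi
}
```

Note that this only inspects configuration. It does not prove that every blob is present locally, so after Option B it would pass even though some blobs from the initial clone are still absent.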
Size Impact
For github/github with a typical sparse-checkout pattern:
- With blobless: Initial checkout ~500 MB (sparse paths, trees + blobs only for sparse cone)
- Without blobless: Initial checkout ~1.3 GB (all blobs at HEAD within shallow depth, regardless of sparse cone)
- Extra cost: ~800 MB more download, ~8-15 seconds on a GH runner
- Savings: Eliminates agent failures and 4+ minutes of wasted time per checkout attempt
The tradeoff strongly favors removing blobless in agent jobs.
What Should NOT Change
- The safe_outputs job correctly uses --filter=blob:none in its fetch step (line 312 of compiler_safe_outputs_steps.go). This job never checks out branches or reads blobs, so the optimization is valid there.
- Sparse-checkout and shallow clone should remain — they are safe and provide meaningful size savings without breaking agent operations.
Related
- (on.needs emission bug)
- pkg/workflow/checkout_step_generator.go (agent job fetch step generation)
- pkg/workflow/compiler_safe_outputs_steps.go (safe_outputs job fetch step; correct, do not change)