fix(cache): handle ValueError from parsed.port on Windows file:// URLs#1446
Merged
danielmeppiel merged 2 commits intoMay 21, 2026
Merged
Conversation
On Windows, file://C:\path\bare.git parses with netloc 'C:\path...' and the drive-letter colon is interpreted as host:port. Accessing parsed.port raises ValueError, crashing cache_shard_key in any test that constructs file:// URLs from tmp_path. Catch ValueError and fall back to the raw lowercased netloc as the authority, preserving per-URL distinctness for cache shard keys. Adds regression tests for the Windows file:// case. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes Windows CI failures in the cache URL normalizer by preventing urllib.parse from raising ValueError when encountering malformed file://C:\... URLs (where the drive-letter colon is misinterpreted as a host:port separator).
Changes:
- Wrap
parsed.hostname/parsed.portextraction intry/except ValueErrorand fall back to a deterministic authority derived fromparsed.netloc. - Add unit regression tests to ensure Windows-style
file://C:\...inputs do not raise and produce distinct cache shard keys for distinct paths.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/cache/test_url_normalize.py | Adds regression tests for Windows-style malformed file:// URLs to ensure shard-key derivation doesn’t raise and remains distinct per path. |
| src/apm_cli/cache/url_normalize.py | Catches ValueError from urlparse port/hostname parsing and falls back to raw netloc to keep cache-key derivation deterministic on Windows. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 1
Comment on lines
+83
to
+90
| # host:port. Fall back to the raw netloc (lowercased) so cache-key | ||
| # derivation stays deterministic and per-URL distinct without raising. | ||
| try: | ||
| hostname = (parsed.hostname or "").lower() | ||
| username = parsed.username or "" | ||
| port = parsed.port | ||
| except ValueError: | ||
| authority = parsed.netloc.lower() |
Collaborator
Author
There was a problem hiding this comment.
Good catch -- fixed in 2573439. The fallback now splits userinfo on @, drops everything after : in the user portion, then lowercases, mirroring Step 4 in the happy path. Added test_fallback_strips_password_from_netloc to lock it in.
Address Copilot review feedback on #1446. The except-ValueError fallback path previously did 'authority = parsed.netloc.lower()', which retained userinfo passwords (e.g. user:token@host:badport) and violated documented Step 4 (strip password). That would bake credentials into cache keys and risk leaking them via any caller that logs normalize_repo_url() output. Split on '@', drop everything after ':' in the userinfo, then lowercase -- mirroring the happy-path behaviour. Adds regression test test_fallback_strips_password_from_netloc. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Windows CI (run 26252988143) failed with:
at
src/apm_cli/cache/url_normalize.py:86(port = parsed.port).Root cause: On Windows,
urlparse("file://C:\\path\\bare.git")placesC:\path...into thenetloc. Python'sparsed.portproperty splits the netloc on:and interprets the drive-letter colon as a host:port separator -- raisingValueErrorbecause the 'port' isn't a digit. This crashedcache_shard_key()for every test that constructed afile://{tmp_path}URL (e.g.TestGetCheckoutLayout.test_full_variant_layout).Linux/macOS were unaffected because their tmp paths start with
/, producing an empty netloc.Fix
Wrap hostname/username/port extraction in
try/except ValueError. On failure, fall back to the raw lowercasednetlocas the authority. This keeps cache-key derivation:C:\repo_a.gitvsC:\repo_b.git) still get different shards (the prior trivial fix collapsed them all tofile://c)Tests
Added two regression tests in
tests/unit/cache/test_url_normalize.py:test_windows_file_url_does_not_raise-- verifies noValueErrortest_windows_file_urls_distinct_paths_distinct_keys-- verifies shard-key distinctnessValidation
uv run --extra dev pytest tests/unit/cache/ -q-> 44 passeduv run --extra dev ruff check src/ tests/-> All checks passeduv run --extra dev ruff format --check src/ tests/-> 1046 files already formatteduv run --extra dev python -m pylint --disable=all --enable=R0801 --min-similarity-lines=10 --fail-on=R0801 src/apm_cli/-> 10.00/10bash scripts/lint-auth-signals.sh-> auth-signal lint clean