feat: import workflow or JSON workflow from arbitrary HTTP(S) URL #33164
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
- Add RawURL field to WorkflowSpec for generic HTTP(S) URL specs
- parseWorkflowSpec now recognises non-GitHub URLs and returns a RawURL spec
- FetchWorkflowFromSourceWithContext dispatches on RawURL to fetchGenericURLWorkflow
- fetchGenericURLWorkflow fetches the URL and dispatches on Content-Type:
  - text/markdown / text/x-markdown → raw gh-aw workflow markdown
  - application/json (or +json) → JSON workflow converter
  - anything else → actionable error
- New FetchImportURL helper: HEAD→GET fallback, size cap, host-scoped GitHub auth
- New ConvertJSONWorkflowToMarkdown: best-effort JSON → markdown with warnings
- add_command.go skips dependency fetch for generic URL imports
- Both add and add-wizard help text updated with URL examples
- Tests: import_url_fetcher_test.go, jsonworkflow_to_markdown_test.go

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
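The Content-Type dispatch listed in this commit can be sketched roughly as below. This is an illustration only, not the code in `fetch.go`: `importKind`, `classifyContentType`, and the error wording are hypothetical names chosen for the example, and it assumes canonicalisation is done with the standard library's `mime.ParseMediaType`.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// importKind is a hypothetical label for how a fetched body should be handled.
type importKind string

const (
	kindMarkdown importKind = "markdown"
	kindJSON     importKind = "json"
)

// classifyContentType canonicalises a Content-Type header value (lower-cases
// the media type and drops parameters such as charset) and maps it to an
// import kind. Anything unrecognised yields an error naming what was expected.
func classifyContentType(header string) (importKind, error) {
	mediaType, _, err := mime.ParseMediaType(header)
	if err != nil {
		return "", fmt.Errorf("invalid Content-Type %q: %w", header, err)
	}
	switch {
	case mediaType == "text/markdown", mediaType == "text/x-markdown":
		return kindMarkdown, nil
	case mediaType == "application/json", strings.HasSuffix(mediaType, "+json"):
		return kindJSON, nil
	default:
		return "", fmt.Errorf("unsupported Content-Type %q: expected text/markdown or application/json", mediaType)
	}
}

func main() {
	headers := []string{"text/markdown; charset=utf-8", "application/vnd.example+json", "text/html"}
	for _, h := range headers {
		kind, err := classifyContentType(h)
		fmt.Printf("%q -> kind=%q err=%v\n", h, kind, err)
	}
}
```

Keeping the classification in a pure function like this makes the dispatch table easy to unit-test without an HTTP server.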
- Use errors.New(console.FormatErrorMessage(...)) instead of fmt.Errorf("%s", ...) for terminal errors
- Use fmt.Errorf("...: %w", err) for errors that wrap an underlying cause
- Rename parseErr → urlErr in parseWorkflowSpec to avoid shadowing
- Add comment explaining package-level nonAlphanumSeq regex initialisation
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Pull request overview
Adds support for importing workflows from arbitrary HTTP(S) URLs, including dispatching markdown directly and converting JSON workflow definitions to markdown.
Changes:
- Adds generic URL parsing/fetching and content-type dispatch for markdown/JSON imports.
- Adds JSON workflow conversion with preserved unknown fields and tests.
- Updates `add`/`add-wizard` help text and skips dependency fetching for generic URL imports.
Show a summary per file
| File | Description |
|---|---|
| `pkg/cli/spec.go` | Adds RawURL workflow specs and generic URL-derived names. |
| `pkg/cli/jsonworkflow_to_markdown.go` | Converts JSON workflow definitions into gh-aw markdown. |
| `pkg/cli/jsonworkflow_to_markdown_test.go` | Tests JSON conversion and URL-derived workflow names. |
| `pkg/cli/import_url_fetcher.go` | Adds HEAD/GET URL fetcher, content-type canonicalization, size cap, and scoped auth header handling. |
| `pkg/cli/import_url_fetcher_test.go` | Tests content types, fetch errors, size limits, HEAD fallback, and auth header behavior. |
| `pkg/cli/fetch.go` | Dispatches generic URL content to markdown pass-through or JSON conversion. |
| `pkg/cli/add_wizard_command.go` | Documents arbitrary URL and JSON import examples for guided add. |
| `pkg/cli/add_command.go` | Documents arbitrary URL imports and skips dependency fetching for generic URLs. |
| `.github/workflows/pr-sous-chef.lock.yml` | Updates gh-aw firewall container image versions in workflow metadata. |
Copilot's findings
- Files reviewed: 9/9 changed files
- Comments generated: 7
    client := opts.HTTPClient
    if client == nil {
        client = &http.Client{}

    for _, w := range generated.Warnings {
        fmt.Fprintln(os.Stderr, console.FormatWarningMessage(fmt.Sprintf("JSON workflow import: %s", w)))

    if seg == "" {
        continue
    }
    return normalizeWorkflowID(seg)

    if err != nil {
        importURLFetcherLog.Printf("HEAD request failed (will fallback to GET): %v", err)
        return "", false

    resp, err := client.Do(req)
    if err != nil {
        return nil, fmt.Errorf("failed to fetch URL: %w", err)
Skills-Based Review 🧠
Applied /tdd, /zoom-out, and /grill-with-docs — this is a new feature that introduces new abstractions and a custom YAML serialisation path.
Key Themes
- YAML serialisation correctness: `marshalFrontmatterValue` uses `json.MarshalIndent` as a YAML stand-in, producing double-quoted JSON keys in frontmatter output and silently mishandling YAML-specific types. The project already depends on `goccy/go-yaml`; it should be used here instead.
- Incomplete YAML quoting: `yamlQuoteString` doesn't quote YAML type-collision values like `true`, `false`, `null`, or numeric strings, which would cause silent data corruption on round-trip.
- Spec mutation as a side-effect: `fetchGenericURLWorkflow` mutates `spec.WorkflowName`, which violates the fetch layer's expected read-only contract. The generated name should be surfaced via `FetchedWorkflow`.
- Error values contain ANSI codes: `errors.New(console.FormatErrorMessage(...))` bakes terminal formatting into the error value. This breaks wrapping, logging, and programmatic inspection.
- Test coverage gaps: `fetchGenericURLWorkflow`'s dispatch logic has no unit tests; 403 and 5xx HTTP status codes are untested.
- HEAD + GET always both fire: The HEAD request is described as an optimisation to avoid downloading the body, but the body is always fetched via GET immediately after. The design needs clarification or simplification.
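To make the quoting concern concrete, here is a minimal sketch of a scalar check that treats type-collision values as needing quotes. It is not the PR's `yamlQuoteString`: `needsYAMLQuoting` and `quoteYAMLScalar` are hypothetical names, the reserved-word list is an assumption covering common YAML 1.1 and 1.2 spellings, and the check deliberately errs towards quoting too much. Delegating to the project's YAML library, as suggested above, would likely be more robust than any hand-rolled rule set.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// yamlReserved lists plain scalars that YAML parsers commonly resolve to
// booleans or null instead of strings (assumed list, not exhaustive).
var yamlReserved = map[string]bool{
	"true": true, "false": true, "null": true, "~": true,
	"yes": true, "no": true, "on": true, "off": true,
}

// needsYAMLQuoting reports whether s would change type or meaning if it were
// written as a plain (unquoted) YAML scalar.
func needsYAMLQuoting(s string) bool {
	if s == "" || s != strings.TrimSpace(s) {
		return true
	}
	if yamlReserved[strings.ToLower(s)] {
		return true
	}
	// Numeric-looking strings would round-trip as numbers.
	if _, err := strconv.ParseFloat(s, 64); err == nil {
		return true
	}
	if _, err := strconv.ParseInt(s, 0, 64); err == nil {
		return true
	}
	// Characters with structural meaning in YAML.
	return strings.ContainsAny(s, ":#\"'\n{}[],&*!|>%@`") ||
		strings.HasPrefix(s, "-") || strings.HasPrefix(s, "?")
}

// quoteYAMLScalar returns s unchanged when it is safe as a plain scalar and a
// double-quoted form otherwise. strconv.Quote is used for the escaping, which
// is an approximation of YAML double-quoted escaping for valid UTF-8 input.
func quoteYAMLScalar(s string) string {
	if needsYAMLQuoting(s) {
		return strconv.Quote(s)
	}
	return s
}

func main() {
	for _, v := range []string{"true", "null", "123", "1.5", "daily-report", "name: x"} {
		fmt.Printf("%s => %s\n", v, quoteYAMLScalar(v))
	}
}
```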
Positive Highlights
- ✅ Excellent security model: token is only forwarded to GitHub hosts, with a clean allow-list via `GH_HOST`
- ✅ Unknown JSON fields are preserved as YAML comments rather than silently dropped, a great UX decision
- ✅ Good test coverage for `FetchImportURL` and `ConvertJSONWorkflowToMarkdown` happy paths
- ✅ 10 MB body cap and the LimitReader guard are exactly right
- ✅ Clean separation of concerns across the new files
Verdict
The YAML serialisation issues (comments 1 and 3) are correctness bugs that will produce malformed or semantically incorrect frontmatter for real-world inputs. Requesting changes on those before merge; the other comments are improvements.
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · ● 10.9M
🧪 Test Quality Sentinel Report
Test Quality Score: 90/100 ✅ Excellent test quality
Test Classification Details (23 test classifications)
Flagged Tests: Suggestions for Improvement
No tests are flagged as implementation tests or coding-guideline violations. The following is a minor style suggestion only.
ℹ️ Missing assertion messages (non-blocking)
Most assertions in both files omit the descriptive message argument required by the project guidelines. For example:

    // Current (no message)
    assert.Equal(t, "text/markdown", res.ContentType)
    // Preferred
    assert.Equal(t, "text/markdown", res.ContentType, "Content-Type should be normalised from HEAD response")

Affected functions include
Language Support
Tests analyzed:
Scoring Breakdown
Verdict
📖 Understanding Test Classifications
Design Tests (High Value) verify what the system does:
Implementation Tests (Low Value) verify how the system does it:
Goal: Shift toward tests that describe the system's behavioral contract, meaning the promises it makes to its users and collaborators.
✅ Test Quality Sentinel: 91/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). Both files have the required build tags, no mock libraries, and strong behavioral coverage across HTTP fetch and JSON-to-markdown conversion logic.
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…String Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Done — all review comments addressed in the latest commits:
@copilot do a security review on gating the use of tokens for GitHub/GHE hosts. Add tests.
@copilot format, lint go, lint js
… tests Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Security review done and one real vulnerability fixed. Commit: the latest.
Issue found and fixed: plaintext HTTP token leak
Other hardening in the same commit
New security boundary tests added
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Done in the latest commit. Go (
JS (Prettier)
@copilot add check/test that the URL domain must match GH_HOST to send the GitHub token in the request
@copilot custom URLs are only accepted for github.com or if GH_HOST matches the url domain |
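The token-gating rule requested in these comments could look roughly like the sketch below. It is illustrative rather than the PR's `attachImportAuthHeader`: `shouldSendToken` is a hypothetical helper, the HTTPS-only requirement is an assumption inferred from the "plaintext HTTP token leak" note, and exact host matching (no subdomains) is also an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// shouldSendToken reports whether a GitHub token may be attached to a request
// for rawURL. ghHost is the value of the GH_HOST environment variable and may
// be empty. The check is deliberately strict: HTTPS only, exact host match.
func shouldSendToken(rawURL, ghHost string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	// Never send credentials over plaintext HTTP (assumed policy).
	if u.Scheme != "https" {
		return false
	}
	// Hostname strips any port and ignores userinfo, so a URL such as
	// https://github.com@evil.example/ resolves to evil.example here.
	host := strings.ToLower(u.Hostname())
	if host == "" {
		return false
	}
	if host == "github.com" {
		return true
	}
	ghHost = strings.ToLower(strings.TrimSpace(ghHost))
	return ghHost != "" && host == ghHost
}

func main() {
	fmt.Println(shouldSendToken("https://github.com/owner/repo", ""))
	fmt.Println(shouldSendToken("http://github.com/owner/repo", ""))
	fmt.Println(shouldSendToken("https://ghe.example.com/wf.md", "ghe.example.com"))
	fmt.Println(shouldSendToken("https://evil.example/wf.md", "ghe.example.com"))
}
```

Comparing the parsed hostname for equality, instead of using a prefix or substring match, is what keeps look-alike hosts such as `github.com.evil.example` from receiving the token.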
…s: current, ignore-if-missing, and workflow_run.workflows validation

Documents user-facing changes from PRs merged in the last 24 hours:
- gh aw add / add-wizard now accept arbitrary HTTP(S) URLs and JSON workflow definitions (from #33164)
- network.allowed-input opt-in input for caller-extensible reusable workflow allowlists (from #33200)
- tools.github.allowed-repos: current macro for repo-scoped MCP guard policies (from #33041)
- safe-outputs.github-app.ignore-if-missing flag for graceful App-token skip on fork/missing-secret runs (from #33033)
- workflow_run.workflows now required at compile time (from #33191)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`add` and `add-wizard` only accepted GitHub-hosted specs. Neither command could fetch from an arbitrary URL, and neither could import a JSON workflow definition.

Changes

Spec routing (`spec.go`)
- `WorkflowSpec` gains a `RawURL string` field for non-GitHub URL specs
- `parseWorkflowSpec` now tries `parseGitHubURL` first for `http(s)://` inputs; if the host is not a recognised GitHub host, returns a generic URL spec (`RawURL` set, all GitHub fields empty)
- `WorkflowSpec.String()` returns `RawURL` when set
- `genericURLWorkflowName` derives a kebab-cased workflow name from the URL path

URL fetcher (`import_url_fetcher.go`)
- `FetchImportURL`: `HEAD`-first with `GET` fallback on 405/501/missing `Content-Type`; 10 MB body cap; canonicalises media type
- `attachImportAuthHeader`: sends `Authorization: Bearer $GH_TOKEN` (falling back to `$GITHUB_TOKEN`) only when host matches `github.com` or `$GH_HOST`; no token ever reaches third-party servers

JSON workflow converter (`jsonworkflow_to_markdown.go`)
- `JSONWorkflow` maps `id`, `name`, `description`, `instructions`, `engine`, `on`, `tags`; unknown top-level keys captured in `Extra` via custom `UnmarshalJSON`
- `ConvertJSONWorkflowToMarkdown` → `GeneratedWorkflow{Filename, Markdown, Warnings}`; unknown fields serialised as YAML comment blocks so nothing is silently dropped

Content-type dispatch (`fetch.go`)
`fetchGenericURLWorkflow` dispatches on canonicalised `Content-Type`:

| Content-Type | Handling |
|---|---|
| `text/markdown`, `text/x-markdown` | raw gh-aw workflow markdown |
| `application/json`, `*+json` | `ConvertJSONWorkflowToMarkdown` |

`add_command.go`
Generic URL imports skip remote/local dependency fetching, because no GitHub repo context is available for include/dispatch-workflow resolution.