Skip to content

fix(mcp): preserve auth errors + advertise upstream IdP for delegate-auth discovery#28550

Open
ryan-crabbe-berri wants to merge 3 commits into
litellm_mcp_path_parser_canonical_url_regressionfrom
litellm_mcp_delegate_auth_error_and_discovery
Open

fix(mcp): preserve auth errors + advertise upstream IdP for delegate-auth discovery#28550
ryan-crabbe-berri wants to merge 3 commits into
litellm_mcp_path_parser_canonical_url_regressionfrom
litellm_mcp_delegate_auth_error_and_discovery

Conversation

@ryan-crabbe-berri

@ryan-crabbe-berri ryan-crabbe-berri commented May 22, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Bug G — preserve auth status code and WWW-Authenticate on ProxyException. The MCP ASGI handlers caught all exceptions and collapsed them to 500 {"error":"MCP request failed"}, so a 401 from the auth stack lost its status and discovery headers.
  • Bug E — return the upstream IdP's metadata (issuer + authorization_endpoint + token_endpoint + registration_endpoint) for OAuth servers configured with delegate_auth_to_upstream=true. Previously the proxy returned its own URLs and the client followed them right back to LiteLLM, defeating the delegate flow.
  • Mirror Bug E to RFC 9728 protected-resource metadata so PRM authorization_servers advertises the upstream issuer; add the canonical /.well-known/oauth-protected-resource/mcp/<name>/mcp route that spec-conforming clients (mcp-inspector ≥ 0.21) actually probe.

Stacked on top of #28543. Once that merges, this PR auto-retargets to litellm_internal_staging.

Test plan

  • uv run pytest tests/test_litellm/proxy/_experimental/mcp_server/test_handler_proxy_exception_mapping.py — Bug G regression suite (6 tests covering 401/403 preservation, header passthrough, non-numeric code defaulting to 500, empty-headers normalization, and graceful 500 fallback for arbitrary exceptions)
  • uv run pytest tests/test_litellm/proxy/_experimental/mcp_server/test_discoverable_endpoints.py -k "delegate_auth or protected_resource" — 10 tests covering the helper, AS metadata, PRM, and the new canonical route
  • End-to-end against mcp.atlassian.com via mcp-inspector: 401 + WWW-Authenticate from /mcp/atlassian1/mcp triggers discovery → PRM resolves to Atlassian → AS metadata resolves to Atlassian → DCR succeeds → user consents at Atlassian → token exchange succeeds → 31 Atlassian tools list successfully

…ption

The ASGI MCP handlers caught ``HTTPException`` explicitly but let
``ProxyException`` (the rest-of-proxy convention raised by the auth
stack) fall through to the catch-all ``except Exception`` clause, which
collapsed every auth failure into a generic
``500 {"error":"MCP request failed","details":""}``. This stripped the
real status code (401/403) and any auth-bootstrap headers — most
importantly the ``WWW-Authenticate: Bearer authorization_uri=...`` hint
MCP OAuth 2.1 clients use to discover the upstream IdP. Most lethal in
delegate-auth mode, where the discovery hint is the only signal a client
gets to start its PKCE flow.

Add a ``_proxy_exception_to_http_exception`` helper that re-shapes the
ProxyException into an HTTPException, preserving status code, message
and headers. Wire it into both ``handle_streamable_http_mcp`` and
``handle_sse_mcp`` ahead of the catch-all clause. The unexpected-error
path (true 500s) is unchanged.
…h discovery

When an MCP server is configured with delegate_auth_to_upstream=True, the
proxy is a transparent network relay — the end-user authenticates directly
with the upstream IdP via PKCE in their MCP client. The
/.well-known/oauth-authorization-server response was unconditionally
advertising LiteLLM's own /authorize and /token routes as the issuer, so
spec-conforming MCP clients (OpenWebUI, Cursor, mcp-inspector) would
attempt to drive the flow against endpoints that don't speak OAuth and
either dead-end or mint tokens tied to the wrong issuer.

Re-advertise the upstream values resolved at registration time
(MCPServer.authorization_url / token_url / registration_url / scopes). Non-
delegate servers continue to receive the existing LiteLLM-issuer response,
since the proxy still mediates that flow.
…p PRM route

Mirrors the AS-metadata fix to RFC 9728 protected-resource metadata so
delegate-auth servers stay coherent end-to-end:

- ``_build_oauth_protected_resource_response`` reuses
  ``_delegate_auth_authorization_server_response`` to advertise the
  upstream IdP issuer in ``authorization_servers``. Without this, a
  client following PRM still ends up at LiteLLM's AS metadata.

- New route ``/.well-known/oauth-protected-resource/mcp/<name>/mcp``.
  Spec-conforming clients (mcp-inspector >= 0.21) construct PRM URLs by
  inserting the well-known segment between host and the streamable-http
  resource path. Without this route they 404 and fall back to root PRM.

Verified end-to-end against mcp.atlassian.com via mcp-inspector: OAuth
discovery resolves to Atlassian, DCR + PKCE complete, and 31 Atlassian
tools list successfully under delegate auth.
@codecov

codecov Bot commented May 22, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 96.07843% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
..._experimental/mcp_server/discoverable_endpoints.py 97.36% 1 Missing ⚠️
litellm/proxy/_experimental/mcp_server/server.py 92.30% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@greptile-apps

greptile-apps Bot commented May 22, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes two MCP OAuth discovery bugs: auth errors (401/403) were being collapsed to generic 500 responses in the ASGI handlers, and delegate-auth servers were advertising LiteLLM's own endpoints instead of the upstream IdP's metadata in both AS metadata and PRM responses. A new canonical /.well-known/oauth-protected-resource/mcp/{name}/mcp route is also added to satisfy RFC 9728 clients like mcp-inspector ≥ 0.21.

  • The ProxyExceptionHTTPException conversion in server.py is clean and well-tested; status code, detail, and WWW-Authenticate headers are all preserved correctly.
  • The delegate-auth issuer override in discoverable_endpoints.py correctly routes both AS metadata and PRM to the upstream IdP, with a safe fallback when upstream URLs are missing.
  • The canonical PRM route shares the use_standard_pattern=True code path, which sets resource = /mcp/{name} rather than /mcp/{name}/mcp; this could cause RFC 9728-validating clients to reject the response since resource should match the actual protected resource URL. The issuer derivation also discards path components, which would produce incorrect issuers for path-based IdPs such as Keycloak realms.

Confidence Score: 4/5

Safe to merge for mainstream IdPs (Auth0, Atlassian, Okta SaaS); the two edge-case concerns in discoverable_endpoints.py do not affect the end-to-end test scenario but could surface with path-based issuers or strict RFC 9728 clients.

The auth-error preservation fix in server.py is correct and thoroughly tested. The delegate-auth issuer override works well for the demonstrated Atlassian case. Two gaps remain in discoverable_endpoints.py: the canonical PRM route returns resource = /mcp/{name} while spec-conforming clients probing for the streamable-http resource at /mcp/{name}/mcp would expect the field to match; and the issuer is derived by stripping the path from the authorization endpoint, which silently produces the wrong issuer for path-based IdPs like Keycloak realms. Neither gap breaks the verified end-to-end scenario, but they could surface with different client or IdP configurations.

discoverable_endpoints.py — the canonical PRM route's resource field and the issuer derivation logic warrant a second look

Important Files Changed

Filename Overview
litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py Adds delegate-auth upstream-issuer override for AS metadata and PRM; adds canonical /.well-known/oauth-protected-resource/mcp/{name}/mcp route. The canonical route's resource field is derived with use_standard_pattern=True, which omits the trailing /mcp from the actual streamable-http endpoint URL, creating a potential RFC 9728 compliance gap. Issuer derivation from authorization_endpoint also drops path components, which breaks path-based IdPs like Keycloak realms.
litellm/proxy/_experimental/mcp_server/server.py Adds _proxy_exception_to_http_exception helper and inserts ProxyException catch clauses in both handle_streamable_http_mcp and handle_sse_mcp. The fix correctly preserves status code, detail message, and auth headers. Defensive int(exc.code) coercion handles the None string edge case. Logic looks correct.
tests/test_litellm/proxy/_experimental/mcp_server/test_handler_proxy_exception_mapping.py New test file covering ProxyException to HTTPException mapping with 6 targeted unit/integration tests; all mock-only, no real network calls. Good coverage of status preservation, header passthrough, non-numeric code defaulting, and graceful 500 fallback.
tests/test_litellm/proxy/_experimental/mcp_server/test_discoverable_endpoints.py Adds 10 new tests for delegate-auth discovery paths; correctly uses global_mcp_server_manager.registry manipulation and ASGI transports (no real network calls). The canonical-route test does not assert the resource field, leaving the URL-mismatch issue undetected.

Reviews (1): Last reviewed commit: "fix(mcp): point PRM at upstream issuer +..." | Re-trigger Greptile

@ryan-crabbe-berri ryan-crabbe-berri force-pushed the litellm_mcp_delegate_auth_error_and_discovery branch from 0c7f860 to bbd69b2 Compare May 22, 2026 04:23
@ryan-crabbe-berri ryan-crabbe-berri force-pushed the litellm_mcp_path_parser_canonical_url_regression branch from cff01fb to 01280db Compare May 22, 2026 04:23
Comment on lines +775 to +785
@router.get(
f"/.well-known/oauth-protected-resource{'' if get_server_root_path() == '/' else get_server_root_path()}/mcp/{{mcp_server_name}}/mcp"
)
async def oauth_protected_resource_mcp_canonical(
request: Request, mcp_server_name: str
):
return _build_oauth_protected_resource_response(
request=request,
mcp_server_name=mcp_server_name,
use_standard_pattern=True,
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Canonical PRM route returns wrong resource identifier

The canonical route handles discovery for the streamable-http endpoint at /mcp/{server_name}/mcp, but _build_oauth_protected_resource_response with use_standard_pattern=True sets resource = {base}/mcp/{server_name} (missing the trailing /mcp). RFC 9728 §2 states that resource MUST equal the URL clients use to access the protected resource; a client probing /.well-known/oauth-protected-resource/mcp/atlassian1/mcp implicitly declares that its resource URL is {base}/mcp/atlassian1/mcp. Any spec-conforming client that validates the resource field against the URL it is connecting to will see a mismatch and reject the metadata. The test for this route (test_canonical_protected_resource_route_returns_200_for_streamable_path) does not assert on body["resource"], so this gap is undetected. Consider adding a third resource-URL mode (e.g. use_canonical_pattern) that appends the extra /mcp suffix, or pass resource_url_override to the builder.

Comment on lines +862 to +866
# Issuer is the upstream authorization server's origin per RFC 8414.
parsed = urlparse(authorization_endpoint)
if not parsed.scheme or not parsed.netloc:
return None
issuer = f"{parsed.scheme}://{parsed.netloc}"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Issuer derivation strips path for path-based issuers

issuer is constructed as scheme://netloc, dropping any path component of authorization_endpoint. For most popular IdPs (Auth0, Okta SaaS, Atlassian) the issuer really is the origin, so the derivation is correct. However, for multi-tenant or self-hosted deployments — Keycloak realms (https://keycloak.example.com/realms/myrealm/protocol/openid-connect/auth), Azure AD B2C tenants, or any RFC 8414 issuer that includes a path — the emitted issuer would be https://keycloak.example.com instead of https://keycloak.example.com/realms/myrealm. Clients that use this issuer to locate the upstream discovery document (via {issuer}/.well-known/oauth-authorization-server) would then 404. Storing the issuer explicitly at registration time (rather than re-deriving it each request) would be safer; alternatively, verify the expected issuer against the upstream discovery document response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant