Rework auth #1

Merged
ScriptSmith merged 4 commits into main from rework-auth
Mar 2, 2026

@ScriptSmith
Owner

Rework authentication so that the gateway and control plane endpoints share one configuration, and reduce the options to four modes:

  • "none" - No authentication at all. For single-user local machine use. API keys can still be used for attributing cost etc, but are not required.
  • "api_key" - Require a standard API key using one of the supported headers.
  • "idp" - Require a session cookie obtained through the login flow, or a Bearer token. API keys can also be used, but are not required.
  • "iap" - Use headers from the reverse proxy to authenticate. API keys can also be used, but are not required.
    idiosyncrasies
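
The four modes above can be read as a small decision table over the credential a request carries. The following is a minimal sketch of that table, not the PR's implementation: `Credential`, `Decision`, and `decide` are hypothetical names, the real `AuthMode` carries configuration (for example `Iap(_)`), and rejecting a bearer token in `iap` mode is an assumption of this sketch.

```rust
/// Simplified stand-in for the PR's `AuthMode` (the real enum carries config).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthMode {
    None,
    ApiKey,
    Idp,
    Iap,
}

/// Hypothetical classification of what a request presents.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Credential {
    ApiKey,
    SessionCookie,
    BearerToken,
    ProxyHeaders,
}

/// Hypothetical outcome of the check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Authenticated,
    Anonymous,
    Rejected,
}

pub fn decide(mode: AuthMode, cred: Option<Credential>) -> Decision {
    match (mode, cred) {
        // API keys are accepted in every mode (required only in `api_key`).
        (_, Some(Credential::ApiKey)) => Decision::Authenticated,
        // "none": nothing is required, so anything else passes anonymously.
        (AuthMode::None, _) => Decision::Anonymous,
        // "idp": session cookie from the login flow, or a Bearer token.
        (AuthMode::Idp, Some(Credential::SessionCookie | Credential::BearerToken)) => {
            Decision::Authenticated
        }
        // "iap": identity headers set by the reverse proxy.
        (AuthMode::Iap, Some(Credential::ProxyHeaders)) => Decision::Authenticated,
        // Everything else, including a missing credential, is refused.
        _ => Decision::Rejected,
    }
}
```

In this sketch an API key short-circuits the mode check, which mirrors the "API keys can also be used" wording in the `idp` and `iap` bullets.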


Copilot AI left a comment

Pull request overview

This PR reworks Hadrian’s authentication configuration and runtime logic to use a single global [auth.mode] across gateway and admin endpoints. It removes the legacy “gateway vs admin” auth split and pushes JWT/SSO validation toward per-organization registries.

Changes:

  • Introduces unified auth mode config (none, api_key, idp, iap) and updates runtime auth plumbing accordingly (removing global OIDC authenticator and global JWT validator).
  • Extends SSRF URL validation with options to allow private IP ranges (while always blocking cloud metadata), and threads these options through OIDC/JWKS discovery paths.
  • Updates UI OpenAPI/docs/deploy examples and lockfile overrides to align with the new auth model and dependency security updates.
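
The SSRF rule in the second bullet (private ranges optionally allowed, cloud metadata always blocked) can be sketched as an IP-level check. This is an illustration under assumptions, not the code in `src/validation/url.rs`: `IpPolicy` and `is_ip_allowed` are hypothetical names, blocking all link-local addresses (not only `169.254.169.254`) is a choice made here, and IPv4-mapped IPv6 addresses are not handled.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Hypothetical options; the PR's `UrlValidationOptions` may look different.
#[derive(Debug, Clone, Copy, Default)]
pub struct IpPolicy {
    pub allow_private: bool,
    pub allow_loopback: bool,
}

/// Well-known IPv4 cloud metadata address (inside the link-local range).
const METADATA_V4: Ipv4Addr = Ipv4Addr::new(169, 254, 169, 254);

/// Returns true if a resolved address may be contacted under `policy`.
pub fn is_ip_allowed(ip: IpAddr, policy: IpPolicy) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            // Never overridable: metadata, other link-local, and 0.0.0.0.
            if v4 == METADATA_V4 || v4.is_link_local() || v4.is_unspecified() {
                return false;
            }
            if v4.is_loopback() {
                return policy.allow_loopback;
            }
            if v4.is_private() {
                return policy.allow_private;
            }
            true
        }
        IpAddr::V6(v6) => {
            let first = v6.segments()[0];
            // Never overridable: `::` and link-local fe80::/10.
            if v6.is_unspecified() || first & 0xffc0 == 0xfe80 {
                return false;
            }
            if v6.is_loopback() {
                return policy.allow_loopback;
            }
            // Unique local addresses fc00::/7 are the IPv6 "private" range.
            if first & 0xfe00 == 0xfc00 {
                return policy.allow_private;
            }
            true
        }
    }
}
```

The point of ordering the checks this way is that the opt-in flags are only consulted after the unconditional blocks, so enabling private URLs for a Docker setup cannot re-open the metadata endpoint.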

Reviewed changes

Copilot reviewed 51 out of 52 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| ui/src/api/openapi.json | Updates embedded API documentation/config examples for new [auth.mode] layout. |
| ui/pnpm-lock.yaml | Updates dependency overrides and lock entries (minimatch, serialize-javascript, etc.). |
| ui/package.json | Updates dependency overrides matching lockfile changes. |
| src/wizard.rs | Updates wizard output/config generation to emit [auth.mode] + related sections. |
| src/validation/url.rs | Adds UrlValidationOptions and validate_base_url_opts, expands SSRF rules + tests. |
| src/validation/mod.rs | Re-exports new URL validation API. |
| src/routes/ws.rs | Migrates WS auth to new auth config accessors and shared session validation. |
| src/routes/execution.rs | Updates test AppState initialization for removed fields. |
| src/routes/auth.rs | Removes global OIDC fallback; standardizes session config lookup and shared session store usage. |
| src/routes/api.rs | Updates comments to reference auth.mode. |
| src/routes/admin/ui_config.rs | Derives UI auth methods from AuthMode. |
| src/routes/admin/sso_connections.rs | Reports SSO connection info based on AuthMode (per-org SSO now). |
| src/routes/admin/sessions.rs | Removes global OIDC authenticator fallback for session store lookup. |
| src/routes/admin/session_info.rs | Reports session/auth mode info based on AuthMode. |
| src/routes/admin/org_sso_configs.rs | Uses new SSRF validation options and threads allow_private_urls into validation/registry updates. |
| src/routes/admin/mod.rs | Updates inline config examples to [auth.mode]/[auth.api_key]/[auth.session]. |
| src/routes/admin/me_api_keys.rs | Uses api_key_config() for key generation prefix selection. |
| src/routes/admin/api_keys.rs | Uses api_key_config() for key generation prefix selection. |
| src/openapi.rs | Updates OpenAPI doc snippets for new auth config structure. |
| src/middleware/combined.rs | Refactors header auth flow to key off AuthMode; updates tests. |
| src/middleware/authz.rs | Updates documentation wording around “auth not configured”. |
| src/middleware/admin.rs | Switches admin auth to per-org registry/session store and updates SSRF options threading. |
| src/main.rs | Removes global OIDC/JWT validator from AppState; updates route wiring and startup logging. |
| src/config/server.rs | Adds allow_private_urls server option with docs/default. |
| src/config/mod.rs | Updates validation rules and feature checks for new auth mode + IAP safety checks. |
| src/config/auth.rs | Introduces AuthMode, IapConfig, and unified auth config accessors/validation/tests. |
| src/auth/session_store.rs | Updates config path references in session model docs. |
| src/auth/registry.rs | Loads only OIDC configs in OIDC registry initialization. |
| src/auth/oidc.rs | Threads allow_private through discovery/JWKS lookup. |
| src/auth/gateway_jwt.rs | Threads allow_private through per-org JWT registry building/lookup. |
| src/auth/discovery.rs | Uses validate_base_url_opts and threads allow_private for SSRF validation. |
| docs/content/docs/troubleshooting.mdx | Updates auth config examples to [auth.mode]/[auth.api_key]. |
| docs/content/docs/security/index.mdx | Renames proxy auth to IAP and aligns examples with new mode model. |
| docs/content/docs/features/sso-admin-guide.mdx | Updates IdP/JWT requirements text to auth.mode = "idp". |
| docs/content/docs/features/multi-tenancy.mdx | Updates session config paths in docs. |
| docs/content/docs/configuration/auth.mdx | Major rewrite: documents unified auth modes, new sections, and removes legacy split. |
| docs/content/docs/authentication.mdx | Major rewrite: describes single [auth.mode] approach and updated scenarios. |
| docs/content/docs/api/authentication.mdx | Updates API auth doc to reflect per-org JWT routing via idp mode. |
| deploy/config/hadrian.university.toml | Updates example config to new auth mode + SSRF allow_private/allow_loopback notes. |
| deploy/config/hadrian.traefik.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.sqlite.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.sqlite-redis.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.saml.toml | Updates example config to idp mode and SSRF allow_private for Docker setups. |
| deploy/config/hadrian.redis-cluster.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.provider-health.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.production.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.postgres.toml | Updates auth config sections and IAP example to new layout. |
| deploy/config/hadrian.postgres-ha.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.observability.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.keycloak.toml | Updates example config to idp mode + allow_private/allow_loopback + new sections. |
| deploy/config/hadrian.dlq.toml | Updates auth config sections to new layout. |
| Dockerfile | Forces fresh main-crate rebuild by removing cached artifacts in the build stage. |
Files not reviewed (1)
  • ui/pnpm-lock.yaml: Language not supported

Comment on lines +68 to +80
/// Whether admin routes should be protected by authentication middleware.
///
/// Returns true for modes that have an admin auth mechanism (Idp uses sessions/bearer
/// tokens, Iap uses proxy headers). Returns false for `ApiKey` mode, where only gateway
/// (API) routes require keys and admin routes are unprotected — matching the legacy
/// behavior where `[auth.gateway]` could be set without `[auth.admin]`.
pub fn requires_admin_auth(&self) -> bool {
    match self.mode {
        AuthMode::None | AuthMode::ApiKey => false,
        #[cfg(feature = "sso")]
        AuthMode::Idp => true,
        AuthMode::Iap(_) => true,
    }

Copilot AI Mar 1, 2026

AuthMode::ApiKey is documented as “API key required everywhere”, but requires_admin_auth() returns false for ApiKey, which causes admin routes to be mounted without admin_auth_middleware and logged as “UNPROTECTED”. Either update the docs/comments to match the intended behavior, or change requires_admin_auth() (and admin auth implementation) so api_key mode actually protects /admin/* as well.
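
If the second option is the intended one, the change to the accessor could look roughly like the sketch below. It is a simplified stand-in, not the code in `src/config/auth.rs`: the enum drops the `IapConfig` payload and the `sso` feature gate, and `AuthConfig` is reduced to a single field. The admin middleware would also have to accept API keys for this to be usable, which is not shown.

```rust
/// Simplified stand-in for the PR's `AuthMode` (no payloads, no feature gates).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthMode {
    None,
    ApiKey,
    Idp,
    Iap,
}

/// Reduced version of the auth config, holding only the mode.
pub struct AuthConfig {
    pub mode: AuthMode,
}

impl AuthConfig {
    /// Variant of the reviewed method in which only `None` leaves
    /// admin routes unprotected; `ApiKey` now counts as an admin auth mode.
    pub fn requires_admin_auth(&self) -> bool {
        match self.mode {
            AuthMode::None => false,
            AuthMode::ApiKey | AuthMode::Idp | AuthMode::Iap => true,
        }
    }
}
```

Listing the protected modes explicitly, rather than using a wildcard arm, means a future fifth mode fails to compile until someone decides which side it belongs on.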

Comment on lines 2118 to 2141
@@ -2164,7 +2137,7 @@ pub fn build_app(config: &config::GatewayConfig, state: AppState) -> Router {
         app = app.merge(Router::new().nest("/admin", admin_routes));
     } else {
         tracing::warn!(
-            "Admin routes are UNPROTECTED - configure auth.admin for Zero Trust or OIDC authentication"
+            "Admin routes are UNPROTECTED - configure auth.mode type = \"idp\" or \"iap\" for authentication"
         );

Copilot AI Mar 1, 2026

build_app treats api_key mode as “admin routes unprotected” because it gates protection on config.auth.requires_admin_auth(). This conflicts with the new auth-mode docs/UI behavior that suggest api_key mode should still require credentials to access admin/control-plane endpoints. If api_key is meant to protect admin routes, this should mount protected routes + middleware for ApiKey too (or otherwise enforce auth on /admin/*).

Comment on lines +1056 to +1093
@@ -1073,8 +1072,25 @@ async fn try_oidc_session_auth(
         .parse()
         .map_err(|_| AuthError::InvalidToken)?;

-    // Get session from the OIDC authenticator's session store
-    let session = authenticator.get_session(session_id).await?;
+    // Get session from the registry's shared session store
+    let session = match registry.session_store().get_session(session_id).await {
+        Ok(Some(s)) => s,
+        Ok(None) => return Ok(None),
+        Err(e) => {
+            tracing::warn!(
+                session_id = %session_id,
+                error = %e,
+                "Failed to retrieve OIDC session"
+            );
+            return Ok(None);
+        }
+    };
+
+    // Check if session has expired
+    if session.is_expired() {
+        let _ = registry.session_store().delete_session(session_id).await;
+        return Ok(None);
+    }

Copilot AI Mar 1, 2026

try_oidc_session_auth now fetches the session directly and only checks absolute expiration. This bypasses the shared validate_and_refresh_session logic (inactivity timeout, last_activity refresh, etc.), so enhanced sessions may not behave correctly and idle sessions may remain valid longer than intended. Consider switching to validate_and_refresh_session(registry.session_store().as_ref(), session_id, &session_config.enhanced) and mapping its errors as needed.
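
To make the gap concrete, here is a minimal sketch of a check that enforces both the absolute expiry and an inactivity timeout, then refreshes `last_activity`. It is not the project's `validate_and_refresh_session`: `Session`, `Timeouts`, `SessionError`, and `validate_and_touch` are hypothetical, the store is left out, and the caller passes `now` explicitly so the logic is easy to test.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical in-memory view of a stored session.
pub struct Session {
    pub expires_at: SystemTime,
    pub last_activity: SystemTime,
}

/// Hypothetical settings; `None` disables the inactivity check.
pub struct Timeouts {
    pub inactivity: Option<Duration>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SessionError {
    Expired,
    Idle,
}

/// Checks absolute expiry first, then idleness, and only then
/// records the current request as the latest activity.
pub fn validate_and_touch(
    session: &mut Session,
    timeouts: &Timeouts,
    now: SystemTime,
) -> Result<(), SessionError> {
    if now >= session.expires_at {
        return Err(SessionError::Expired);
    }
    if let Some(limit) = timeouts.inactivity {
        // A clock that moved backwards is treated as zero idle time.
        let idle = now
            .duration_since(session.last_activity)
            .unwrap_or(Duration::ZERO);
        if idle > limit {
            return Err(SessionError::Idle);
        }
    }
    session.last_activity = now;
    Ok(())
}
```

A check that only calls `is_expired()` corresponds to the first `if` alone, which is why an idle session would stay valid until its absolute expiry.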

"info": {
"title": "Hadrian Gateway API",
"description": "**Hadrian Gateway** is an AI Gateway providing a unified OpenAI-compatible API for routing requests to multiple LLM providers.\n\n## Overview\n\nThe gateway provides two main API surfaces:\n\n- **Public API** (`/api/v1/*`) - OpenAI-compatible endpoints for LLM inference. Use these endpoints to create chat completions, text completions, embeddings, and list available models. Requires API key authentication.\n\n- **Admin API** (`/admin/v1/*`) - RESTful management endpoints for multi-tenant configuration. Manage organizations, projects, users, API keys, dynamic providers, usage tracking, and model pricing.\n\n## Authentication\n\nThe gateway supports multiple authentication methods for API access.\n\n### API Key Authentication\n\nAPI keys are the primary authentication method for programmatic access. Keys are created via the Admin API and scoped to organizations, projects, or users.\n\n**Using the Authorization header (recommended):**\n```\nAuthorization: Bearer gw_live_abc123def456...\n```\n\n**Using the X-API-Key header:**\n```\nX-API-Key: gw_live_abc123def456...\n```\n\nBoth headers are supported. 
The `Authorization: Bearer` format is recommended for compatibility with OpenAI client libraries.\n\n**Example request:**\n```bash\ncurl https://gateway.example.com/api/v1/chat/completions \\\n -H \\\"Authorization: Bearer gw_live_abc123def456...\\\" \\\n -H \\\"Content-Type: application/json\\\" \\\n -d '{\\\"model\\\": \\\"openai/gpt-4\\\", \\\"messages\\\": [{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Hello\\\"}]}'\n```\n\n### JWT Authentication\n\nWhen JWT authentication is enabled, requests can be authenticated using a JWT token from your identity provider.\n\n```\nAuthorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...\n```\n\nThe gateway validates the JWT against the configured JWKS endpoint and extracts the identity from the token claims.\n\n**Example request:**\n```bash\ncurl https://gateway.example.com/api/v1/chat/completions \\\n -H \\\"Authorization: Bearer eyJhbGciOiJSUzI1NiIs...\\\" \\\n -H \\\"Content-Type: application/json\\\" \\\n -d '{\\\"model\\\": \\\"openai/gpt-4\\\", \\\"messages\\\": [{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Hello\\\"}]}'\n```\n\n### Multi-Auth Mode\n\nWhen configured for multi-auth, the gateway accepts both API keys and JWTs using **format-based detection**:\n\n- **X-API-Key header**: Always validated as an API key\n- **Authorization: Bearer header**: Uses format-based detection:\n - Tokens starting with the configured API key prefix (default: `gw_`) are validated as API keys\n - All other tokens are validated as JWTs\n\n**Important:** Providing both `X-API-Key` and `Authorization` headers simultaneously results in a 400 error (ambiguous credentials). 
Choose one authentication method per request.\n\n**Examples:**\n```bash\n# API key in X-API-Key header\ncurl -H \\\"X-API-Key: gw_live_abc123...\\\" https://gateway.example.com/v1/chat/completions\n\n# API key in Authorization: Bearer header (format-based detection)\ncurl -H \\\"Authorization: Bearer gw_live_abc123...\\\" https://gateway.example.com/v1/chat/completions\n\n# JWT in Authorization: Bearer header\ncurl -H \\\"Authorization: Bearer eyJhbGciOiJSUzI1NiIs...\\\" https://gateway.example.com/v1/chat/completions\n```\n\n### Authentication Errors\n\n| Error Code | HTTP Status | Description | Example Response |\n|------------|-------------|-------------|------------------|\n| `unauthorized` | 401 | No authentication credentials provided | `{\\\"error\\\": {\\\"code\\\": \\\"unauthorized\\\", \\\"message\\\": \\\"Authentication required\\\"}}` |\n| `ambiguous_credentials` | 400 | Both X-API-Key and Authorization headers provided | `{\\\"error\\\": {\\\"code\\\": \\\"ambiguous_credentials\\\", \\\"message\\\": \\\"Ambiguous credentials: provide either X-API-Key or Authorization header, not both\\\"}}` |\n| `invalid_api_key` | 401 | API key is invalid, malformed, or revoked | `{\\\"error\\\": {\\\"code\\\": \\\"invalid_api_key\\\", \\\"message\\\": \\\"Invalid API key\\\"}}` |\n| `not_authenticated` | 401 | JWT validation failed | `{\\\"error\\\": {\\\"code\\\": \\\"not_authenticated\\\", \\\"message\\\": \\\"Token validation failed\\\"}}` |\n| `forbidden` | 403 | Valid credentials but insufficient permissions | `{\\\"error\\\": {\\\"code\\\": \\\"forbidden\\\", \\\"message\\\": \\\"Insufficient permissions\\\"}}` |\n\n### Configuration Examples\n\n**API Key Authentication:**\n```toml\n[auth.gateway]\ntype = \\\"api_key\\\"\nheader_name = \\\"X-API-Key\\\" # Header to read API key from\nkey_prefix = \\\"gw_\\\" # Valid key prefix\ncache_ttl_secs = 60 # Cache key lookups for 60 seconds\n```\n\n**JWT Authentication:**\n```toml\n[auth.gateway]\ntype = 
\\\"jwt\\\"\nissuer = \\\"https://auth.example.com\\\"\naudience = \\\"gateway-api\\\"\njwks_url = \\\"https://auth.example.com/.well-known/jwks.json\\\"\nidentity_claim = \\\"sub\\\" # JWT claim for user identity\n```\n\n**Multi-Auth (both API key and JWT):**\n```toml\n[auth.gateway]\ntype = \\\"multi\\\"\n\n[auth.gateway.api_key]\nheader_name = \\\"X-API-Key\\\"\nkey_prefix = \\\"gw_\\\"\n\n[auth.gateway.jwt]\nissuer = \\\"https://auth.example.com\\\"\naudience = \\\"gateway-api\\\"\njwks_url = \\\"https://auth.example.com/.well-known/jwks.json\\\"\n```\n\n## Pagination\n\nAll Admin API list endpoints use **cursor-based pagination** for stable, performant navigation.\n\n**Query Parameters:**\n- `limit` (optional): Maximum records per page (default: 100, max: 1000)\n- `cursor` (optional): Opaque cursor from previous response's `next_cursor` or `prev_cursor`\n- `direction` (optional): `forward` (default) or `backward`\n\n**Response:**\n```json\n{\n \\\"data\\\": [...],\n \\\"pagination\\\": {\n \\\"limit\\\": 100,\n \\\"has_more\\\": true,\n \\\"next_cursor\\\": \\\"MTczMzU4MDgwMDAwMDphYmMxMjM0...\\\",\n \\\"prev_cursor\\\": null\n }\n}\n```\n\n## Model Routing\n\nModels can be addressed in several ways:\n\n- **Static routing**: `provider-name/model-name` routes to config-defined providers\n- **Dynamic routing**: `:org/{ORG}/{PROVIDER}/{MODEL}` routes to database-backed providers\n- **Default**: When no prefix is specified, routes to the default provider\n\n## Error Codes\n\nAll errors follow a consistent JSON format:\n\n```json\n{\n \\\"error\\\": {\n \\\"code\\\": \\\"error_code\\\",\n \\\"message\\\": \\\"Human-readable error message\\\",\n \\\"details\\\": { ... 
} // Optional additional context\n }\n}\n```\n\n### Authentication & Authorization Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `unauthorized` | 401 | Missing or invalid API key/token |\n| `invalid_api_key` | 401 | API key is invalid, expired, or revoked |\n| `forbidden` | 403 | Valid credentials but insufficient permissions |\n| `not_authenticated` | 401 | Authentication required for this operation |\n\n### Rate Limiting & Budget Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `rate_limit_exceeded` | 429 | Request rate limit exceeded. Check `Retry-After` header. |\n| `budget_exceeded` | 402 | Budget limit exceeded for the configured period. Details include `limit_cents`, `current_spend_cents`, and `period`. |\n| `cache_required` | 503 | Budget enforcement requires cache to be configured |\n\n### Request Validation Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `validation_error` | 400 | Request body validation failed |\n| `bad_request` | 400 | Malformed request |\n| `routing_error` | 400 | Model routing failed (invalid model string or provider not found) |\n| `not_found` | 404 | Requested resource not found |\n| `conflict` | 409 | Resource already exists or conflicts with existing state |\n\n### Provider & Gateway Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `provider_error` | 502 | Upstream LLM provider returned an error |\n| `request_failed` | 502 | Failed to communicate with upstream provider |\n| `circuit_breaker_open` | 503 | Provider circuit breaker is open due to repeated failures |\n| `response_read_error` | 500 | Failed to read provider response |\n| `response_builder` | 500 | Failed to build response from provider data |\n| `internal_error` | 500 | Internal server error |\n\n### Guardrails Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| 
`guardrails_blocked` | 400 | Content blocked by guardrails policy. Response includes `violations` array. |\n| `guardrails_timeout` | 504 | Guardrails evaluation timed out |\n| `guardrails_provider_error` | 502 | Error communicating with guardrails provider |\n| `guardrails_auth_error` | 502 | Authentication failed with guardrails provider |\n| `guardrails_rate_limited` | 429 | Guardrails provider rate limit exceeded |\n| `guardrails_config_error` | 500 | Invalid guardrails configuration |\n| `guardrails_parse_error` | 400 | Failed to parse content for guardrails evaluation |\n\n### Admin API Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `database_required` | 503 | Database not configured (required for admin operations) |\n| `services_required` | 503 | Required services not initialized |\n| `not_configured` | 503 | Required feature or service not configured |\n| `database_error` | 500 | Database operation failed |\n\n## Rate Limiting\n\nThe gateway implements multiple layers of rate limiting to protect against abuse and ensure fair usage.\n\n### Rate Limit Types\n\n| Type | Scope | Default | Description |\n|------|-------|---------|-------------|\n| **Requests per minute** | API Key | 60 | Maximum requests per minute per API key |\n| **Requests per day** | API Key | Unlimited | Optional daily request limit per API key |\n| **Tokens per minute** | API Key | 100,000 | Maximum tokens processed per minute |\n| **Tokens per day** | API Key | Unlimited | Optional daily token limit |\n| **Concurrent requests** | API Key | 10 | Maximum simultaneous in-flight requests |\n| **IP requests per minute** | IP Address | 120 | Rate limit for unauthenticated requests |\n\n### Rate Limit Headers\n\nAll API responses include rate limit information in HTTP headers.\n\n#### Request Rate Limit Headers\n\n| Header | Description | Example |\n|--------|-------------|---------|\n| `X-RateLimit-Limit` | Maximum requests allowed in the current window | 
`60` |\n| `X-RateLimit-Remaining` | Requests remaining in the current window | `45` |\n| `X-RateLimit-Reset` | Seconds until the rate limit window resets | `42` |\n\n#### Token Rate Limit Headers\n\n| Header | Description | Example |\n|--------|-------------|---------|\n| `X-TokenRateLimit-Limit` | Maximum tokens allowed per minute | `100000` |\n| `X-TokenRateLimit-Remaining` | Tokens remaining in the current minute | `85000` |\n| `X-TokenRateLimit-Used` | Tokens used in the current minute | `15000` |\n| `X-TokenRateLimit-Day-Limit` | Maximum tokens allowed per day (if configured) | `1000000` |\n| `X-TokenRateLimit-Day-Remaining` | Tokens remaining today (if configured) | `950000` |\n\n#### Rate Limit Exceeded Response\n\nWhen a rate limit is exceeded, the API returns HTTP 429 with:\n\n```json\n{\n \\\"error\\\": {\n \\\"code\\\": \\\"rate_limit_exceeded\\\",\n \\\"message\\\": \\\"Rate limit exceeded: 60 requests per minute\\\",\n \\\"details\\\": {\n \\\"limit\\\": 60,\n \\\"window\\\": \\\"minute\\\",\n \\\"retry_after_secs\\\": 42\n }\n }\n}\n```\n\nThe `Retry-After` header indicates seconds to wait before retrying:\n\n```\nHTTP/1.1 429 Too Many Requests\nRetry-After: 42\nX-RateLimit-Limit: 60\nX-RateLimit-Remaining: 0\nX-RateLimit-Reset: 42\n```\n\n### IP-Based Rate Limiting\n\nUnauthenticated requests (requests without a valid API key) are rate limited by IP address. This protects public endpoints like `/health` from abuse.\n\n- **Default:** 120 requests per minute per IP\n- **Client IP Detection:** Respects `X-Forwarded-For` and `X-Real-IP` headers when trusted proxies are configured\n- **Configuration:** Can be disabled or adjusted via `limits.rate_limits.ip_rate_limits` in config\n\n### Rate Limit Configuration\n\nRate limits are configured hierarchically:\n\n1. 
**Global defaults** (in `hadrian.toml`):\n```toml\n[limits.rate_limits]\nrequests_per_minute = 60\ntokens_per_minute = 100000\nconcurrent_requests = 10\n\n[limits.rate_limits.ip_rate_limits]\nenabled = true\nrequests_per_minute = 120\n```\n\n2. **Per-API key** limits can override global defaults (when creating API keys via Admin API)\n\n### Best Practices\n\n- **Implement exponential backoff**: When receiving 429 responses, wait the `Retry-After` duration before retrying\n- **Monitor rate limit headers**: Track `X-RateLimit-Remaining` to proactively throttle requests\n- **Use streaming for long responses**: Streaming responses don't hold connections during generation\n- **Batch requests when possible**: Combine multiple small requests into larger batches\n",
"description": "**Hadrian Gateway** is an AI Gateway providing a unified OpenAI-compatible API for routing requests to multiple LLM providers.\n\n## Overview\n\nThe gateway provides two main API surfaces:\n\n- **Public API** (`/api/v1/*`) - OpenAI-compatible endpoints for LLM inference. Use these endpoints to create chat completions, text completions, embeddings, and list available models. Requires API key authentication.\n\n- **Admin API** (`/admin/v1/*`) - RESTful management endpoints for multi-tenant configuration. Manage organizations, projects, users, API keys, dynamic providers, usage tracking, and model pricing.\n\n## Authentication\n\nThe gateway supports multiple authentication methods for API access.\n\n### API Key Authentication\n\nAPI keys are the primary authentication method for programmatic access. Keys are created via the Admin API and scoped to organizations, projects, or users.\n\n**Using the Authorization header (recommended):**\n```\nAuthorization: Bearer gw_live_abc123def456...\n```\n\n**Using the X-API-Key header:**\n```\nX-API-Key: gw_live_abc123def456...\n```\n\nBoth headers are supported. 
The `Authorization: Bearer` format is recommended for compatibility with OpenAI client libraries.\n\n**Example request:**\n```bash\ncurl https://gateway.example.com/api/v1/chat/completions \\\n -H \\\"Authorization: Bearer gw_live_abc123def456...\\\" \\\n -H \\\"Content-Type: application/json\\\" \\\n -d '{\\\"model\\\": \\\"openai/gpt-4\\\", \\\"messages\\\": [{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Hello\\\"}]}'\n```\n\n### JWT Authentication\n\nWhen JWT authentication is enabled, requests can be authenticated using a JWT token from your identity provider.\n\n```\nAuthorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...\n```\n\nThe gateway validates the JWT against the configured JWKS endpoint and extracts the identity from the token claims.\n\n**Example request:**\n```bash\ncurl https://gateway.example.com/api/v1/chat/completions \\\n -H \\\"Authorization: Bearer eyJhbGciOiJSUzI1NiIs...\\\" \\\n -H \\\"Content-Type: application/json\\\" \\\n -d '{\\\"model\\\": \\\"openai/gpt-4\\\", \\\"messages\\\": [{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Hello\\\"}]}'\n```\n\n### Multi-Auth Mode\n\nWhen configured for multi-auth, the gateway accepts both API keys and JWTs using **format-based detection**:\n\n- **X-API-Key header**: Always validated as an API key\n- **Authorization: Bearer header**: Uses format-based detection:\n - Tokens starting with the configured API key prefix (default: `gw_`) are validated as API keys\n - All other tokens are validated as JWTs\n\n**Important:** Providing both `X-API-Key` and `Authorization` headers simultaneously results in a 400 error (ambiguous credentials). 
Choose one authentication method per request.\n\n**Examples:**\n```bash\n# API key in X-API-Key header\ncurl -H \\\"X-API-Key: gw_live_abc123...\\\" https://gateway.example.com/v1/chat/completions\n\n# API key in Authorization: Bearer header (format-based detection)\ncurl -H \\\"Authorization: Bearer gw_live_abc123...\\\" https://gateway.example.com/v1/chat/completions\n\n# JWT in Authorization: Bearer header\ncurl -H \\\"Authorization: Bearer eyJhbGciOiJSUzI1NiIs...\\\" https://gateway.example.com/v1/chat/completions\n```\n\n### Authentication Errors\n\n| Error Code | HTTP Status | Description | Example Response |\n|------------|-------------|-------------|------------------|\n| `unauthorized` | 401 | No authentication credentials provided | `{\\\"error\\\": {\\\"code\\\": \\\"unauthorized\\\", \\\"message\\\": \\\"Authentication required\\\"}}` |\n| `ambiguous_credentials` | 400 | Both X-API-Key and Authorization headers provided | `{\\\"error\\\": {\\\"code\\\": \\\"ambiguous_credentials\\\", \\\"message\\\": \\\"Ambiguous credentials: provide either X-API-Key or Authorization header, not both\\\"}}` |\n| `invalid_api_key` | 401 | API key is invalid, malformed, or revoked | `{\\\"error\\\": {\\\"code\\\": \\\"invalid_api_key\\\", \\\"message\\\": \\\"Invalid API key\\\"}}` |\n| `not_authenticated` | 401 | JWT validation failed | `{\\\"error\\\": {\\\"code\\\": \\\"not_authenticated\\\", \\\"message\\\": \\\"Token validation failed\\\"}}` |\n| `forbidden` | 403 | Valid credentials but insufficient permissions | `{\\\"error\\\": {\\\"code\\\": \\\"forbidden\\\", \\\"message\\\": \\\"Insufficient permissions\\\"}}` |\n\n### Configuration Examples\n\n**API Key Authentication:**\n```toml\n[auth.mode]\ntype = \\\"api_key\\\"\n\n[auth.api_key]\nheader_name = \\\"X-API-Key\\\" # Header to read API key from\nkey_prefix = \\\"gw_\\\" # Valid key prefix\ncache_ttl_secs = 60 # Cache key lookups for 60 seconds\n```\n\n**IdP Authentication (SSO + API keys + 
JWT):**\n```toml\n[auth.mode]\ntype = \\\"idp\\\"\n\n[auth.api_key]\nheader_name = \\\"X-API-Key\\\"\nkey_prefix = \\\"gw_\\\"\n\n[auth.session]\nsecure = true\n```\n\n**Identity-Aware Proxy (IAP):**\n```toml\n[auth.mode]\ntype = \\\"iap\\\"\nidentity_header = \\\"X-Forwarded-User\\\"\nemail_header = \\\"X-Forwarded-Email\\\"\n```\n\n## Pagination\n\nAll Admin API list endpoints use **cursor-based pagination** for stable, performant navigation.\n\n**Query Parameters:**\n- `limit` (optional): Maximum records per page (default: 100, max: 1000)\n- `cursor` (optional): Opaque cursor from previous response's `next_cursor` or `prev_cursor`\n- `direction` (optional): `forward` (default) or `backward`\n\n**Response:**\n```json\n{\n \\\"data\\\": [...],\n \\\"pagination\\\": {\n \\\"limit\\\": 100,\n \\\"has_more\\\": true,\n \\\"next_cursor\\\": \\\"MTczMzU4MDgwMDAwMDphYmMxMjM0...\\\",\n \\\"prev_cursor\\\": null\n }\n}\n```\n\n## Model Routing\n\nModels can be addressed in several ways:\n\n- **Static routing**: `provider-name/model-name` routes to config-defined providers\n- **Dynamic routing**: `:org/{ORG}/{PROVIDER}/{MODEL}` routes to database-backed providers\n- **Default**: When no prefix is specified, routes to the default provider\n\n## Error Codes\n\nAll errors follow a consistent JSON format:\n\n```json\n{\n \\\"error\\\": {\n \\\"code\\\": \\\"error_code\\\",\n \\\"message\\\": \\\"Human-readable error message\\\",\n \\\"details\\\": { ... 
} // Optional additional context\n }\n}\n```\n\n### Authentication & Authorization Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `unauthorized` | 401 | Missing or invalid API key/token |\n| `invalid_api_key` | 401 | API key is invalid, expired, or revoked |\n| `forbidden` | 403 | Valid credentials but insufficient permissions |\n| `not_authenticated` | 401 | Authentication required for this operation |\n\n### Rate Limiting & Budget Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `rate_limit_exceeded` | 429 | Request rate limit exceeded. Check `Retry-After` header. |\n| `budget_exceeded` | 402 | Budget limit exceeded for the configured period. Details include `limit_cents`, `current_spend_cents`, and `period`. |\n| `cache_required` | 503 | Budget enforcement requires cache to be configured |\n\n### Request Validation Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `validation_error` | 400 | Request body validation failed |\n| `bad_request` | 400 | Malformed request |\n| `routing_error` | 400 | Model routing failed (invalid model string or provider not found) |\n| `not_found` | 404 | Requested resource not found |\n| `conflict` | 409 | Resource already exists or conflicts with existing state |\n\n### Provider & Gateway Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `provider_error` | 502 | Upstream LLM provider returned an error |\n| `request_failed` | 502 | Failed to communicate with upstream provider |\n| `circuit_breaker_open` | 503 | Provider circuit breaker is open due to repeated failures |\n| `response_read_error` | 500 | Failed to read provider response |\n| `response_builder` | 500 | Failed to build response from provider data |\n| `internal_error` | 500 | Internal server error |\n\n### Guardrails Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| 
`guardrails_blocked` | 400 | Content blocked by guardrails policy. Response includes `violations` array. |\n| `guardrails_timeout` | 504 | Guardrails evaluation timed out |\n| `guardrails_provider_error` | 502 | Error communicating with guardrails provider |\n| `guardrails_auth_error` | 502 | Authentication failed with guardrails provider |\n| `guardrails_rate_limited` | 429 | Guardrails provider rate limit exceeded |\n| `guardrails_config_error` | 500 | Invalid guardrails configuration |\n| `guardrails_parse_error` | 400 | Failed to parse content for guardrails evaluation |\n\n### Admin API Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `database_required` | 503 | Database not configured (required for admin operations) |\n| `services_required` | 503 | Required services not initialized |\n| `not_configured` | 503 | Required feature or service not configured |\n| `database_error` | 500 | Database operation failed |\n\n## Rate Limiting\n\nThe gateway implements multiple layers of rate limiting to protect against abuse and ensure fair usage.\n\n### Rate Limit Types\n\n| Type | Scope | Default | Description |\n|------|-------|---------|-------------|\n| **Requests per minute** | API Key | 60 | Maximum requests per minute per API key |\n| **Requests per day** | API Key | Unlimited | Optional daily request limit per API key |\n| **Tokens per minute** | API Key | 100,000 | Maximum tokens processed per minute |\n| **Tokens per day** | API Key | Unlimited | Optional daily token limit |\n| **Concurrent requests** | API Key | 10 | Maximum simultaneous in-flight requests |\n| **IP requests per minute** | IP Address | 120 | Rate limit for unauthenticated requests |\n\n### Rate Limit Headers\n\nAll API responses include rate limit information in HTTP headers.\n\n#### Request Rate Limit Headers\n\n| Header | Description | Example |\n|--------|-------------|---------|\n| `X-RateLimit-Limit` | Maximum requests allowed in the current window | 
`60` |\n| `X-RateLimit-Remaining` | Requests remaining in the current window | `45` |\n| `X-RateLimit-Reset` | Seconds until the rate limit window resets | `42` |\n\n#### Token Rate Limit Headers\n\n| Header | Description | Example |\n|--------|-------------|---------|\n| `X-TokenRateLimit-Limit` | Maximum tokens allowed per minute | `100000` |\n| `X-TokenRateLimit-Remaining` | Tokens remaining in the current minute | `85000` |\n| `X-TokenRateLimit-Used` | Tokens used in the current minute | `15000` |\n| `X-TokenRateLimit-Day-Limit` | Maximum tokens allowed per day (if configured) | `1000000` |\n| `X-TokenRateLimit-Day-Remaining` | Tokens remaining today (if configured) | `950000` |\n\n#### Rate Limit Exceeded Response\n\nWhen a rate limit is exceeded, the API returns HTTP 429 with:\n\n```json\n{\n \\\"error\\\": {\n \\\"code\\\": \\\"rate_limit_exceeded\\\",\n \\\"message\\\": \\\"Rate limit exceeded: 60 requests per minute\\\",\n \\\"details\\\": {\n \\\"limit\\\": 60,\n \\\"window\\\": \\\"minute\\\",\n \\\"retry_after_secs\\\": 42\n }\n }\n}\n```\n\nThe `Retry-After` header indicates seconds to wait before retrying:\n\n```\nHTTP/1.1 429 Too Many Requests\nRetry-After: 42\nX-RateLimit-Limit: 60\nX-RateLimit-Remaining: 0\nX-RateLimit-Reset: 42\n```\n\n### IP-Based Rate Limiting\n\nUnauthenticated requests (requests without a valid API key) are rate limited by IP address. This protects public endpoints like `/health` from abuse.\n\n- **Default:** 120 requests per minute per IP\n- **Client IP Detection:** Respects `X-Forwarded-For` and `X-Real-IP` headers when trusted proxies are configured\n- **Configuration:** Can be disabled or adjusted via `limits.rate_limits.ip_rate_limits` in config\n\n### Rate Limit Configuration\n\nRate limits are configured hierarchically:\n\n1. 
**Global defaults** (in `hadrian.toml`):\n```toml\n[limits.rate_limits]\nrequests_per_minute = 60\ntokens_per_minute = 100000\nconcurrent_requests = 10\n\n[limits.rate_limits.ip_rate_limits]\nenabled = true\nrequests_per_minute = 120\n```\n\n2. **Per-API key** limits can override global defaults (when creating API keys via Admin API)\n\n### Best Practices\n\n- **Implement exponential backoff**: When receiving 429 responses, wait the `Retry-After` duration before retrying\n- **Monitor rate limit headers**: Track `X-RateLimit-Remaining` to proactively throttle requests\n- **Use streaming for long responses**: Streaming responses don't hold connections during generation\n- **Batch requests when possible**: Combine multiple small requests into larger batches\n",

Copilot AI Mar 1, 2026

The OpenAPI description still states that the Public API “Requires API key authentication”, but this PR introduces multiple auth modes (none, idp, iap) where API keys may be optional or JWT/session/proxy headers can authenticate. The description should be updated to reflect the new auth.mode behavior so client developers don’t get misleading guidance.

Comment on lines +1209 to +1221
AuthMode::None => {
    // Optional auth: try API key if header present, don't require it
    let api_key = try_api_key_auth(headers, state).await?;
    let identity = try_identity_auth(headers, connecting_ip, state).await?;
    let kind = match (api_key, identity) {
        (Some(api_key), Some(identity)) => IdentityKind::Both {
            api_key: Box::new(api_key),
            identity,
        },
        (Some(api_key), None) => IdentityKind::ApiKey(api_key),
        (None, Some(identity)) => IdentityKind::Identity(identity),
        (None, None) => return Err(AuthError::MissingCredentials),
    };

Copilot AI Mar 1, 2026

In try_authenticate, the AuthMode::None branch is documented as “optional auth (…don’t require it)”, but it returns Err(MissingCredentials) when no credentials are present. This makes it easy for callers to accidentally treat none mode as requiring auth (or to implement brittle “ignore MissingCredentials” logic). Consider changing the return type to Result<Option<AuthenticatedRequest>, AuthError> (return Ok(None) for truly unauthenticated requests), or at least aligning the docs/behavior so AuthMode::None doesn’t signal an error for the no-credentials case.
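
The `Result<Option<...>, AuthError>` shape suggested above could look roughly like the following. This is a minimal sketch, not the gateway's code: `AuthError`, `IdentityKind`, and `combine_optional` are simplified, hypothetical stand-ins (plain `String`s replace the real API key and identity types), and the two lookups are passed in as already-evaluated results.

```rust
// Hypothetical, simplified stand-ins for the PR's types.
#[derive(Debug, PartialEq)]
pub enum AuthError {
    InvalidCredentials,
}

#[derive(Debug, PartialEq)]
pub enum IdentityKind {
    ApiKey(String),
    Identity(String),
    Both { api_key: String, identity: String },
}

/// Combine the two optional lookups performed in `none` mode.
/// - A lookup that failed (credentials presented but invalid) propagates its error.
/// - No credentials at all yields `Ok(None)` instead of an error.
pub fn combine_optional(
    api_key: Result<Option<String>, AuthError>,
    identity: Result<Option<String>, AuthError>,
) -> Result<Option<IdentityKind>, AuthError> {
    let kind = match (api_key?, identity?) {
        (Some(api_key), Some(identity)) => Some(IdentityKind::Both { api_key, identity }),
        (Some(api_key), None) => Some(IdentityKind::ApiKey(api_key)),
        (None, Some(identity)) => Some(IdentityKind::Identity(identity)),
        // Truly anonymous request: not an error in `none` mode.
        (None, None) => None,
    };
    Ok(kind)
}
```

With this shape, callers can match on `Ok(None)` for anonymous access and no longer need to special-case a `MissingCredentials` error.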

@greptile-apps

greptile-apps bot commented Mar 2, 2026

Greptile Summary

This PR consolidates authentication from separate gateway and admin configurations into a unified mode system with four clear options: none (local dev), api_key (programmatic only), idp (full SSO + API keys), and iap (reverse proxy auth). The refactoring simplifies the auth architecture by removing the global OIDC/JWT validators in favor of per-organization SSO configurations.

Key Changes

  • Unified auth mode: [auth.mode] replaces separate [auth.gateway] and [auth.admin] sections
  • Per-org SSO only: Removed global OIDC authenticator, all SSO is now organization-scoped
  • Shared session store: Session management consolidated across OIDC/SAML providers
  • Format-based detection: In IdP mode, bearer tokens are routed to API key or JWT validation based on prefix
  • Enhanced SSRF protection: Added an allow_private flag for Docker/K8s environments; the cloud metadata endpoint (169.254.169.254) is always blocked
  • Bootstrap command: New CLI command for automated initial setup from config
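
As an illustration of the format-based detection item above, a bearer token might be routed by prefix along these lines. This is a sketch under assumptions: `BearerKind` and `classify_bearer` are hypothetical names, and the `"gw_"` prefix in the usage note is taken from the `key_prefix` value that appears in the example configs, not from the middleware itself.

```rust
/// Hypothetical classification of a bearer token in IdP mode.
#[derive(Debug, PartialEq)]
pub enum BearerKind<'a> {
    ApiKey(&'a str),
    Jwt(&'a str),
}

/// Route the value of an `Authorization` header by token format.
/// Returns `None` when the header is not a non-empty `Bearer` credential.
pub fn classify_bearer<'a>(authorization: &'a str, key_prefix: &str) -> Option<BearerKind<'a>> {
    let token = authorization.strip_prefix("Bearer ")?.trim();
    if token.is_empty() {
        return None;
    }
    // Guard against an empty prefix, which would otherwise match every token.
    if !key_prefix.is_empty() && token.starts_with(key_prefix) {
        Some(BearerKind::ApiKey(token))
    } else {
        Some(BearerKind::Jwt(token))
    }
}
```

For example, `classify_bearer("Bearer gw_123", "gw_")` yields `Some(BearerKind::ApiKey("gw_123"))`, while any other bearer token is handed to JWT validation.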

Testing Recommendations

  • Verify all auth modes work correctly in their intended environments
  • Test that None mode properly handles both anonymous requests and optional API key validation
  • Confirm IdP mode correctly rejects ambiguous credentials (simultaneous X-API-Key and Authorization headers)
  • Validate that IAP mode requires trusted proxy configuration to prevent header spoofing
  • Test the Bootstrap command in both dry-run and live modes

Confidence Score: 4/5

  • This PR is safe to merge with careful testing of the new authentication modes in their target environments
  • This is a large refactoring with extensive changes to authentication logic across 56 files. The consolidation from separate gateway/admin auth to unified modes is architecturally sound and well-documented, and SSRF protections are properly maintained. However, the scope of changes affecting authentication and authorization requires thorough testing in all deployment scenarios (None, ApiKey, Idp, Iap modes) before production deployment.
  • Pay close attention to src/middleware/combined.rs and src/middleware/admin.rs for authentication flow validation across all modes

Important Files Changed

| Filename | Overview |
|----------|----------|
| src/config/auth.rs | Replaced separate gateway/admin auth configs with unified AuthMode enum (None, ApiKey, Idp, Iap) and consolidated helper methods |
| src/middleware/combined.rs | Updated API middleware to use new AuthMode, added session cookie support for IdP mode, implemented format-based detection for API keys vs JWTs |
| src/middleware/admin.rs | Added API key admin auth for ApiKey mode, updated proxy auth to use IAP config, removed global OIDC authenticator dependency |
| src/main.rs | Removed global OIDC/JWT validators, added Bootstrap CLI command, updated route registration to use unified auth mode checks |
| src/routes/auth.rs | Removed global OIDC fallback - all SSO is now per-org, updated session management to use shared registry session store |
| src/validation/url.rs | Added allow_private option for Docker/K8s environments, cloud metadata endpoint (169.254.169.254) always blocked regardless of flags |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    Start[Incoming Request] --> CheckMode{Auth Mode?}
    
    CheckMode -->|None| CheckCreds{Has Credentials?}
    CheckCreds -->|No| AllowAnon[Allow Anonymous]
    CheckCreds -->|Yes| ValidateOptional[Validate API Key]
    ValidateOptional -->|Valid| AuthSuccess[Authenticated]
    ValidateOptional -->|Invalid| AuthFail[401 Unauthorized]
    
    CheckMode -->|ApiKey| ValidateKey[Validate API Key Required]
    ValidateKey -->|Valid| AuthSuccess
    ValidateKey -->|Invalid/Missing| AuthFail
    
    CheckMode -->|Idp| CheckDual{Both X-API-Key<br/>and Authorization?}
    CheckDual -->|Yes| Ambiguous[400 Ambiguous Credentials]
    CheckDual -->|No| TrySession[Try Session Cookie]
    TrySession -->|Valid| AuthSuccess
    TrySession -->|None| TryApiKeyOrJWT[Try API Key or JWT]
    TryApiKeyOrJWT -->|API Key Valid| AuthSuccess
    TryApiKeyOrJWT -->|JWT Valid| AuthSuccess
    TryApiKeyOrJWT -->|None/Invalid| AuthFail
    
    CheckMode -->|Iap| CheckProxy{From Trusted<br/>Proxy?}
    CheckProxy -->|No| ProxyFail[403 Forbidden]
    CheckProxy -->|Yes| TryIapHeaders[Try Identity Headers]
    TryIapHeaders -->|Valid| AuthSuccess
    TryIapHeaders -->|None| TryIapKey[Try API Key]
    TryIapKey -->|Valid| AuthSuccess
    TryIapKey -->|None/Invalid| AuthFail
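
The Idp branch of the flowchart can be read as a small decision function. The sketch below is only an illustration: `Outcome`, `IdpRequest`, and `idp_outcome` are hypothetical names, credential checks are reduced to booleans, and a session that is present but invalid is assumed to fall through to the token check, which the chart does not specify.

```rust
/// Hypothetical outcomes mirroring the flowchart's terminal nodes.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Authenticated,
    AmbiguousCredentials, // 400 in the chart
    Unauthorized,         // 401 in the chart
}

/// Simplified view of a request in IdP mode (assumed shape, booleans only).
pub struct IdpRequest {
    pub has_api_key_header: bool,
    pub has_authorization_header: bool,
    pub session_valid: bool,
    /// Result of validating the presented API key or JWT, if any.
    pub token_valid: bool,
}

pub fn idp_outcome(req: &IdpRequest) -> Outcome {
    // Both headers at once is rejected before anything else is tried.
    if req.has_api_key_header && req.has_authorization_header {
        return Outcome::AmbiguousCredentials;
    }
    // A valid session cookie wins next.
    if req.session_valid {
        return Outcome::Authenticated;
    }
    // Otherwise fall back to the API key or JWT, if one was presented.
    if (req.has_api_key_header || req.has_authorization_header) && req.token_valid {
        return Outcome::Authenticated;
    }
    Outcome::Unauthorized
}
```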

Last reviewed commit: 68cccf9


Copilot AI left a comment

Pull request overview

Copilot reviewed 55 out of 56 changed files in this pull request and generated 4 comments.

Files not reviewed (1)
  • ui/pnpm-lock.yaml: Language not supported

Comment on lines +46 to +53
+ // Link-local (169.254.0.0/16) — blocked unless allow_private
+ // (cloud metadata 169.254.169.254 is always blocked above)
  if v4.is_link_local() {
-     return true;
+     return !opts.allow_private;
  }
- // Cloud metadata endpoint (169.254.169.254)
- if v4 == Ipv4Addr::new(169, 254, 169, 254) {
-     return true; // Always block, even if allow_loopback
+ // Private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)
+ if v4.is_private() {
+     return !opts.allow_private;

Copilot AI Mar 2, 2026

allow_private currently also permits IPv4 link-local addresses (169.254.0.0/16) except for the single metadata IP. Link-local is commonly blocked in SSRF defenses because it can still reach sensitive host/network services; folding it into allow_private makes it easy to over-relax SSRF protections unintentionally. Consider keeping link-local blocked by default even when allow_private is true, or introducing a separate allow_link_local flag with explicit docs and config naming.
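
One way the separate-flag option could look is sketched below. It is not the code in `src/validation/url.rs`: `UrlValidationOptions`, `is_blocked_v4`, and especially the `allow_link_local` field are assumptions made for illustration. Only `std::net::Ipv4Addr` and its `is_loopback`, `is_link_local`, and `is_private` methods are real standard-library APIs.

```rust
use std::net::Ipv4Addr;

/// Hypothetical options struct; `allow_link_local` is the proposed extra flag.
#[derive(Clone, Copy, Default)]
pub struct UrlValidationOptions {
    pub allow_loopback: bool,
    pub allow_private: bool,
    pub allow_link_local: bool,
}

const CLOUD_METADATA: Ipv4Addr = Ipv4Addr::new(169, 254, 169, 254);

/// Returns true when the address should be rejected as an SSRF target.
pub fn is_blocked_v4(v4: Ipv4Addr, opts: UrlValidationOptions) -> bool {
    // The cloud metadata endpoint is blocked regardless of any flag.
    if v4 == CLOUD_METADATA {
        return true;
    }
    if v4.is_loopback() {
        return !opts.allow_loopback;
    }
    // Link-local (169.254.0.0/16) has its own opt-in, separate from allow_private.
    if v4.is_link_local() {
        return !opts.allow_link_local;
    }
    // Private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
    if v4.is_private() {
        return !opts.allow_private;
    }
    false
}
```

Under this sketch, enabling `allow_private` for Docker/K8s leaves `169.254.0.0/16` blocked unless the operator opts in explicitly.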

Comment on lines +601 to +607
let has_credentials = headers
    .contains_key(state.config.auth.api_key_config().header_name.as_str())
    || headers.contains_key(axum::http::header::AUTHORIZATION);
let auth_result = if !state.config.auth.is_auth_enabled() && !has_credentials {
    Err(AuthError::MissingCredentials)
} else {
    try_authenticate(&headers, cookies.as_ref(), connecting_ip, &state).await

Copilot AI Mar 2, 2026

has_credentials checks the configured API key header name, but the later “credentials were provided but invalid” branch only checks literal "X-API-Key" / "Authorization". If a deployment uses a custom header (e.g. "Api-Key"), invalid credentials can be treated as “no credentials” and the request will incorrectly proceed anonymously. Use the configured header name consistently (and/or reuse has_credentials) when deciding whether to reject vs allow anonymous access.
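
A rough sketch of the consistent check, assuming a simplified model: headers are a `HashMap` keyed by lowercase name rather than the real HTTP header map, and `has_credentials`, `Decision`, and `decide` are hypothetical names introduced for illustration.

```rust
use std::collections::HashMap;

/// True when the request carries either the configured API key header
/// or an Authorization header. Keys in `headers` are assumed lowercase.
pub fn has_credentials(headers: &HashMap<String, String>, api_key_header: &str) -> bool {
    headers.contains_key(api_key_header.to_ascii_lowercase().as_str())
        || headers.contains_key("authorization")
}

#[derive(Debug, PartialEq)]
pub enum Decision {
    Authenticated,
    Anonymous,
    Reject,
}

/// Reuse the single `credentials_present` value for the reject-vs-anonymous
/// decision, so a custom header name cannot slip through as "no credentials".
pub fn decide(auth_enabled: bool, credentials_present: bool, auth_ok: bool) -> Decision {
    if auth_ok {
        Decision::Authenticated
    } else if credentials_present || auth_enabled {
        // Credentials were sent but failed, or auth is mandatory.
        Decision::Reject
    } else {
        Decision::Anonymous
    }
}
```

Because the same boolean drives both steps, a deployment that uses a custom header such as `Api-Key` rejects a bad key instead of downgrading the request to anonymous.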

Comment on lines +13 to 14
type = "none"


Copilot AI Mar 2, 2026

This production configuration sets auth.mode.type to "none", which disables all authentication for both API and admin endpoints. If an operator uses this file as-is, the gateway will run in production with completely unauthenticated access, allowing any unauthenticated user to invoke models and access admin functionality. Change auth.mode.type to a secure mode such as "api_key", "idp", or "iap" by default, and reserve "none" strictly for clearly-labeled development-only configs.

Suggested change
- type = "none"
+ type = "api_key"
+ # Note: Do NOT use auth.mode.type = "none" in production. Reserve it for clearly-marked
+ # development-only configs if you need to disable authentication locally.

key_prefix = "gw_"
cache_ttl_secs = 300
[auth.mode]
type = "none"

Copilot AI Mar 2, 2026

This PostgreSQL HA production configuration sets auth.mode.type to "none", fully disabling authentication. Deploying this file as-is would expose all gateway and admin endpoints without any access control, enabling unauthorized access to data and model operations. Update the default to a secure mode like "api_key", "idp", or "iap", and ensure "none" is only used in explicitly development/test configurations.

Suggested change
- type = "none"
+ type = "api_key"

@ScriptSmith ScriptSmith merged commit 6073fbf into main Mar 2, 2026
24 checks passed
@ScriptSmith ScriptSmith deleted the rework-auth branch March 2, 2026 13:15