feat: componentize gateway, ui-backend, and ui as separate services #27557
Greptile Summary
This PR introduces a componentized deployment scaffold that splits the monolithic LiteLLM proxy into three independently scalable services.

Confidence Score: 3/5
Not safe to merge: the migration job will silently skip all schema work and leave the database empty, causing gateway and backend pods to crash at startup.

The migration job template pulls in the shared litellm.serverEnv helper, which unconditionally sets DISABLE_SCHEMA_UPDATE=true. The existing chart at deploy/charts/litellm-helm explicitly overrides this to false in its migration job for exactly this reason. Without that override, ProxyExtrasDBManager.setup_database() respects the env var and skips the schema deploy, the job exits 0, Helm marks the hook successful, and then all gateway and backend pods fail at startup because the Prisma schema was never created. This is a deployment-blocking defect on every fresh install or upgrade.

helm/litellm/templates/migrations-job.yaml needs a DISABLE_SCHEMA_UPDATE=false override after the litellm.serverEnv include; helm/litellm/templates/_helpers.tpl is where the conflicting true value originates.

| Filename | Overview |
|---|---|
| helm/litellm/templates/migrations-job.yaml | Migration Job template inherits DISABLE_SCHEMA_UPDATE=true from litellm.serverEnv but never overrides it to false — the existing chart explicitly sets false for this reason, making this a silent migration skip defect. |
| helm/litellm/templates/_helpers.tpl | Shared env helper correctly assembles DB credentials via discrete vars (avoiding URL encoding issues) and properly injects DISABLE_SCHEMA_UPDATE=true for app pods; masterKey.secretName is now required rather than inline. |
| litellm/proxy/db/db_url_settings.py | New DatabaseURLSettings class cleanly assembles DATABASE_URL from discrete env vars with proper percent-encoding and IAM token minting; reader URL is opt-in and never clobbers pre-existing values. |
| gateway/main.py | Gateway entrypoint wraps proxy_server lifespan correctly (filter runs after startup, before requests) and assembles DATABASE_URL before importing proxy_server. |
| backend/main.py | Backend entrypoint mirrors gateway pattern correctly; lifespan-based route filter runs after startup hooks complete. |
| helm/litellm/templates/ingress.yaml | Ingress correctly handles /test as Exact match to avoid routing MCP management endpoints to gateway; *.txt path for RSC payloads uses ImplementationSpecific (AWS ALB target) as documented. |
| tests/test_litellm/proxy/db/test_db_url_settings.py | Comprehensive unit tests covering IAM and password auth for writer and reader, percent-encoding, no-clobber semantics, and field fallback logic — all using mocks without real network calls. |
Reviews (12): Last reviewed commit: "feat: add componentized proxy deployment..."
PR overview
Componentized gateway, UI backend, and UI services
This PR splits the monolithic proxy app into separate gateway/backend/UI entrypoints, adds Helm routing for those surfaces, and introduces DB URL assembly for direct uvicorn startup paths. I checked the route allowlists, ingress dispatch, service exposure defaults, and credential handling in the database URL helper and did not find an attacker-reachable security regression.
Security review
Risk: 2/10
9905afc to d82e1f5
d82e1f5 to b96d75f
3e560e0 to dc06b28
7998529 to 0e24c00
@greptileai — review comments addressed in 98e07fc:
Could you re-review?
@greptileai — addressed the reader-URL clobber issue in e34640c:
Could you take one more pass? |
e34640c to d5aa1b2
409dda6 to 68197fd
a59281c to 50df072
53ba530 to ee69e96
9a68b29 to dc56942
dc56942 to 91dbcea
    token = rds_iam_token.generate_iam_auth_token(
        db_host=host, db_port=self.database_port, db_user=user
    )
    url = f"postgresql://{user}:{token}@{host}:{self.database_port}/{name}"
IAM URL path doesn't percent-encode user or database name
Medium Severity
In build_writer_url and build_reader_url, the IAM auth path embeds user and name raw into the PostgreSQL URL, while _password_url in the same module correctly percent-encodes them with urllib.parse.quote_plus. If either value contains URL-special characters (@, :, /, #, %), the IAM-assembled URL is malformed and the database connection silently fails or connects to the wrong database.
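The encoding gap described above can be illustrated with a minimal sketch. `build_postgres_url` is a hypothetical helper, not the PR's `build_writer_url`, and it uses `urllib.parse.quote` with `safe=""` where the comment says the module uses `quote_plus`; the two differ only in how they encode spaces. The host name and credentials are made-up values.

```python
from urllib.parse import quote, unquote, urlsplit


def build_postgres_url(user: str, secret: str, host: str, port: int, name: str) -> str:
    """Assemble a PostgreSQL URL with every variable component percent-encoded.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    # safe="" makes quote() escape "/" as well, so "@", ":", "/", "#" and "%"
    # all become %XX sequences instead of acting as URL delimiters.
    enc_user = quote(user, safe="")
    enc_secret = quote(secret, safe="")  # an IAM token or a password
    enc_name = quote(name, safe="")
    return f"postgresql://{enc_user}:{enc_secret}@{host}:{port}/{enc_name}"


# Raw interpolation: the "/" inside the user name ends the authority section
# early, so the parsed host is no longer the database host.
raw = "postgresql://svc/app:token@db.example.com:5432/main"
print(urlsplit(raw).hostname)

# Encoded: the host survives and the user name round-trips through unquote().
url = build_postgres_url("svc/app", "tok?en&x=1", "db.example.com", 5432, "main")
parts = urlsplit(url)
print(parts.hostname, unquote(parts.username))
```

Applying the same encoding step to the IAM path and the password path keeps the two URL builders from diverging.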
Reviewed by Cursor Bugbot for commit 91dbcea5aa6259deb0008142fb4ed743a8410870. Configure here.
      name: {{ include "litellm.gateway.fullname" . }}-config
    data:
      config.yaml: |
    {{ .Values.gateway.config.proxy_config | toYaml | indent 6 }}
ConfigMap renders proxy_config dict value as YAML string
Medium Severity
The toYaml pipe on .Values.gateway.config.proxy_config converts the dict to a YAML string representation and embeds it inside a literal block scalar (|). This means the config.yaml key holds a string of YAML text, not a structured YAML mapping. When the proxy reads this mounted file, it gets valid YAML text that parses fine. However, if proxy_config is an empty map {} (the default), the file content is the literal string {} — which is fine. The real problem is that gateway.config.create defaults to true and proxy_config defaults to {}, so every installation creates this ConfigMap, mounts it, and sets CONFIG_FILE_PATH in the gateway deployment, even when the user has no config to provide. This forces the proxy to load an empty config file on every gateway pod, which could override or conflict with database-stored configuration depending on how the proxy merges config sources.
Reviewed by Cursor Bugbot for commit 91dbcea5aa6259deb0008142fb4ed743a8410870. Configure here.
…nd migrations

Split the monolithic LiteLLM proxy into independently scalable Kubernetes components to allow separate horizontal scaling of the LLM data plane and management API surfaces.

- Add DatabaseURLSettings pydantic-settings model that assembles DATABASE_URL (and optional DATABASE_URL_READ_REPLICA) from discrete DATABASE_* env vars before Prisma initializes, supporting both IAM token auth (minting short-lived RDS tokens) and password auth; replaces the CLI-only path that componentized entrypoints bypass
- Add gateway component (port 4000) that trims the proxy route table to the LLM data-plane surface (chat, embeddings, completions, audio, realtime, provider passthroughs, health/metrics) via an allowlist applied inside the lifespan context so plugin-registered routes are captured
- Add backend component (port 4001) that exposes the management/admin surface (keys, users, teams, orgs, spend analytics, model management, SSO, audit logs) with a complementary allowlist
- Add ui component: Next.js static export served by nginx (port 3000) with RSC payload routing, asset prefix aliasing, and SPA fallback for dashboard routes
- Add migrations component with dedicated Dockerfile that runs prisma migrate deploy via a Helm pre-install/pre-upgrade Job, eliminating per-pod schema contention on the Prisma advisory lock
- Add Helm chart (helm/litellm) with separate Deployments, Services, HPAs, and ConfigMap for each component; shared _helpers.tpl emits DATABASE_*, IAM_TOKEN_DB_AUTH, REDIS_*, and DISABLE_SCHEMA_UPDATE env vars from chart values; ingress template routes traffic to the correct component by path prefix
- Add comprehensive tests for DatabaseURLSettings covering IAM auth, password auth, read replica fallbacks, operator-pinned URL preservation, and percent-encoding; add coverage test asserting gateway + backend allowlist union equals the full proxy route set
- Add pydantic-settings>=2.14.1 as a proxy extra dependency and update liccheck allowlist
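
The allowlist filtering and the union-coverage check listed above can be sketched in plain Python. The prefixes, route paths, and function names below are hypothetical and much shorter than whatever the PR ships; the sketch only shows the shape of the check, not the real gateway or backend code.

```python
from typing import Iterable

# Hypothetical prefixes for illustration; the real allowlists are longer.
GATEWAY_PREFIXES = ("/v1/chat", "/v1/embeddings", "/health", "/metrics")
BACKEND_PREFIXES = ("/key", "/user", "/team", "/spend", "/health")


def allowed(path: str, prefixes: Iterable[str]) -> bool:
    """Return True if path equals a prefix or sits beneath it."""
    return any(path == p or path.startswith(p + "/") for p in prefixes)


def filter_routes(paths: Iterable[str], prefixes: Iterable[str]) -> list[str]:
    """Keep only allowlisted paths, preserving their original order."""
    prefixes = tuple(prefixes)
    return [p for p in paths if allowed(p, prefixes)]


def uncovered(paths: Iterable[str], *allowlists: Iterable[str]) -> list[str]:
    """Return the paths that no allowlist claims, sorted for stable output."""
    return sorted(p for p in paths if not any(allowed(p, al) for al in allowlists))


# Stand-in for the full proxy route table after startup hooks have run.
all_routes = [
    "/v1/chat/completions",
    "/v1/embeddings",
    "/key/generate",
    "/team/new",
    "/health",
    "/metrics",
]

gateway = filter_routes(all_routes, GATEWAY_PREFIXES)
backend = filter_routes(all_routes, BACKEND_PREFIXES)

# Coverage check: every route must be served by at least one component.
assert set(gateway) | set(backend) == set(all_routes)
```

A route that matches neither list would show up in `uncovered(...)`, which is the condition a coverage test of this kind is meant to catch before a path becomes unreachable in the split deployment.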
91dbcea to 0691008
              env:
                {{- include "litellm.serverEnv" (dict "root" $ "component" .Values.migrationJob) | nindent 12 }}
              {{- with .Values.migrationJob.resources }}
              resources:
Migration job inherits DISABLE_SCHEMA_UPDATE=true, likely silently skipping all migrations
The litellm.serverEnv helper always emits DISABLE_SCHEMA_UPDATE=true (line 196 of _helpers.tpl). The existing Helm chart at deploy/charts/litellm-helm/templates/migrations-job.yaml explicitly overrides this with DISABLE_SCHEMA_UPDATE=false for the migration job, with the comment "always run the migration from the Helm PreSync hook, override the value set". That override exists precisely because ProxyExtrasDBManager.setup_database() (called by migrations/run.py) respects this env var. Without it, the job will call setup_database(use_migrate=True) while DISABLE_SCHEMA_UPDATE=true is set, causing the schema update to be skipped. The job exits 0, Helm reports the hook as successful, and then every gateway and backend pod crashes on startup with a Prisma table-not-found error. Add - name: DISABLE_SCHEMA_UPDATE / value: "false" after the litellm.serverEnv block in the migration container's env to match the existing chart pattern.
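The failure mode described above, where a skipped migration still exits 0, can be illustrated with a small sketch. Every function here is a hypothetical stand-in: `setup_database` is not ProxyExtrasDBManager.setup_database(), and the way the flag is parsed is an assumption. The second function shows one defensive option, making the job fail loudly when it is told to skip.

```python
import os
from typing import Mapping


def schema_update_disabled(env: Mapping[str, str] = os.environ) -> bool:
    """Assumed parsing: any casing of "true" disables schema updates."""
    return env.get("DISABLE_SCHEMA_UPDATE", "false").strip().lower() == "true"


def setup_database(env: Mapping[str, str] = os.environ) -> str:
    """Hypothetical stand-in for a migration step that honors the flag."""
    if schema_update_disabled(env):
        # Returns normally, so a naive job wrapper would exit 0.
        return "skipped"
    return "migrated"


def run_migration_job(env: Mapping[str, str] = os.environ) -> int:
    """Return a process exit code; non-zero if the job was told to skip."""
    if setup_database(env) == "skipped":
        print("DISABLE_SCHEMA_UPDATE=true inside the migration job; not reporting success")
        return 1
    return 0


# Inherited from the shared helper: the job would do no schema work.
print(run_migration_job({"DISABLE_SCHEMA_UPDATE": "true"}))
# With the override suggested in the review comment: the job migrates.
print(run_migration_job({"DISABLE_SCHEMA_UPDATE": "false"}))
```

A guard like this complements the env override rather than replacing it: the override fixes the chart, while the non-zero exit would make Helm report the hook as failed if the flag ever leaks into the job again.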
Cursor Bugbot has reviewed your changes using high mode and found 1 potential issue.
There are 3 total unresolved issues (including 2 from previous reviews).
Reviewed by Cursor Bugbot for commit 0691008. Configure here.
      envSecrets: [] # Add extra environment variables to the gateway from secrets
      config:
        create: true
        proxy_config: {}
Gateway ConfigMap created with empty config by default
Medium Severity
The defaults gateway.config.create: true and gateway.config.proxy_config: {} cause every fresh install to mount an empty config.yaml and set CONFIG_FILE_PATH. This means the gateway always starts with an empty proxy configuration (no model_list, no litellm_settings), potentially surprising users who expect config-less behavior (where the proxy reads only from the DB). Users unaware of this would need to explicitly set config.create: false to avoid the empty config file being injected.
Reviewed by Cursor Bugbot for commit 0691008. Configure here.
014cb8f into litellm_internal_staging


Adds gateway/, backend/, ui/Dockerfile, and chart/helm/litellm/ as an additive scaffold for splitting the monolithic proxy + UI into three independently scalable services. New entrypoints reuse the existing FastAPI app via route allowlists (no edits to existing modules); the new Helm chart gives each component its own Deployment, Service, and HPA with no bundled subcharts.
Relevant issues
Linear ticket
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
make test-unit
@greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review
Delays in PR merge?
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Screenshots / Proof of Fix
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes
Note
Medium Risk
Introduces new runtime entrypoints, route allowlists, and Helm ingress/routing that can break API/UI availability if any path is misclassified, plus new DB URL assembly/migration job wiring that impacts startup and schema management.
Overview
Adds a componentized deployment scaffold that runs the existing litellm.proxy.proxy_server as two separate FastAPI processes: gateway.main (LLM data-plane only) and backend.main (management/UI API only), each trimming the route table at startup using explicit allowlists.

Introduces new container images (gateway, backend, ui, and migrations) and a new helm/litellm chart that deploys each as its own Deployment/Service/HPA, optionally fronted by an Ingress that routes UI paths to nginx, data-plane prefixes to the gateway, and all remaining API traffic to the backend.

Adds DatabaseURLSettings to assemble DATABASE_URL (and optional read-replica URL) from discrete env vars with IAM-token and percent-encoding support, and wires a pre-install/pre-upgrade migrations Job to run prisma migrate deploy while disabling schema updates in app pods; includes tests to validate DB URL assembly and ensure gateway+backend allowlists cover all proxy routes.

Reviewed by Cursor Bugbot for commit 0691008. Bugbot is set up for automated code reviews on this repo. Configure here.
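
The no-clobber assembly of DATABASE_URL from discrete variables can be sketched as follows. This is a simplified illustration of the password-auth path only, not the PR's DatabaseURLSettings class: the variable names DATABASE_HOST, DATABASE_PORT, DATABASE_USERNAME, DATABASE_PASSWORD, and DATABASE_NAME are assumptions based on the DATABASE_* pattern mentioned above, and the function name is hypothetical.

```python
from typing import Mapping
from urllib.parse import quote

# Assumed variable names; the source only says "discrete DATABASE_* env vars".
REQUIRED = ("DATABASE_HOST", "DATABASE_USERNAME", "DATABASE_PASSWORD", "DATABASE_NAME")


def assemble_database_url(env: Mapping[str, str]) -> dict:
    """Return a copy of env with DATABASE_URL filled in when it is unset."""
    out = dict(env)
    if out.get("DATABASE_URL"):
        # An operator-pinned URL always wins; never overwrite it.
        return out
    if not all(out.get(key) for key in REQUIRED):
        # Not enough pieces to build a URL; leave the environment untouched.
        return out
    user = quote(out["DATABASE_USERNAME"], safe="")
    password = quote(out["DATABASE_PASSWORD"], safe="")
    name = quote(out["DATABASE_NAME"], safe="")
    port = out.get("DATABASE_PORT", "5432")
    out["DATABASE_URL"] = f"postgresql://{user}:{password}@{out['DATABASE_HOST']}:{port}/{name}"
    return out


discrete = {
    "DATABASE_HOST": "db.internal",
    "DATABASE_USERNAME": "app",
    "DATABASE_PASSWORD": "p@ss:w/rd",
    "DATABASE_NAME": "litellm",
}
assembled = assemble_database_url(discrete)
pinned = assemble_database_url({**discrete, "DATABASE_URL": "postgresql://pinned"})
print(assembled["DATABASE_URL"])
print(pinned["DATABASE_URL"])
```

Returning a copy instead of mutating `os.environ` keeps the sketch easy to test; an entrypoint that runs before Prisma initializes would presumably write the result back to the process environment.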