feat: componentize gateway, ui-backend, and ui as separate services #27557

Merged: yassin-berriai merged 1 commit into litellm_internal_staging from litellm_fix/componentization on May 16, 2026

yassin-berriai commented May 9, 2026 (Contributor)

Adds gateway/, backend/, ui/Dockerfile, and chart/helm/litellm/ as an additive scaffold for splitting the monolithic proxy + UI into three independently scalable services. New entrypoints reuse the existing FastAPI app via route allowlists (no edits to existing modules); the new Helm chart gives each component its own Deployment, Service, and HPA with no bundled subcharts.

Relevant issues

Linear ticket

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement (see details)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Screenshots / Proof of Fix

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes


Note

Medium Risk
Introduces new runtime entrypoints, route allowlists, and Helm ingress/routing that can break API/UI availability if any path is misclassified, plus new DB URL assembly/migration job wiring that impacts startup and schema management.

Overview
Adds a componentized deployment scaffold that runs the existing litellm.proxy.proxy_server as two separate FastAPI processes: gateway.main (LLM data-plane only) and backend.main (management/UI API only), each trimming the route table at startup using explicit allowlists.

Introduces new container images (gateway, backend, ui, and migrations) and a new helm/litellm chart that deploys each as its own Deployment/Service/HPA, optionally fronted by an Ingress that routes UI paths to nginx, data-plane prefixes to the gateway, and all remaining API traffic to the backend.

Adds DatabaseURLSettings to assemble DATABASE_URL (and optional read-replica URL) from discrete env vars with IAM-token and percent-encoding support, and wires a pre-install/pre-upgrade migrations Job to run prisma migrate deploy while disabling schema updates in app pods; includes tests to validate DB URL assembly and ensure gateway+backend allowlists cover all proxy routes.

Reviewed by Cursor Bugbot for commit 0691008.

CLAassistant commented May 9, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

codecov Bot commented May 9, 2026

Codecov Report

❌ Patch coverage is 96.80851% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| litellm/proxy/db/db_url_settings.py | 96.80% | 3 Missing ⚠️ |


greptile-apps Bot commented May 9, 2026 (Contributor)

Greptile Summary

This PR introduces a componentized deployment scaffold that splits the monolithic LiteLLM proxy into three independently scalable services — gateway (LLM data-plane), backend (management/admin API), and ui (nginx-served Next.js static export) — each running as a separate container with its own Helm Deployment, Service, and HPA. A new DatabaseURLSettings class assembles DATABASE_URL (and optional read-replica URL) from discrete env vars with proper percent-encoding and IAM token support, replacing the CLI-only assembly path.

  • Gateway and backend entrypoints reuse litellm.proxy.proxy_server:app and trim its route table inside a lifespan wrapper (correctly running after startup hooks), with comprehensive allowlists validated by a union-coverage test.
  • Helm chart (helm/litellm/) wires all three components via a shared litellm.serverEnv helper that emits discrete DB vars (never assembles DATABASE_URL itself, avoiding URL-encoding corruption) and a pre-install,pre-upgrade migrations Job using the dedicated litellm-migrations image.
  • One defect found: the litellm.serverEnv helper always emits DISABLE_SCHEMA_UPDATE=true, and the migration Job template consumes this helper without overriding the value to false. The existing chart (deploy/charts/litellm-helm) explicitly sets DISABLE_SCHEMA_UPDATE=false for its migration job with the comment "always run the migration from the Helm PreSync hook, override the value set" — indicating this override is required. Without it, ProxyExtrasDBManager.setup_database() skips all schema work, the job exits 0, and every gateway/backend pod crashes on startup with Prisma table-not-found errors.

Confidence Score: 3/5

Not safe to merge: the migration job will silently skip all schema work and leave the database empty, causing gateway and backend pods to crash at startup.

The migration job template pulls in the shared litellm.serverEnv helper which unconditionally sets DISABLE_SCHEMA_UPDATE=true. The existing chart at deploy/charts/litellm-helm explicitly overrides this to false in its migration job for exactly this reason. Without that override, ProxyExtrasDBManager.setup_database() respects the env var and skips the schema deploy, the job exits 0, Helm marks the hook successful, and then all gateway and backend pods fail at startup because the Prisma schema was never created. This is a deployment-blocking defect on every fresh install or upgrade.

helm/litellm/templates/migrations-job.yaml needs a DISABLE_SCHEMA_UPDATE=false override after the litellm.serverEnv include; helm/litellm/templates/_helpers.tpl is where the conflicting true value originates.

Important Files Changed

| Filename | Overview |
| --- | --- |
| helm/litellm/templates/migrations-job.yaml | Migration Job template inherits DISABLE_SCHEMA_UPDATE=true from litellm.serverEnv but never overrides it to false. The existing chart explicitly sets false for this reason, making this a silent migration skip defect. |
| helm/litellm/templates/_helpers.tpl | Shared env helper correctly assembles DB credentials via discrete vars (avoiding URL encoding issues) and properly injects DISABLE_SCHEMA_UPDATE=true for app pods; masterKey.secretName is now required rather than inline. |
| litellm/proxy/db/db_url_settings.py | New DatabaseURLSettings class cleanly assembles DATABASE_URL from discrete env vars with proper percent-encoding and IAM token minting; reader URL is opt-in and never clobbers pre-existing values. |
| gateway/main.py | Gateway entrypoint wraps proxy_server lifespan correctly (filter runs after startup, before requests) and assembles DATABASE_URL before importing proxy_server. |
| backend/main.py | Backend entrypoint mirrors gateway pattern correctly; lifespan-based route filter runs after startup hooks complete. |
| helm/litellm/templates/ingress.yaml | Ingress correctly handles /test as Exact match to avoid routing MCP management endpoints to gateway; *.txt path for RSC payloads uses ImplementationSpecific (AWS ALB target) as documented. |
| tests/test_litellm/proxy/db/test_db_url_settings.py | Comprehensive unit tests covering IAM and password auth for writer and reader, percent-encoding, no-clobber semantics, and field fallback logic, all using mocks without real network calls. |
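
The ingress.yaml row above relies on the difference between Exact and Prefix path types. As a rough illustration only, the sketch below models that difference in plain Python under the Kubernetes Ingress rules (Prefix matches on whole path elements; the longest matching path wins, with Exact breaking ties). Apart from /test, every path and the rule set itself are hypothetical and are not taken from the chart; ImplementationSpecific paths are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    path: str
    path_type: str  # "Exact" or "Prefix"
    backend: str


def matches(rule: Rule, request_path: str) -> bool:
    """Return True if the rule matches the request path."""
    if rule.path_type == "Exact":
        return request_path == rule.path
    # Prefix matching works on whole path elements: "/v1" matches
    # "/v1" and "/v1/x" but not "/v1beta".
    prefix = rule.path.rstrip("/")
    return request_path == prefix or request_path.startswith(prefix + "/")


def route(rules: list[Rule], request_path: str, default: str = "backend") -> str:
    """Pick a backend: longest matching path first, Exact breaks ties."""
    candidates = [r for r in rules if matches(r, request_path)]
    if not candidates:
        return default
    best = max(candidates, key=lambda r: (len(r.path), r.path_type == "Exact"))
    return best.backend


# Hypothetical rule set: only the Exact /test entry comes from the review text.
RULES = [
    Rule("/test", "Exact", "gateway"),
    Rule("/v1", "Prefix", "gateway"),
    Rule("/", "Prefix", "backend"),
]

for path in ["/test", "/test/tools", "/v1/chat/completions", "/v1beta/x"]:
    print(path, "->", route(RULES, path))
```

With an Exact rule, only /test itself reaches the gateway; anything nested under it falls through to the catch-all backend rule, which is the behavior the review row describes.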

Reviews (12): Last reviewed commit: "feat: add componentized proxy deployment..."

Comment thread gateway/Dockerfile
Comment thread gateway/main.py
Comment thread gateway/routes/allowlist.py Outdated
veria-ai Bot commented May 9, 2026 (Contributor)

PR overview

Componentized gateway, UI backend, and UI services

This PR splits the monolithic proxy app into separate gateway/backend/UI entrypoints, adds Helm routing for those surfaces, and introduces DB URL assembly for direct uvicorn startup paths. I checked the route allowlists, ingress dispatch, service exposure defaults, and credential handling in the database URL helper and did not find an attacker-reachable security regression.

Security review

  • No new security issues were flagged in the latest review.
  • No review issues remain open on this pull request.

Risk: 2/10

@yassin-berriai yassin-berriai marked this pull request as draft May 9, 2026 23:13
@yassin-berriai yassin-berriai force-pushed the litellm_fix/componentization branch from 9905afc to d82e1f5 Compare May 11, 2026 18:18
yassin-berriai commented (Contributor, Author)

@greptileai

Comment thread helm/litellm/templates/ingress.yaml Outdated
@yassin-berriai yassin-berriai force-pushed the litellm_fix/componentization branch from d82e1f5 to b96d75f Compare May 11, 2026 19:31
yassin-berriai commented (Contributor, Author)

@greptileai

Comment thread gateway/Dockerfile Outdated
Comment thread tests/test_litellm/proxy/test_component_allowlists.py
@yassin-berriai yassin-berriai force-pushed the litellm_fix/componentization branch from 3e560e0 to dc06b28 Compare May 11, 2026 19:42
yassin-berriai commented (Contributor, Author)

@greptileai

@yassin-berriai yassin-berriai changed the title from "scaffold: componentize gateway, ui-backend, and ui as separate services" to "feat: componentize gateway, ui-backend, and ui as separate services" May 11, 2026
Comment thread helm/litellm/templates/migrations-job.yaml Outdated
@yassin-berriai yassin-berriai force-pushed the litellm_fix/componentization branch 6 times, most recently from 7998529 to 0e24c00 Compare May 13, 2026 06:09
yassin-berriai commented (Contributor, Author)

@greptileai — review comments addressed in 98e07fc:

  • Migration job hook: post-install,post-upgrade → pre-install,pre-upgrade (schema lands before pods roll out)
  • Component route filter: moved out of module scope into a wrapped lifespan context so it runs after proxy_server startup hooks, catching any dynamically-registered routes (applied to both gateway/main.py and backend/main.py)
  • Component allowlist test: DATABASE_URL + LITELLM_MASTER_KEY pinned before proxy_server import so the test runs without a live DB
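
The second point, running the route filter inside a wrapped lifespan rather than at module scope, can be sketched as follows. This is a minimal, framework-free illustration: FakeApp, Route, original_lifespan, and the prefixes are hypothetical stand-ins, not the code in gateway/main.py.

```python
import asyncio
from contextlib import asynccontextmanager
from dataclasses import dataclass, field


@dataclass
class Route:
    path: str


@dataclass
class FakeApp:
    # Stand-in for an ASGI app that exposes a mutable route list.
    routes: list = field(default_factory=list)


@asynccontextmanager
async def original_lifespan(app):
    # Stand-in for the existing startup logic, which may register
    # routes dynamically (for example from plugins).
    app.routes.append(Route("/plugin/late"))
    yield


def wrap_lifespan(inner, allowed_prefixes: tuple):
    """Run the inner lifespan first, then trim routes to the allowlist."""

    @asynccontextmanager
    async def wrapped(app):
        async with inner(app):
            # Startup hooks have completed, so late-registered routes
            # are visible to the filter before any request is served.
            app.routes[:] = [
                r for r in app.routes if r.path.startswith(allowed_prefixes)
            ]
            yield

    return wrapped


async def demo() -> list:
    app = FakeApp(routes=[Route("/v1/chat/completions"), Route("/key/generate")])
    lifespan = wrap_lifespan(original_lifespan, ("/v1/", "/health"))
    async with lifespan(app):
        return [r.path for r in app.routes]


served = asyncio.run(demo())
print(served)
```

A filter applied at import time would never see /plugin/late, because that route is only added during startup; filtering inside the wrapper closes that gap.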

Could you re-review?

Comment thread litellm/proxy/auth/rds_iam_token.py Outdated
yassin-berriai commented (Contributor, Author)

@greptileai — addressed the reader-URL clobber issue in e34640c:

init_iam_db_url_from_env() now only mints DATABASE_URL_READ_REPLICA when (a) DATABASE_HOST_READ_REPLICA is set AND (b) DATABASE_URL_READ_REPLICA is not already set. A pre-pinned reader URL — non-IAM or precomputed — is left untouched. Behavior is locked in by 7 new tests in tests/test_litellm/proxy/auth/test_rds_iam_token.py.
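
The two conditions can be modeled roughly as below. This is a sketch of the described no-clobber rule only: mint_reader_url and build_url are hypothetical names, and the env mapping stands in for os.environ; it is not the body of init_iam_db_url_from_env().

```python
from typing import Callable, Mapping


def mint_reader_url(
    env: Mapping[str, str], build_url: Callable[[str], str]
) -> dict:
    """Return a copy of env with the reader URL set only when allowed."""
    result = dict(env)
    host = env.get("DATABASE_HOST_READ_REPLICA")
    if not host:
        # (a) No reader host configured: nothing to mint.
        return result
    if env.get("DATABASE_URL_READ_REPLICA"):
        # (b) A reader URL is already pinned: leave it untouched.
        return result
    result["DATABASE_URL_READ_REPLICA"] = build_url(host)
    return result


def fake_builder(host: str) -> str:
    # Hypothetical builder; the real one would mint an IAM token.
    return f"postgresql://reader@{host}:5432/litellm"


pinned = {
    "DATABASE_HOST_READ_REPLICA": "ro.example.com",
    "DATABASE_URL_READ_REPLICA": "postgresql://pinned",
}
print(mint_reader_url(pinned, fake_builder)["DATABASE_URL_READ_REPLICA"])
print(mint_reader_url({"DATABASE_HOST_READ_REPLICA": "ro.example.com"}, fake_builder))
print(mint_reader_url({}, fake_builder))
```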

Could you take one more pass?

@yassin-berriai yassin-berriai force-pushed the litellm_fix/componentization branch from e34640c to d5aa1b2 Compare May 14, 2026 19:37
Comment thread litellm/proxy/auth/rds_iam_token.py Outdated
@yassin-berriai yassin-berriai force-pushed the litellm_fix/componentization branch 2 times, most recently from 409dda6 to 68197fd Compare May 14, 2026 21:40
@yassin-berriai yassin-berriai force-pushed the litellm_fix/componentization branch from a59281c to 50df072 Compare May 15, 2026 20:43
@yassin-berriai yassin-berriai force-pushed the litellm_fix/componentization branch 2 times, most recently from 53ba530 to ee69e96 Compare May 16, 2026 00:50
yassin-berriai commented (Contributor, Author)

@greptileai

Comment thread helm/litellm/templates/ingress.yaml
yassin-berriai commented (Contributor, Author)

@greptileai

@yassin-berriai yassin-berriai force-pushed the litellm_fix/componentization branch from 9a68b29 to dc56942 Compare May 16, 2026 04:52
yassin-berriai commented (Contributor, Author)

@greptileai

Comment thread helm/litellm/templates/_helpers.tpl
Comment thread gateway/main.py
Comment thread helm/litellm/templates/ingress.yaml
Comment thread helm/litellm/templates/ingress.yaml
@yassin-berriai yassin-berriai marked this pull request as ready for review May 16, 2026 05:13
@yassin-berriai yassin-berriai force-pushed the litellm_fix/componentization branch from dc56942 to 91dbcea Compare May 16, 2026 05:19
@yassin-berriai yassin-berriai enabled auto-merge (squash) May 16, 2026 05:20
yassin-berriai commented (Contributor, Author)

@greptileai

token = rds_iam_token.generate_iam_auth_token(
    db_host=host, db_port=self.database_port, db_user=user
)
url = f"postgresql://{user}:{token}@{host}:{self.database_port}/{name}"

IAM URL path doesn't percent-encode user or database name

Medium Severity

In build_writer_url and build_reader_url, the IAM auth path embeds user and name raw into the PostgreSQL URL, while _password_url in the same module correctly percent-encodes them with urllib.parse.quote_plus. If either value contains URL-special characters (@, :, /, #, %), the IAM-assembled URL is malformed and the database connection silently fails or connects to the wrong database.
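
One way the encoding could be applied is sketched below. The function name iam_url and all sample values are hypothetical, and encoding the token as well is an assumption made here for safety; this is not the patch that landed.

```python
from urllib.parse import quote_plus, unquote_plus, urlsplit


def iam_url(user: str, token: str, host: str, port: int, name: str) -> str:
    """Assemble a PostgreSQL URL with every variable component encoded."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(token)}"
        f"@{host}:{port}/{quote_plus(name)}"
    )


# Sample values chosen to contain URL-special characters.
user, token, name = "team/app@corp", "tok?en=1&x", "sales#1"

raw = f"postgresql://{user}:{token}@db.example.com:5432/{name}"
url = iam_url(user, token, "db.example.com", 5432, name)

# The raw URL is cut at the first "/", so the parser sees the wrong host.
print(urlsplit(raw).hostname)

# The encoded URL keeps host, port, user, and database name intact.
parts = urlsplit(url)
print(parts.hostname, parts.port)
print(unquote_plus(parts.username), unquote_plus(parts.path.lstrip("/")))
```

In the raw form the "/" inside the user name ends the authority section early, which is the kind of silent misparse the comment warns about.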

Reviewed by Cursor Bugbot for commit 91dbcea5aa6259deb0008142fb4ed743a8410870.

  name: {{ include "litellm.gateway.fullname" . }}-config
data:
  config.yaml: |
{{ .Values.gateway.config.proxy_config | toYaml | indent 6 }}

ConfigMap renders proxy_config dict value as YAML string

Medium Severity

The toYaml pipe on .Values.gateway.config.proxy_config converts the dict to a YAML string representation and embeds it inside a literal block scalar (|). This means the config.yaml key holds a string of YAML text, not a structured YAML mapping. When the proxy reads this mounted file, it gets valid YAML text that parses fine. However, if proxy_config is an empty map {} (the default), the file content is the literal string {} — which is fine. The real problem is that gateway.config.create defaults to true and proxy_config defaults to {}, so every installation creates this ConfigMap, mounts it, and sets CONFIG_FILE_PATH in the gateway deployment, even when the user has no config to provide. This forces the proxy to load an empty config file on every gateway pod, which could override or conflict with database-stored configuration depending on how the proxy merges config sources.

Reviewed by Cursor Bugbot for commit 91dbcea5aa6259deb0008142fb4ed743a8410870.

…nd migrations

Split the monolithic LiteLLM proxy into independently scalable Kubernetes components to allow separate horizontal scaling of the LLM data plane and management API surfaces.

- Add DatabaseURLSettings pydantic-settings model that assembles DATABASE_URL (and optional DATABASE_URL_READ_REPLICA) from discrete DATABASE_* env vars before Prisma initializes, supporting both IAM token auth (minting short-lived RDS tokens) and password auth; replaces the CLI-only path that componentized entrypoints bypass
- Add gateway component (port 4000) that trims the proxy route table to the LLM data-plane surface (chat, embeddings, completions, audio, realtime, provider passthroughs, health/metrics) via an allowlist applied inside the lifespan context so plugin-registered routes are captured
- Add backend component (port 4001) that exposes the management/admin surface (keys, users, teams, orgs, spend analytics, model management, SSO, audit logs) with a complementary allowlist
- Add ui component — Next.js static export served by nginx (port 3000) with RSC payload routing, asset prefix aliasing, and SPA fallback for dashboard routes
- Add migrations component with dedicated Dockerfile that runs prisma migrate deploy via a Helm pre-install/pre-upgrade Job, eliminating per-pod schema contention on the Prisma advisory lock
- Add Helm chart (helm/litellm) with separate Deployments, Services, HPAs, and ConfigMap for each component; shared _helpers.tpl emits DATABASE_*, IAM_TOKEN_DB_AUTH, REDIS_*, and DISABLE_SCHEMA_UPDATE env vars from chart values; ingress template routes traffic to the correct component by path prefix
- Add comprehensive tests for DatabaseURLSettings covering IAM auth, password auth, read replica fallbacks, operator-pinned URL preservation, and percent-encoding; add coverage test asserting gateway + backend allowlist union equals the full proxy route set
- Add pydantic-settings>=2.14.1 as a proxy extra dependency and update liccheck allowlist
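
The union-coverage idea from the testing bullet can be illustrated with plain sets. This is only a sketch of the invariant, assuming exact-path allowlists; the route names and the partition_routes helper are hypothetical and do not mirror the real test in tests/test_litellm/.

```python
def partition_routes(all_paths, gateway_allow, backend_allow):
    """Split registered paths by allowlist and report anything unclaimed."""
    paths = set(all_paths)
    gateway = paths & set(gateway_allow)
    backend = paths & set(backend_allow)
    missing = paths - gateway - backend
    return gateway, backend, missing


# Hypothetical route table and allowlists.
ALL_ROUTES = ["/v1/chat/completions", "/v1/embeddings", "/key/generate", "/health"]
GATEWAY_ALLOW = {"/v1/chat/completions", "/v1/embeddings", "/health"}
BACKEND_ALLOW = {"/key/generate", "/health"}

gateway, backend, missing = partition_routes(ALL_ROUTES, GATEWAY_ALLOW, BACKEND_ALLOW)
print(sorted(missing))  # empty when every route is claimed by a component

# A route added to the proxy without updating either allowlist is flagged.
_, _, missing_after = partition_routes(
    ALL_ROUTES + ["/new/feature"], GATEWAY_ALLOW, BACKEND_ALLOW
)
print(sorted(missing_after))
```

A check of this shape fails as soon as a new proxy route is not classified, which is what keeps a path from silently disappearing from both services.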
Comment thread helm/litellm/templates/_helpers.tpl Outdated
@yassin-berriai yassin-berriai force-pushed the litellm_fix/componentization branch from 91dbcea to 0691008 Compare May 16, 2026 05:53
yassin-berriai commented (Contributor, Author)

@greptileai

@yassin-berriai yassin-berriai disabled auto-merge May 16, 2026 05:53
          env:
            {{- include "litellm.serverEnv" (dict "root" $ "component" .Values.migrationJob) | nindent 12 }}
          {{- with .Values.migrationJob.resources }}
          resources:

P1 Migration job inherits DISABLE_SCHEMA_UPDATE=true, likely silently skipping all migrations

The litellm.serverEnv helper always emits DISABLE_SCHEMA_UPDATE=true (line 196 of _helpers.tpl). The existing Helm chart at deploy/charts/litellm-helm/templates/migrations-job.yaml explicitly overrides this with DISABLE_SCHEMA_UPDATE=false for the migration job, with the comment "always run the migration from the Helm PreSync hook, override the value set". That override exists precisely because ProxyExtrasDBManager.setup_database() (called by migrations/run.py) respects this env var. Without it, the job will call setup_database(use_migrate=True) while DISABLE_SCHEMA_UPDATE=true is set, causing the schema update to be skipped. The job exits 0, Helm reports the hook as successful, and then every gateway and backend pod crashes on startup with a Prisma table-not-found error. Add - name: DISABLE_SCHEMA_UPDATE / value: "false" after the litellm.serverEnv block in the migration container's env to match the existing chart pattern.
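
The failure mode described here can be modeled in a few lines. The sketch below is a stand-in only: run_migration_job and schema_update_disabled are hypothetical, the way the flag is parsed is an assumption, and none of this is the real ProxyExtrasDBManager.setup_database() logic.

```python
def schema_update_disabled(env: dict) -> bool:
    # Assumption: the flag is treated as a case-insensitive "true".
    return env.get("DISABLE_SCHEMA_UPDATE", "").strip().lower() == "true"


def run_migration_job(env: dict, deploy) -> str:
    """Model of a migration job that honors DISABLE_SCHEMA_UPDATE."""
    if schema_update_disabled(env):
        # No schema work happens, yet the job still finishes cleanly.
        return "skipped"
    deploy()
    return "migrated"


calls = []

# Env as emitted by a shared helper for every component.
shared_env = {"DATABASE_HOST": "db.example.com", "DISABLE_SCHEMA_UPDATE": "true"}

# Env for the migration job with the flag explicitly overridden.
job_env = {**shared_env, "DISABLE_SCHEMA_UPDATE": "false"}

inherited_result = run_migration_job(shared_env, lambda: calls.append("deploy"))
overridden_result = run_migration_job(job_env, lambda: calls.append("deploy"))
print(inherited_result, overridden_result, calls)
```

The inherited environment produces a clean exit with no deploy call, which is why the hook can look successful while the schema is never created.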

cursor Bot left a comment

Cursor Bugbot has reviewed your changes using high mode and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).

Reviewed by Cursor Bugbot for commit 0691008.

Comment thread helm/litellm/values.yaml
  envSecrets: [] # Add extra environment variables to the gateway from secrets
  config:
    create: true
    proxy_config: {}

Gateway ConfigMap created with empty config by default

Medium Severity

The defaults gateway.config.create: true and gateway.config.proxy_config: {} cause every fresh install to mount an empty config.yaml and set CONFIG_FILE_PATH. This means the gateway always starts with an empty proxy configuration (no model_list, no litellm_settings), potentially surprising users who expect config-less behavior (where the proxy reads only from the DB). Users unaware of this would need to explicitly set config.create: false to avoid the empty config file being injected.

Reviewed by Cursor Bugbot for commit 0691008.

@yassin-berriai yassin-berriai merged commit 014cb8f into litellm_internal_staging May 16, 2026
113 of 116 checks passed
@yassin-berriai yassin-berriai deleted the litellm_fix/componentization branch May 16, 2026 16:25
5 participants