chore(release): backport the 1.84.8 patch set + deps bump to stable/1.85.x and cut 1.85.6 by yuneng-berri · Pull Request #30404 · BerriAI/litellm

yuneng-berri · 2026-06-14T00:04:19Z

Relevant issues

Backports the patch set that was just cut into 1.84.8 onto the 1.85.x line, plus a dependency bump that is scoped to 1.85 and newer, and cuts 1.85.6. The 1.85.x line branched from staging on 2026-05-13, before any of these merged, so it carried none of them; 1.84.x already has the fix set as of 1.84.8, so this keeps an upgrade from 1.84.8 to 1.85.x monotonic. Every code commit here is a git cherry-pick -x of a commit reachable from litellm_internal_staging; the version bump and uv.lock refresh are the only non-pick commits.

What is included

In staging-merge order:

#27921 fix(router): use forwarded model_id for native Azure container IDs
#28395 fix(proxy): expose Prisma idle/connect timeout + extra DB URL params
#29983 fix(proxy): recover from cached-plan errors by reconnecting the Prisma client
#29984 feat(proxy): add option to disable server-side prepared statements for DB lookups
#29986 fix(proxy): return 5xx on DB infra errors during auth; reserve 401 for genuine auth failures
#30160 fix(passthrough): resolve anthropic costing model when the body model is "unknown"
fix(passthrough): skip [DONE] sentinels and non-JSON SSE frames in anthropic streaming logging (staging commit 973c7eb; its content lives in the staging aggregator #30202 "feat: litellm oss 110626", content-verified, so it carries a footer to #30202's squash cfcdf87)
#30327 fix(proxy): return the deprecated-key lookup result directly in the get_data combined view
#30220 chore(deps): bump vitest, brace-expansion, pypdf and tornado
bump: version 1.85.5 -> 1.85.6
chore: refresh uv.lock for 1.85.6

Adaptation notes

Seven of the nine picks are adapted; the divergence from the staging commit in each case is below. #28395 and 973c7eb are byte-for-byte (patch-id) identical to staging.

#27921: dropped the regenerable litellm/proxy/_experimental/out/ UI build artifact from the pick; that file is a build output the proxy regenerates, and the 1.85.x UI output uses a different layout. The router/transformation/container/ownership source and the test are taken as-is
#29983 and #30327: dropped their test hunks. Both target tests/test_litellm/proxy/utils/prisma_and_spend/test_prisma_client_get_data.py, which lives in a staging-only pin-harness directory (a 15-file module with a shared conftest) that does not exist on 1.85.x. Importing it would drag in behavior pins for methods that have drifted on this line. The source fixes in litellm/proxy/utils.py apply cleanly and are symbol-verified (_query_first_with_cached_plan_fallback and attempt_db_reconnect both already exist on the line with the matching signature)
#29984: dropped the ui/litellm-dashboard/src/lib/http/schema.d.ts hunk; that generated UI types file is not present on 1.85.x. The CLI option in proxy_cli.py + _types.py and its test apply cleanly
#29986: kept the three new auth regression tests (503-on-DB-infra-error, 401-stays-401, success-unaffected) but removed their patches of auth_exception_handler.seed_request_identity. That symbol comes from an unrelated identity-seeding change (#29848, "fix: 400 on Anthropic context overflow; seed identity on failed auth") that is not on 1.85.x; the line's auth failure path never calls it, so patching it both failed and was unnecessary. The fix itself does not reference the symbol
#30160: took the six new costing tests and dropped one neighbor test (test_passthrough_logging_sets_response_cost_with_server_tool_use_dict) that rode in on the conflict block but is not part of #30160's diff
#30220: kept the line's existing pinned-version dependencies and applied only the parts of the bump that this PR introduces: pypdf to 6.13.1, the [tool.uv] constraint-dependencies for tornado>=6.5.6 and aiohttp>=3.13.5,<3.14, and the dashboard bumps vitest 3.2.6 + brace-expansion 5.0.6. uv.lock and package-lock.json were regenerated fresh rather than taken verbatim; the resolved versions match staging (pypdf 6.13.1, tornado 6.5.7, aiohttp 3.13.5, vitest 3.2.6, brace-expansion 5.0.6)

Known noise on this line

One pre-existing test failure unrelated to the picks: tests/test_litellm/containers/test_azure_container_transformation.py::TestAzureContainerConfig::test_validate_environment_uses_azure_env_var. It reads a real AZURE_API_KEY from a local .env that litellm loads by walking up the directory tree, which beats the test's monkeypatch.setenv. It is environmental and passes in clean CI. It is present identically before and after the picks.

Pre-Submission checklist

I have added meaningful tests
My PR passes all unit tests
My PR's scope is as isolated as possible
I have requested a Greptile review and received a Confidence Score of at least 4/5

Type

🐛 Bug Fix
🚄 Infrastructure
✅ Test

Changes

See the pick list above. Net: nine cherry-picks (eight source fixes and one dependency bump), the version bump to 1.85.6, and a uv.lock refresh.

Screenshots / Proof of Fix

Live proxy on a real Postgres DB with real provider keys, running the worktree's code with all nine picks applied.

Boot and health:

GET /health/liveliness  -> "I'm alive!"
GET /health/readiness   -> status: healthy, db: connected

Real completion (router path):

curl .../v1/chat/completions -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Reply with exactly: backport ok"}],"max_tokens":12}'
-> "backport ok"

Key generation, then a scoped call with that key (the auth lookup path #29986 touches):

curl .../key/generate -d '{"models":["gpt-4o"],"max_budget":1}'   -> sk-Cg7Dxp...
curl .../v1/chat/completions  (Bearer sk-Cg7Dxp...)               -> "scoped ok"

Anthropic passthrough streaming (the [DONE]/non-JSON-SSE logging path 973c7eb hardens and the costing path #30160 fixes):

curl .../anthropic/v1/messages -d '{"model":"claude-haiku-4-5","max_tokens":20,"stream":true,...}'
-> clean SSE: message_start, content_block_start, ping, content_block_delta x2, content_block_stop
-> real claude-haiku-4-5 output, no errors in the streaming logging handler

Targeted test delta vs a baseline captured on the line tip before any pick: baseline 1 failed (the known-noise above), 206 passed; after the picks 1 failed (same known-noise), 267 passed. Zero new failures, and the 61 new passing tests are the picks' own regression tests.

Gauntlet (deep, universal hypothesis over all nine picks): SURVIVED. Ten adversarial agents independently confirmed each pick's source diff matches its staging counterpart except for the documented adaptations, every referenced symbol resolves on this tree (including that the removed seed_request_identity is absent from all production code), the kept tests pass, and no drifted caller is broken by a pick. The one failing test in the run is the known-noise Azure env-key case above; the gauntlet traced it to the host .env resolution order and confirmed it is pre-existing and out of scope.

…27921)
* fix(router): use forwarded model_id for native Azure container IDs in _init_containers_api_endpoints
Azure code-interpreter containers return provider-native IDs (cntr_ + hex) that carry no LiteLLM routing payload, so _decode_container_id returns model_id=None. The router was falling through to call the handler directly, bypassing _ageneric_api_call_with_fallbacks and leaving api_base=None for Azure deployments. Fall back to the model_id forwarded from the proxy ownership check so deployment credentials are always applied.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(azure-containers): strip /openai/responses path from api_base in AzureContainerConfig.get_complete_url
When a deployment's api_base is the responses endpoint URL (e.g. .../openai/responses?api-version=...), AzureContainerConfig was appending /openai/containers on top of it, producing the broken path .../openai/responses/openai/containers. Azure returns 404 for that URL while the correct path is .../openai/containers. Strip any /openai/responses suffix from api_base before constructing the containers URL so the resource root is always used as the starting point.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(azure-containers): prefer api-version from api_base URL over deployment's api_version
The deployment's api_version (e.g. 2024-08-01-preview) targets the chat/responses API and is too old for the containers API, which requires 2025-04-01-preview. The responses endpoint api_base already carries the correct api-version in its query string. Extract it and use it for the containers URL, overriding the stale deployment-level version. Fixes DELETE and file-upload operations returning 404 due to wrong api-version.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(containers): pass params=None instead of params={} to httpx to preserve api-version
httpx erases a URL's query-string when params={} (empty dict) is passed, silently stripping ?api-version=2025-04-01-preview from every container POST/DELETE request. Azure's GET endpoints tolerate a missing api-version; POST (upload) and DELETE are strict, so those returned 404. Fix: use `params or None` in container_handler._async_handle and llm_http_handler.async_container_delete_handler (and all sibling container handlers) so that an empty params dict falls back to None, leaving httpx to preserve the URL's existing query string intact. Adds a regression test that directly documents the httpx behaviour.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(router): remove elif model_id branch from _init_containers_api_endpoints
Two reviewer findings addressed: 1. Truncated comment on the model_id fallback line: now complete. 2. Security: the elif branch that fired when container_id was absent allowed any authenticated caller to supply model_id in a POST /v1/containers body and route the request through an arbitrary deployment UUID, bypassing the model-level access checks that only validate `model`. Removed the elif branch; operations without container_id (create, list) route by the caller-supplied `model` field as before. model_id forwarding is kept only inside the container_id block, where the proxy ownership check has already validated the container before forwarding the deployment ID. Adds a regression test pinning the security boundary: no-container-id path calls original_function directly even when model_id is in kwargs.
Co-authored-by: Cursor <cursoragent@cursor.com>
* test(containers): validate proxy-to-router model_id forwarding for managed IDs
Add test_regression_get_container_forwarding_params_sets_model_id_for_managed_id to verify that get_container_forwarding_params (the proxy-side half of the Azure routing fix) correctly extracts and forwards model_id from a LiteLLM-managed encoded container ID. This closes the gap identified by Greptile P1: the previous regression test only injected model_id as a direct kwarg, validating the router in isolation. The new test exercises the actual proxy-to-router data flow through ownership.get_container_forwarding_params, confirming that kwargs["model_id"] is populated before _init_containers_api_endpoints is reached.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(azure-containers): tighten endpoint-path strip to endswith match
Use path.endswith() instead of path.find() for _AZURE_ENDPOINT_PATHS so the suffix strip only fires when api_base actually ends with one of the endpoint-specific path suffixes. This is the more precise check greptile flagged on the original find()-based implementation.
* Fix sync container handler to preserve URL query string
Mirror the async path fix: pass None instead of an empty params dict so httpx does not strip the URL's existing query string (e.g. ?api-version=...), which is required for Azure container routing.
Co-authored-by: Yassin Kortam <yassin@berri.ai>
* fix(azure-containers): strip trailing slash before endpoint suffix match
Co-authored-by: Yassin Kortam <yassin@berri.ai>
* fix(containers): recover model_id from stored encoded id for native Azure container IDs
get_container_forwarding_params previously only set model_id when the user-supplied container_id was a LiteLLM-managed encoded id. For native upstream IDs (e.g. Azure 'cntr_<hex>') the decode fails and model_id was never forwarded, making the router-side fallback in _init_containers_api_endpoints unreachable in production. Fall back to the stored 'unified_object_id' on the ownership row, which is the encoded form captured at create time when the router selected a specific deployment. Decoding that yields the deployment model_id and restores router-based credential application (api_base, api_key) for retrieve/delete and container-file operations on native IDs.
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Yassin Kortam <yassin@berri.ai>
(cherry picked from commit 7f563b2)
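
The URL handling described in the commit messages above (strip a trailing slash, strip an endpoint suffix only when the path ends with it, and prefer the api-version carried by api_base) can be sketched with the standard library. This is a minimal illustration, not the code in transformation.py: the function names, the suffix tuple and the example host are assumptions made for the sketch.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlsplit, urlunsplit

# Hypothetical suffix list; the real _AZURE_ENDPOINT_PATHS may contain more entries.
ENDPOINT_SUFFIXES = ("/openai/responses",)


def split_api_base(api_base: str) -> Tuple[str, Optional[str]]:
    """Return (resource_root, api_version_found_in_query_or_None)."""
    parts = urlsplit(api_base)
    # Strip a trailing slash first so ".../openai/responses/" still matches.
    path = parts.path.rstrip("/")
    for suffix in ENDPOINT_SUFFIXES:
        # endswith (not find) so a suffix in the middle of the path is left alone.
        if path.endswith(suffix):
            path = path[: -len(suffix)]
            break
    versions = parse_qs(parts.query).get("api-version")
    root = urlunsplit((parts.scheme, parts.netloc, path, "", ""))
    return root, (versions[0] if versions else None)


def containers_url(api_base: str, deployment_api_version: str) -> str:
    """Build the containers URL from the resource root, preferring the URL's api-version."""
    root, url_version = split_api_base(api_base)
    version = url_version or deployment_api_version
    return f"{root}/openai/containers?api-version={version}"
```

With an api_base of `https://example.openai.azure.com/openai/responses?api-version=2025-04-01-preview` and a deployment api_version of `2024-08-01-preview`, the sketch builds the containers URL against the resource root and keeps `2025-04-01-preview`.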

…28395)
* fix(proxy): expose Prisma idle/connect timeout + extra DB URL params
Operators have reported large numbers of idle Prisma connections that never get closed. The proxy already forwards `connection_limit` and `pool_timeout` to the DATABASE_URL, but had no knob for capping idle or slow connections. Add three new `general_settings` keys that thread through to the DATABASE_URL / DIRECT_URL query string:
- `database_connect_timeout` -> Prisma `connect_timeout`
- `database_socket_timeout` -> Prisma `socket_timeout` (the main knob for closing idle connections from the LiteLLM side)
- `database_extra_connection_params` -> untyped passthrough dict for any other Prisma URL param (`pgbouncer`, `statement_cache_size`, `sslmode`, ...); keys here override LiteLLM defaults.
Refactors the duplicated DATABASE_URL/DIRECT_URL param dicts into a single `_build_db_connection_url_params` helper.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Update litellm/proxy/proxy_cli.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
---------
Co-authored-by: Yassin Kortam <yassinkortam@g.ucla.edu>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
(cherry picked from commit 2f9ac77)
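
As a rough illustration of how the three new `general_settings` keys could map onto a connection-URL query string, here is a minimal sketch. It is not the body of `_build_db_connection_url_params`: the function names are hypothetical, and it ignores the existing `connection_limit` and `pool_timeout` handling. The only behaviors taken from the commit message are the key-to-parameter mapping and the rule that extra params override the defaults.

```python
from typing import Any, Dict
from urllib.parse import urlencode

# Mapping stated in the commit message: general_settings key -> Prisma URL parameter.
SETTING_TO_PRISMA_PARAM = {
    "database_connect_timeout": "connect_timeout",
    "database_socket_timeout": "socket_timeout",
}


def build_db_url_params(general_settings: Dict[str, Any]) -> Dict[str, Any]:
    """Collect URL params from settings (hypothetical helper for illustration)."""
    params: Dict[str, Any] = {}
    for setting_key, prisma_name in SETTING_TO_PRISMA_PARAM.items():
        value = general_settings.get(setting_key)
        if value is not None:
            params[prisma_name] = value
    # Untyped passthrough; applied last so its keys override the typed ones.
    extra = general_settings.get("database_extra_connection_params") or {}
    params.update(extra)
    return params


def append_params(database_url: str, params: Dict[str, Any]) -> str:
    """Append params to a URL that may or may not already have a query string."""
    if not params:
        return database_url
    separator = "&" if "?" in database_url else "?"
    return f"{database_url}{separator}{urlencode(params)}"
```

Applying the passthrough dict last is what makes "keys here override LiteLLM defaults" hold in this sketch.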

…a client (#29983) (cherry picked from commit 3bd3951)

…r DB lookups (#29984) (cherry picked from commit dff25fe)

…r genuine auth failures (#29986) (cherry picked from commit da9d64b)

…30160) (cherry picked from commit 1828a7c)

…thropic streaming logging Targeted subset of staging commit cfcdf87 (#30202): only the anthropic_passthrough_logging_handler.py hardening hunks and their four tests are taken; the rest of that staging batch is intentionally excluded. (cherry picked from commit cfcdf87) (cherry picked from commit 973c7eb)
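
The streaming hardening can be pictured as a tolerant SSE accumulator: skip `[DONE]`, skip frames that are not JSON, and read the model from the `message_start` event. The sketch below is an assumption-laden illustration, not the handler's code. All function names are hypothetical, and the candidate order passed to `first_known_model` is left to the caller because the PR text does not fix one order unambiguously.

```python
import json
from typing import Any, Dict, Iterable, List, Optional


def parse_sse_data_frames(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Return the JSON objects carried by 'data:' lines, ignoring everything else."""
    events: List[Dict[str, Any]] = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # 'event:' lines, comments and blank keep-alives
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            continue  # sentinel sent by some Anthropic-compatible upstreams
        try:
            parsed = json.loads(payload)
        except json.JSONDecodeError:
            continue  # non-JSON frame: skip instead of failing the logging path
        if isinstance(parsed, dict):
            events.append(parsed)
    return events


def model_from_message_start(events: Iterable[Dict[str, Any]]) -> Optional[str]:
    """Anthropic's message_start event carries the model under message.model."""
    for event in events:
        if event.get("type") == "message_start":
            model = (event.get("message") or {}).get("model")
            if isinstance(model, str) and model:
                return model
    return None


def first_known_model(*candidates: Optional[str]) -> Optional[str]:
    """Pick the first candidate that is neither empty nor the 'unknown' sentinel."""
    for candidate in candidates:
        if candidate and candidate != "unknown":
            return candidate
    return None
```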

…combined view (#30327) The grace-period branch assigned the recursive get_data result (a finished LiteLLM_VerificationTokenView) back into the variable that the combined-view dict normalization then subscripts, raising TypeError on every request made with a rotated key inside its grace window; auth surfaced that as a 401. Return the recursive result directly instead. Regression test drives the full get_data flow: old hash misses the view, deprecated table resolves to the active token, and the call must return the view object (cherry picked from commit 5047eaf)
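
The failure mode in this commit message is easy to reproduce in miniature: a recursive call returns a finished view object, the result is assigned back into the variable that later code subscripts like a dict, and a TypeError follows. The sketch below uses hypothetical stand-ins (`TokenView`, the two lookup tables, both `get_data_*` functions) rather than the real `get_data` code.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class TokenView:
    """Stand-in for the finished view object; not subscriptable."""
    token: str


# Hypothetical data: raw rows for active keys, and old-hash -> active-hash mappings.
ACTIVE_VIEW_ROWS: Dict[str, Dict[str, Any]] = {"new-hash": {"token": "new-hash"}}
DEPRECATED_KEYS: Dict[str, str] = {"old-hash": "new-hash"}


def get_data_buggy(token_hash: str) -> Optional[TokenView]:
    response: Any = ACTIVE_VIEW_ROWS.get(token_hash)
    if response is None and token_hash in DEPRECATED_KEYS:
        # Bug: the recursive result is already a TokenView, not a raw row.
        response = get_data_buggy(DEPRECATED_KEYS[token_hash])
    if response is None:
        return None
    return TokenView(token=response["token"])  # TypeError for a TokenView


def get_data_fixed(token_hash: str) -> Optional[TokenView]:
    response = ACTIVE_VIEW_ROWS.get(token_hash)
    if response is None and token_hash in DEPRECATED_KEYS:
        # Fix: return the finished object directly, skipping normalization.
        return get_data_fixed(DEPRECATED_KEYS[token_hash])
    if response is None:
        return None
    return TokenView(token=response["token"])
```

In the buggy variant a caller that catches the exception broadly would see a generic failure, which matches the description of auth surfacing the error as a 401.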

* chore(deps): bump aiohttp to 3.14.1 and vitest to 3.2.6
Lockfile-only bump for aiohttp (3.13.5 -> 3.14.1, within the existing pyproject constraint) and dashboard devDependency bumps for vitest, @vitest/coverage-v8, @vitest/ui (3.2.4 -> 3.2.6) plus transitive brace-expansion (5.0.5 -> 5.0.6). Clears the currently published advisories flagged by osv.dev against uv.lock and the dashboard lockfile. Verified: 154 custom_httpx unit tests and all 3943 dashboard vitest tests pass; live proxy completion and streaming calls succeed on the bumped venv
* chore(deps): raise aiohttp floor to 3.14.0
The lockfile bump alone only protects environments built from uv.lock. Raising the pyproject floor extends the same minimum to package consumers installing litellm from PyPI, and prevents a future lockfile regeneration from resolving below 3.14.0
* Revert "chore(deps): raise aiohttp floor to 3.14.0"
This reverts commit d6c1c9d.
* revert(deps): roll back aiohttp to 3.13.5
vcrpy is incompatible with aiohttp >= 3.14 (the aiohttp_stubs module imports a symbol removed in 3.14) and the upstream fix is merged but unreleased, so every cassette-based test suite fails on 3.14. Hold aiohttp at 3.13.5 until a vcrpy release ships; the vitest and brace-expansion bumps stay
* chore(deps): bump pypdf to 6.13.1 and tornado to 6.5.7
Lockfile-only bumps clearing the advisories published for both since this branch was opened
* chore(deps): add regression guards for the bumped versions
Raise the pypdf floor to 6.12.0 (direct dependency, applies to package consumers too) and add uv constraint-dependencies for the transitive pins: tornado >= 6.5.6, and aiohttp held in [3.13.5, 3.14) so a lockfile regeneration can neither fall back below the current version nor move onto 3.14 while vcrpy is incompatible. Constraints live in [tool.uv] and only affect this repo's resolution, not published metadata. Verified: uv lock -P with each out-of-range version fails to resolve; in-range resolutions unchanged (pypdf 6.13.1, tornado 6.5.7, aiohttp 3.13.5)
(cherry picked from commit d96ab46)
# Conflicts:
#   pyproject.toml
#   ui/litellm-dashboard/package-lock.json
#   ui/litellm-dashboard/package.json
#   uv.lock

CLAassistant · 2026-06-14T00:04:26Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ yuneng-berri
✅ Sameerlite
❌ yassin-berriai

yuneng-berri · 2026-06-14T00:04:27Z

@greptileai

greptile-apps · 2026-06-14T00:09:04Z

Greptile Summary

Backport of nine cherry-picks from the 1.84.8 patch set onto the stable/1.85.x line, plus a scoped dependency bump and the 1.85.6 version cut. All source commits were adapted where the staging-only test infrastructure or generated UI files were absent on this branch.

DB resilience (proxy/utils.py, auth_exception_handler.py, db/exception_handler.py, _types.py, proxy_cli.py): The Prisma cached-plan fallback now reconnects the client instead of injecting a cache-busting SQL comment; auth exceptions that indicate a DB infrastructure outage now surface as 503 instead of 401; new config knobs expose connect_timeout, socket_timeout, pgbouncer, and extra URL params.
Container routing (router.py, ownership.py, transformation.py, container_handler.py, llm_http_handler.py): Native Azure container IDs (opaque cntr_<hex> values) now carry the deployment model_id through retrieve/delete by recovering it from the stored unified_object_id; Azure api_base endpoint-path normalization and api-version extraction prevent the containers URL from being built against the wrong resource root; empty params dicts are coerced to None to preserve the URL's existing query string.
Anthropic passthrough (anthropic_passthrough_logging_handler.py): Adds a costing-model resolution fallback chain, extracts the model from streaming message_start chunks when the body model is "unknown", and skips [DONE] sentinels and non-JSON SSE frames to keep the logging pipeline stable with Anthropic-compatible upstreams.

Confidence Score: 4/5

This is a carefully assembled backport; all source fixes apply cleanly and the documented adaptations are sound. The two noted items are non-blocking.

The core fixes for Prisma cached-plan recovery, 503-on-DB-outage, deprecated-key lookup crash, Anthropic SSE logging, and container ID routing are all correct. The OSError catch in is_database_service_unavailable_error is broader than strictly necessary and could theoretically reclassify a non-DB OS error as 'database unavailable' during auth, but the auth flow rarely raises OS errors for non-DB reasons. The direct DB call added to _get_stored_container_id is consistent with the pre-existing _get_container_owner pattern in the same file and is shielded by a 60-second in-memory cache.

Files needing attention: litellm/proxy/db/exception_handler.py (OSError breadth in is_database_service_unavailable_error) and litellm/proxy/container_endpoints/ownership.py (direct DB call in _get_stored_container_id).

Important Files Changed

Filename	Overview
litellm/proxy/utils.py	Two fixes: _query_first_with_cached_plan_fallback now reconnects the Prisma client (singleflight) instead of injecting a cache-busting comment; deprecated-key lookup now returns immediately via `deprecated_response` to avoid crashing dict-normalization code on a VerificationTokenView.
litellm/proxy/auth/auth_exception_handler.py	Adds a 503 branch for DB-infrastructure errors so they are no longer silently folded into the 401 fallthrough; the new check correctly runs after HTTPException/ProxyException guards are exhausted.
litellm/proxy/db/exception_handler.py	Adds is_prisma_engine_internal_error (traceback-walk heuristic) and is_database_service_unavailable_error (composite classifier); the OSError catch inside the latter is broad and could flag non-DB OS errors during auth as 503.
litellm/proxy/container_endpoints/ownership.py	Adds _CONTAINER_STORED_ID_CACHE and _get_stored_container_id to recover the deployment model_id for native upstream container IDs; makes a direct Prisma query on cache miss, following the same pattern as the pre-existing _get_container_owner.
litellm/proxy/pass_through_endpoints/llm_provider_handlers/anthropic_passthrough_logging_handler.py	Adds _resolve_costing_model (fallback chain: body model → deployment model → model_group) and _extract_model_from_anthropic_chunks (parses message_start SSE); skips [DONE] sentinels and non-JSON SSE frames to prevent logging pipeline errors.
litellm/proxy/proxy_cli.py	Extracts _build_db_connection_url_params helper; surfaces connect_timeout, socket_timeout, disable_prepared_statements, and extra_connection_params from general_settings into the Prisma DATABASE_URL / DIRECT_URL query string.
litellm/proxy/_types.py	Adds four new ConfigGeneralSettings fields: database_connect_timeout, database_socket_timeout, database_extra_connection_params, database_disable_prepared_statements.
litellm/llms/azure/containers/transformation.py	Adds _normalize_api_base (strips endpoint-specific path suffixes) and _extract_api_version (pulls api-version from query string); get_complete_url now uses the version from api_base instead of the deployment's api_version to prefer the newer containers API version.
litellm/router.py	Forwards model_id from kwargs as a fallback when the container_id decodes without a model_id, enabling deployment-credential lookup for native Azure container IDs.
litellm/llms/custom_httpx/container_handler.py	Converts empty-dict query_params to None before passing to httpx to prevent the empty dict from stripping the URL's existing query string (e.g., api-version).
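
The `params or None` guard named in the last row is small enough to show in full. The helper below is a hypothetical sketch (the name `effective_params` is borrowed from the review text, not copied from the handler) and it deliberately avoids importing httpx; the claim that an empty dict drops the URL's existing query string comes from the commit message, not from this example.

```python
from typing import Any, Dict, Mapping, Optional


def effective_params(query_params: Optional[Mapping[str, Any]]) -> Optional[Dict[str, Any]]:
    """Turn an empty or missing params mapping into None; copy a non-empty one."""
    return dict(query_params) if query_params else None


# Intended call shape (httpx request methods accept a `params` keyword):
#     client.post(url, params=effective_params(query_params), ...)
# With None, the query string already present in `url` is left untouched.
```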

Reviews (1): Last reviewed commit: "chore: refresh uv.lock for 1.85.6"

codecov · 2026-06-14T00:09:07Z

Codecov Report

❌ Patch coverage is 28.24427% with 94 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
..._handlers/anthropic_passthrough_logging_handler.py	10.00%	36 Missing ⚠️
litellm/proxy/proxy_cli.py	3.70%	26 Missing ⚠️
litellm/proxy/container_endpoints/ownership.py	14.81%	23 Missing ⚠️
litellm/proxy/db/exception_handler.py	75.86%	7 Missing ⚠️
litellm/llms/custom_httpx/container_handler.py	0.00%	2 Missing ⚠️

greptile-apps · 2026-06-14T00:09:08Z

+        so a type-only check misses real outages. ``is_database_transport_error``
+        keyword-matches the connection message and catches that masquerade,
+        while genuine data errors (no connection keyword) correctly stay 401.
+
+        The Postgres "cached plan must not change result type" error is matched
+        here, not in ``is_database_transport_error``: it is a transient stale-DB-


Broad OSError catch may misclassify non-DB errors as 503

OSError is the base class for FileNotFoundError, PermissionError, BrokenPipeError, and others that have nothing to do with database connectivity. If any of these are raised during the auth flow for reasons unrelated to the DB (e.g. a missing config file loaded lazily, or a broken pipe on the client side), the auth handler will return 503 ("database temporarily unreachable") instead of the generic 401 fallthrough. Narrowing to ConnectionError and socket.timeout (both OSError subclasses) would keep the intent while avoiding false positives.
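
A minimal sketch of the narrowing suggested above, using only built-in exception types. The function name is hypothetical and this is not the body of is_database_service_unavailable_error; it only shows which OSError subclasses a `ConnectionError`/`socket.timeout` check would and would not match.

```python
import socket

# Narrowed set proposed in the review comment; both are OSError subclasses.
DB_TRANSPORT_ERRORS = (ConnectionError, socket.timeout)


def looks_like_transport_failure(exc: BaseException) -> bool:
    """True only for connection-level failures, not for every OSError."""
    return isinstance(exc, DB_TRANSPORT_ERRORS)
```

One caveat when applying the suggestion: in Python's exception hierarchy `BrokenPipeError` is itself a subclass of `ConnectionError`, so this narrowing excludes `FileNotFoundError` and `PermissionError` but still matches a broken pipe.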

greptile-apps · 2026-06-14T00:09:09Z

+    cached = _CONTAINER_STORED_ID_CACHE.get_cache(model_object_id)
+    if cached == _NEGATIVE_STORED_ID_SENTINEL:
+        return None
+    if isinstance(cached, str) and cached:
+        return cached
+
+    prisma_client = await _get_prisma_client()
+    if prisma_client is None:
+        return None
+
+    row = await prisma_client.db.litellm_managedobjecttable.find_first(
+        where={
+            "model_object_id": model_object_id,
+            "file_purpose": CONTAINER_OBJECT_PURPOSE,
+        }
+    )
+    stored_id = getattr(row, "unified_object_id", None) if row is not None else None
+    _CONTAINER_STORED_ID_CACHE.set_cache(
+        model_object_id,
+        (
+            stored_id


Direct DB query outside the approved helper layer

_get_stored_container_id calls prisma_client.db.litellm_managedobjecttable.find_first(...) directly, bypassing the get_team/get_user/get_key helper pattern. This is consistent with the pre-existing _get_container_owner in the same file, which already does the same thing — so this doesn't regress anything — but it means both functions carry DB queries directly in the container request path. The 60-second TTL cache mitigates frequency, but the pattern is worth noting if the ownership module is audited for the no-direct-DB rule later.
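
The caching pattern visible in the quoted diff (a TTL cache plus a sentinel for "looked up, nothing stored") can be sketched as follows. Everything here is illustrative: the class, the sentinel value and the synchronous `query_db` callback are assumptions, and the quoted snippet is truncated before its `set_cache` arguments, so storing the sentinel on a miss is inferred rather than confirmed.

```python
import time
from typing import Callable, Dict, Optional, Tuple

NEGATIVE_SENTINEL = "__no_stored_id__"  # hypothetical marker for a cached miss


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


def lookup_stored_id(cache: TTLCache, key: str,
                     query_db: Callable[[str], Optional[str]]) -> Optional[str]:
    cached = cache.get(key)
    if cached == NEGATIVE_SENTINEL:
        return None  # a recent lookup already found nothing
    if cached:
        return cached
    stored_id = query_db(key)
    # Cache misses too, so repeated requests for unknown IDs skip the DB.
    cache.set(key, stored_id if stored_id else NEGATIVE_SENTINEL)
    return stored_id or None
```

Caching the negative result is what keeps repeated requests for an unknown container ID from querying the database each time within the TTL.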

Rule Used: What: In critical path of request, there should be... (source)

greptile-apps · 2026-06-14T00:10:35Z

Greptile Summary

This is a carefully documented backport of nine cherry-picks from litellm_internal_staging onto the stable/1.85.x line, cutting version 1.85.6. Every source change has a matching set of mock-only regression tests, and the adaptation notes in the PR description accurately account for each delta from the staging commits.

Auth fix (503 vs 401): DB infrastructure failures during key lookup now surface as HTTP 503 (no_db_connection) instead of falling through to the generic 401 path; genuine auth failures (missing/invalid key) continue to produce 401. A companion fix in utils.py correctly returns the deprecated-key lookup result directly rather than falling through to dict normalization that would crash subscripting a LiteLLM_VerificationTokenView.
Prisma reliability: Cached-plan errors now trigger a singleflight attempt_db_reconnect + identical-query retry instead of injecting a unique comment that would defeat PostgreSQL's plan cache on every call; new config knobs (database_connect_timeout, database_socket_timeout, database_disable_prepared_statements, database_extra_connection_params) expose Prisma URL params without breaking existing defaults.
Container + passthrough fixes: Azure container URL construction is hardened against api_base values that carry endpoint-specific path suffixes, empty-dict params arguments are replaced with None to preserve existing query strings, get_container_forwarding_params is made async and correctly awaited at all five call sites, and the Anthropic passthrough logging handler now skips [DONE] sentinels and non-JSON SSE frames while resolving the "unknown" model sentinel for correct cost attribution.
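
The "singleflight reconnect, then retry the identical query" idea from the Prisma reliability item can be sketched with asyncio. This is an illustration under stated assumptions, not the code in utils.py: the class and function names are hypothetical, and the error classifier is passed in as a callback rather than taken from the exception handler.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


class SingleflightReconnect:
    """Let concurrent callers share one in-flight reconnect instead of starting many."""

    def __init__(self, reconnect: Callable[[], Awaitable[None]]) -> None:
        self._reconnect = reconnect
        self._inflight: Optional["asyncio.Task[None]"] = None

    async def run(self) -> None:
        task = self._inflight
        if task is None:
            # First caller starts the reconnect; later callers reuse the task.
            task = asyncio.create_task(self._do_reconnect())
            self._inflight = task
        await task

    async def _do_reconnect(self) -> None:
        try:
            await self._reconnect()
        finally:
            self._inflight = None  # allow a fresh reconnect next time


async def query_with_reconnect(
    run_query: Callable[[], Awaitable[Any]],
    flight: SingleflightReconnect,
    is_cached_plan_error: Callable[[BaseException], bool],
) -> Any:
    try:
        return await run_query()
    except Exception as exc:
        if not is_cached_plan_error(exc):
            raise
        await flight.run()
        # Retry the same query unchanged, with no cache-busting comment added.
        return await run_query()
```

Retrying the unchanged query after a reconnect is the contrast the review draws with the earlier approach of injecting a unique comment on every call.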

Confidence Score: 5/5

Clean backport; all source changes are well-scoped bug fixes with comprehensive regression tests, no previously-passing assertions were weakened, and all adaptations are documented and symbol-verified.

Every changed symbol was verified to exist on this branch, all five get_container_forwarding_params call sites were updated to await the now-async function, and the 503/401 exception ordering in the auth handler is correct. No new network calls in tests, no hardcoded model flags, and no FastAPI imports outside the proxy directory.

No files require special attention. litellm/proxy/db/exception_handler.py contains the most complex new logic (traceback-walking in is_prisma_engine_internal_error), but it is thoroughly covered by the new parametrized tests including the exact prisma engine crash scenario it guards against.

Important Files Changed

| Filename | Overview |
|---|---|
| litellm/proxy/utils.py | Two fixes: `_query_first_with_cached_plan_fallback` now reconnects the Prisma client on a cached-plan error (instead of injecting a unique comment that defeated plan caching on every call), and the deprecated-key lookup correctly returns `deprecated_response` directly instead of falling through to dict normalization that would crash subscripting a `LiteLLM_VerificationTokenView`. |
| litellm/proxy/auth/auth_exception_handler.py | Adds a `PrismaDBExceptionHandler.is_database_service_unavailable_error` guard after the ProxyException re-raise; DB infrastructure failures now surface as 503 instead of falling through to the generic 401 path. The ordering is correct: ProxyExceptions (including legitimate 401s) are re-raised first. |
| litellm/proxy/db/exception_handler.py | Adds `is_prisma_engine_internal_error` (walks the traceback to detect non-PrismaError exceptions from prisma.engine, e.g. AttributeError from malformed error payloads during teardown) and `is_database_service_unavailable_error` (composites connection/transport/engine checks plus the cached-plan case). Data-layer errors are correctly excluded via the explicit exclusion list in `is_database_connection_error`. |
| litellm/proxy/proxy_cli.py | Extracts `_build_db_connection_url_params` helper and adds `database_connect_timeout`, `database_socket_timeout`, `database_disable_prepared_statements`, `database_extra_connection_params` config fields. The new params are propagated correctly to both `DATABASE_URL` and `DIRECT_URL`. |
| litellm/proxy/container_endpoints/ownership.py | Makes `get_container_forwarding_params` async and adds a `_CONTAINER_STORED_ID_CACHE` (in-memory, TTL=60) plus `_get_stored_container_id` to recover the deployment `model_id` from the stored `unified_object_id` for native upstream IDs that carry no LiteLLM routing payload. |
| litellm/llms/azure/containers/transformation.py | Adds `_normalize_api_base` (strips `/openai/responses` suffix) and `_extract_api_version` (pulls `api-version` from the URL query string), then uses both in `get_complete_url` so the containers URL is built from the resource root with the correct API version. |
| litellm/router.py | Falls back to the `model_id` forwarded by the proxy (`kwargs["model_id"]`) when the container_id is a native upstream ID that carries no LiteLLM routing payload; `.strip()` guards against whitespace-only strings. |
| litellm/proxy/pass_through_endpoints/llm_provider_handlers/anthropic_passthrough_logging_handler.py | Three additions: `_resolve_costing_model` resolves "unknown" model sentinel via litellm_params then model_group; `_extract_model_from_anthropic_chunks` recovers the real model from the `message_start` SSE event; `[DONE]` sentinel and `json.JSONDecodeError` are now skipped in the streaming accumulator loop. |
| litellm/proxy/container_endpoints/endpoints.py | Adds `await` to the two `get_container_forwarding_params` call sites (retrieve and delete container) that were missing it after ownership.py made the function async. |
| litellm/proxy/container_endpoints/handler_factory.py | Adds `await` to three more `get_container_forwarding_params` call sites in binary, multipart-upload, and generic request handlers. |
| litellm/llms/custom_httpx/container_handler.py | Changes all httpx calls to use `params=effective_params` where `effective_params = query_params or None`, preventing an empty dict from stripping an existing query string from the URL. |
| litellm/llms/custom_httpx/llm_http_handler.py | Applies the same `params or None` guard to all ten container-related httpx GET/DELETE call sites in the LLM HTTP handler. |
| litellm/proxy/_types.py | Adds four new optional fields to `ConfigGeneralSettings` for the DB URL param expansion: `database_connect_timeout`, `database_socket_timeout`, `database_extra_connection_params`, `database_disable_prepared_statements`. |
| pyproject.toml | Version bumped to 1.85.6; pypdf floor raised to 6.13.1; tornado and aiohttp constraint-dependency pins added. |
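The Azure containers URL handling described above can be sketched with the standard library. This is only an approximation under stated assumptions: the function names mirror the ones in the overview without the leading underscore, the real helpers may treat trailing slashes or other suffixes differently, dropping the query string from the normalized base is a choice made for this sketch, and the host and `api-version` value in the usage lines are placeholders.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit, urlunsplit

RESPONSES_SUFFIX = "/openai/responses"


def normalize_api_base(api_base: str) -> str:
    """Strip a trailing /openai/responses path (and any query) from the base URL."""
    parts = urlsplit(api_base)
    path = parts.path.rstrip("/")
    if path.endswith(RESPONSES_SUFFIX):
        path = path[: -len(RESPONSES_SUFFIX)]
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def extract_api_version(api_base: str) -> Optional[str]:
    """Return the api-version query parameter if the URL carries one."""
    values = parse_qs(urlsplit(api_base).query).get("api-version")
    return values[0] if values else None


# Placeholder host and version, for illustration only.
configured = "https://example.openai.azure.com/openai/responses?api-version=2025-01-01"
print(normalize_api_base(configured))
print(extract_api_version(configured))
```

Building the containers URL from the resource root avoids appending a containers path onto a base that already points at the responses endpoint.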

Reviews (2): Last reviewed commit: "chore: refresh uv.lock for 1.85.6"

Sameerlite and others added 11 commits June 13, 2026 16:02

3638965 fix(proxy): recover from cached-plan errors by reconnecting the Prisma client (#29983) (cherry picked from commit 3bd3951)

dfa6a74 feat(proxy): add option to disable server-side prepared statements for DB lookups (#29984) (cherry picked from commit dff25fe)

2cbcfcb fix(proxy): return 5xx on DB infra errors during auth; reserve 401 for genuine auth failures (#29986) (cherry picked from commit da9d64b)

f2da26e fix(passthrough): resolve costing model when body model is unknown (#30160) (cherry picked from commit 1828a7c)

0c02e01 bump: version 1.85.5 → 1.85.6

b0c0177 chore: refresh uv.lock for 1.85.6

yuneng-berri requested a review from a team June 14, 2026 00:04

greptile-apps Bot reviewed Jun 14, 2026

ryan-crabbe-berri approved these changes Jun 14, 2026

yuneng-berri merged commit c857306 into stable/1.85.x Jun 14, 2026
69 of 75 checks passed

yuneng-berri deleted the litellm_backport_1_85_x_1848_ports branch June 14, 2026 00:14
