merge main by Sameerlite · Pull Request #28581 · BerriAI/litellm

Sameerlite · 2026-05-22T11:46:04Z

Relevant issues

Linear ticket

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

50-55 passing tests: main is stable with minor issues.

45-49 passing tests: acceptable but needs attention

<= 40 passing tests: unstable; be careful with your merges and assess the risk.

Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:

Screenshots / Proof of Fix

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

Note

Medium Risk
Medium risk because it changes proxy runtime startup options (new --run_granian path) and expands guardrail telemetry emitted on failures, which could affect production server behavior and observability payloads.

Overview
Proxy runtime: Adds a new litellm proxy launch mode using Granian via --run_granian (plus --granian_threads), wires in SSL handling for Granian, and pins granian==2.5.7 in proxy extras; the non-root Docker image now installs npm alongside nodejs.

Guardrail observability: Extends standard guardrail logging/tracing to surface guardrail_status, provider guardrail_action, and pre-redaction violation_categories, and ensures OpenTelemetry emits guardrail spans even for pre-call guardrail blocks by pulling guardrail info from request_data.metadata.

CI/tests/maintenance: Forwards LITELLM_LICENSE into CircleCI E2E runs to exercise premium flows, adds a large authz behavior matrix suite for /team/* endpoints (plus improved seeding/cleanup helpers), bumps dev black to 26.3.1 with liccheck allowlist updates, and includes minor dependency/formatting churn (e.g. package-lock.json, trailing commas/whitespace).

Reviewed by Cursor Bugbot for commit d04373f. Bugbot is set up for automated code reviews on this repo.

…n pre-call blocks (#28364)

- Fix missing guardrail child spans when a pre-call guardrail blocks the request before reaching the LLM provider; `async_post_call_failure_hook` now calls `_emit_guardrail_spans_from_request_data` to emit spans from `request_data["metadata"]` regardless of whether `_handle_failure` already fired
- Add `guardrail_status`, `guardrail_action`, and `guardrail_violation_categories` as queryable top-level OTEL span attributes so trace backends can filter/group by violation type without parsing the redacted `guardrail_response` blob
- Introduce `_emit_guardrail_spans_from_request_data` helper that constructs minimal kwargs from `request_data["metadata"]` and routes through `_create_guardrail_span`, sharing the same dedupe state to prevent double-emitting when both failure hooks fire
- Extend `BedrockGuardrail` with `_build_tracing_detail` and `_extract_violation_category_names` which flatten BLOCKED assessments into human-readable category labels (topic names, content-filter types, PII entity types, named regex names) before redaction, and surface Bedrock's raw `action` field via `tracing_detail`
- Security: violation category extraction deliberately omits `customWords.match` and unnamed regex `match` values because those fields carry the user-submitted content that triggered the rule; only operator-defined `name`/`type` labels are emitted
- Add `violation_categories` and `guardrail_action` fields to `StandardLoggingGuardrailInformation` and `GuardrailTracingDetail` TypedDicts to carry the pre-redaction metadata through the logging pipeline
- Add comprehensive test suite covering: guardrail span creation on failure, dedupe between `_handle_failure` and `async_post_call_failure_hook`, per-span status attributes for multi-guardrail sequences, Bedrock category extraction for all policy types, security leak prevention, and end-to-end `CustomGuardrail` violation path

Co-authored-by: Yassin Kortam <yassinkortam@g.ucla.edu>
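To illustrate the security bullet above, here is a minimal sketch of label-only category extraction. It is not the PR's `_extract_violation_category_names`; the function name, the `sample` payload, and the assumed assessment layout (`topicPolicy.topics`, `contentPolicy.filters`, `sensitiveInformationPolicy.piiEntities` / `regexes`, `wordPolicy.customWords`) are assumptions modelled on the Bedrock guardrail response shape.

```python
from typing import Any, Dict, List


def extract_violation_category_names(assessments: List[Dict[str, Any]]) -> List[str]:
    """Collect operator-defined labels for BLOCKED findings, never matched text."""
    names: List[str] = []

    def add(label: Any) -> None:
        # Keep insertion order and drop duplicates / missing labels.
        if isinstance(label, str) and label and label not in names:
            names.append(label)

    for assessment in assessments:
        for topic in assessment.get("topicPolicy", {}).get("topics", []):
            if topic.get("action") == "BLOCKED":
                add(topic.get("name"))
        for content_filter in assessment.get("contentPolicy", {}).get("filters", []):
            if content_filter.get("action") == "BLOCKED":
                add(content_filter.get("type"))
        sensitive = assessment.get("sensitiveInformationPolicy", {})
        for entity in sensitive.get("piiEntities", []):
            if entity.get("action") == "BLOCKED":
                add(entity.get("type"))
        for regex in sensitive.get("regexes", []):
            # Only the operator-chosen name is read; an unnamed regex adds
            # nothing, and the "match" field is never touched.
            if regex.get("action") == "BLOCKED":
                add(regex.get("name"))
        # wordPolicy.customWords is skipped entirely: its only identifying
        # field is "match", which carries user-submitted text.
    return names


# Hypothetical payload for illustration only.
sample = [
    {
        "topicPolicy": {"topics": [{"name": "Finance", "action": "BLOCKED"}]},
        "contentPolicy": {"filters": [{"type": "HATE", "action": "BLOCKED"}]},
        "sensitiveInformationPolicy": {
            "piiEntities": [{"type": "EMAIL", "match": "a@b.com", "action": "BLOCKED"}],
            "regexes": [
                {"name": "order-id", "match": "A-123", "action": "BLOCKED"},
                {"match": "user secret", "action": "BLOCKED"},
            ],
        },
        "wordPolicy": {"customWords": [{"match": "user text", "action": "BLOCKED"}]},
    }
]
categories = extract_violation_category_names(sample)
```

None of the `match` values from `sample` can appear in `categories`, because the function never reads that field.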

…28441)

* test(proxy): behavior-pinning matrix for team management endpoints

PR2 (Team Tier-1) of the management-endpoint behavior-pinning effort. Extends the tests/proxy_behavior/management/ harness PR1 built and adds the actor x target-resource authz matrix for the 7 team endpoints: /team/new, /team/info, /team/list, /team/update, /team/member_add, /team/member_delete, /team/member_update. Tests-only, no production code changes.

Harness extensions:
- actors.py: ORG_B_ADMIN actor (org admin of ORG_B) and TEAM_GAMMA (an ORG_A team with no actor members), so team-targeting endpoints get a clean own / same-org-other / cross-org target axis.
- conftest.py: create_scratch_team() raw-seeds target teams without /team/new side effects; the scratch teardown now also strips dangling scratch-team refs from LiteLLM_UserTable.teams.

156 new scenarios; status codes pinned to observed handler behavior.

* test(proxy): record mutmut run blockers in PR2 triage doc

Attempted a scoped local mutmut run for G5; it did not complete. Record the three concrete blockers in mutmut_triage/pr2-team-tier1.md so the next attempt has a head start:
1. mutmut's mutants/ sandbox is import-shadowed by the worktree source.
2. the legacy mock suite and the real-DB behavior suite cannot share a pytest session (mock suite globally patches prisma_client).
3. the CI mutation-test.yml workflow starts no Postgres, so its stats phase now aborts on the behavior-suite tests PR1 added to tests_dir.

mutmut stays a deferred follow-up (as in PR1); the binding pre-merge signal remains the behavior matrix (G1) and the G4 regression-replay.

* test(proxy): drop suite README + triage doc, trim test comments

Remove the two prose docs from the behavior suite (README.md and mutmut_triage/pr2-team-tier1.md) and tighten the comment blocks on the team test files + harness down to the load-bearing parts (the gate each matrix pins, plus genuinely surprising results). No behavior change — all 286 scenarios still pass.

* test(proxy): remove mutmut tests_dir comment
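For readers unfamiliar with the actor x target matrix idea, the following sketch enumerates scenarios as a cross product. It is an assumption-laden illustration, not the harness in tests/proxy_behavior/management/: the `Scenario` type, `build_matrix`, the target labels, and every actor name except ORG_B_ADMIN are hypothetical, and the resulting count is not meant to match the 156 scenarios the commit reports.

```python
from itertools import product
from typing import List, NamedTuple

# The seven endpoints are listed in the commit message above.
ENDPOINTS = [
    "/team/new",
    "/team/info",
    "/team/list",
    "/team/update",
    "/team/member_add",
    "/team/member_delete",
    "/team/member_update",
]
# Hypothetical actor set; only ORG_B_ADMIN is named in the commit.
ACTORS = ["PROXY_ADMIN", "ORG_A_ADMIN", "ORG_B_ADMIN", "TEAM_MEMBER"]
# The own / same-org-other / cross-org axis described in the commit.
TARGETS = ["own", "same_org_other", "cross_org"]


class Scenario(NamedTuple):
    endpoint: str
    actor: str
    target: str

    @property
    def test_id(self) -> str:
        # A readable id, e.g. usable as a pytest parametrize id.
        return f"{self.endpoint.strip('/').replace('/', '_')}-{self.actor}-{self.target}"


def build_matrix() -> List[Scenario]:
    """Enumerate every endpoint x actor x target combination."""
    return [Scenario(e, a, t) for e, a, t in product(ENDPOINTS, ACTORS, TARGETS)]


matrix = build_matrix()
```

Each scenario would then be paired with the status code observed from the real handler, which is the "pinning" step; those codes are deliberately left out here because the source does not list them.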

…#28503) test_gemini_google_maps_tool_simple makes live calls to Vertex AI's Google Maps grounding backend, which intermittently returns 500 INTERNAL ("Please retry") — a transient Google-side failure, not a LiteLLM bug. The request LiteLLM emits matches Google's published googleMaps grounding spec field-for-field, and the maps-platform 500 only occurs after Vertex accepts the request. The test already passes on RateLimitError; treat InternalServerError the same way so transient Vertex-side failures don't fail CI.

The non_root builder stage installs `nodejs` but not `npm`. Without `npm` on PATH, prisma-python falls back to downloading a Node runtime via nodeenv from nodejs.org, and that downloaded binary fails to load `libatomic.so.1` — breaking `prisma generate` and the image build. `npm` was dropped from this apk list in ca52e34. Restoring it lets prisma-python use the system Node + npm, matching docker/Dockerfile which already installs `npm` for the same reason.

…#27665) (#28524)

Bumps [next](https://github.com/vercel/next.js) from 16.2.4 to 16.2.6.
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](vercel/next.js@v16.2.4...v16.2.6)

---
updated-dependencies:
- dependency-name: next
  dependency-version: 16.2.6
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps-dev): bump black 24.10.0 -> 26.3.1
* style: apply black 26.3.1 formatting
* chore: authorize black 26.3.1 license in liccheck.ini

* build(deps): bump next from 16.2.4 to 16.2.6 in /ui/litellm-dashboard (#27665)

Bumps [next](https://github.com/vercel/next.js) from 16.2.4 to 16.2.6.
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](vercel/next.js@v16.2.4...v16.2.6)

---
updated-dependencies:
- dependency-name: next
  dependency-version: 16.2.6
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump protobufjs in /tests/pass_through_tests (#28296)

Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.5.6 to 7.6.0.
- [Release notes](https://github.com/protobufjs/protobuf.js/releases)
- [Changelog](https://github.com/protobufjs/protobuf.js/blob/protobufjs-v7.6.0/CHANGELOG.md)
- [Commits](protobufjs/protobuf.js@protobufjs-v7.5.6...protobufjs-v7.6.0)

---
updated-dependencies:
- dependency-name: protobufjs
  dependency-version: 7.6.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump ws from 8.20.0 to 8.20.1 in /tests/pass_through_tests (#28303)

Bumps [ws](https://github.com/websockets/ws) from 8.20.0 to 8.20.1.
- [Release notes](https://github.com/websockets/ws/releases)
- [Commits](websockets/ws@8.20.0...8.20.1)

---
updated-dependencies:
- dependency-name: ws
  dependency-version: 8.20.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* test(e2e): forward LITELLM_LICENSE to UI e2e proxy

The UI e2e job ran without LITELLM_LICENSE, so premium_user was always false in the issued login JWT and premium-gated UI surfaces (Team-BYOK Model switch, etc.) couldn't be driven through the UI. Forward the env var from run_e2e.sh and the CircleCI e2e_ui_testing job, and add a sanity test that decodes the admin storage state token and asserts premium_user=true so the wiring fails loudly if it ever regresses.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Update ui/litellm-dashboard/e2e_tests/tests/proxy-admin/license.spec.ts

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
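The sanity check described above (decode the login token and assert `premium_user`) can be sketched as follows. The real test lives in license.spec.ts and reads the admin storage state; this Python version is only an illustration, and `decode_jwt_payload`, `make_unsigned_token`, and the `user_role` claim are hypothetical names. The token is built locally and its signature is not verified.

```python
import base64
import json
from typing import Any, Dict


def decode_jwt_payload(token: str) -> Dict[str, Any]:
    """Return the payload claims of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a three-part JWT")
    payload_b64 = parts[1]
    # base64url segments in JWTs drop their padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_unsigned_token(claims: Dict[str, Any]) -> str:
    """Build a stand-in token for the example; the third segment is a dummy."""
    header = _b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(claims).encode())
    return f"{header}.{body}.signature-placeholder"


# Stand-in for the token a licensed proxy would issue at login.
token = make_unsigned_token({"user_role": "proxy_admin", "premium_user": True})
claims = decode_jwt_payload(token)
assert claims["premium_user"] is True, "LITELLM_LICENSE was not forwarded to the proxy"
```

Asserting on the decoded claim rather than on a UI element keeps the failure message close to the actual cause when the env var is missing.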

…t stability, (#26027)

* Add granian as an ASGI-compliant web server. Provides better stability, 10-20 RPS improvement under standard LT conditions. TODO: Verify poetry lock details and add locust numbers to PR
* Update granian version in license_cache.json and pyproject.toml to 2.5.7
* Enhance proxy CLI tests by adding SSL initialization checks for Granian server. Remove Python version skip conditions and implement tests to ensure SSL certificate and key are required for server initialization.
* Update uv lock to fix granian import error

greptile-apps · 2026-05-22T11:46:09Z

Too many files changed for review. (116 files found, 100 file limit)

CLAassistant · 2026-05-22T11:46:14Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
3 out of 4 committers have signed the CLA.

✅ ryan-crabbe-berri
✅ yuneng-berri
✅ harish-berri
❌ yassin-berriai

codecov · 2026-05-22T11:49:44Z

Codecov Report

❌ Patch coverage is 81.17647% with 16 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
litellm/proxy/proxy_cli.py	69.44%	11 Missing ⚠️
litellm/litellm_core_utils/litellm_logging.py	0.00%	2 Missing ⚠️
litellm/proxy/proxy_server.py	0.00%	1 Missing ⚠️
...proxy/spend_tracking/spend_management_endpoints.py	0.00%	1 Missing ⚠️
litellm/proxy/utils.py	50.00%	1 Missing ⚠️


cursor

Cursor Bugbot has reviewed your changes using high effort and found 2 potential issues.

Bugbot Autofix prepared fixes for both issues found in the latest run.

✅ Fixed: Unreachable exception handler makes pytest.skip dead code
- Removed InternalServerError from the first except tuple so the dedicated handler runs and pytest.skip is reachable again for transient Google Maps 500s.
✅ Fixed: Granian path unconditionally calls uvicorn init helper
- Included run_granian in the running_uvicorn check and moved _get_default_unvicorn_init_args (which imports uvicorn) inside the running_uvicorn branch so Granian-only installs no longer hit ImportError.

Preview (d04373f4ce)

diff --git a/litellm/proxy/proxy_cli.py b/litellm/proxy/proxy_cli.py
--- a/litellm/proxy/proxy_cli.py
+++ b/litellm/proxy/proxy_cli.py
@@ -1191,20 +1191,23 @@
             )
             return
 
-        running_uvicorn = run_gunicorn is False and run_hypercorn is False
-        uvicorn_args = ProxyInitializationHelpers._get_default_unvicorn_init_args(
-            host=host,
-            port=port,
-            log_config=log_config,
-            keepalive_timeout=keepalive_timeout,
-            timeout_worker_healthcheck=(
-                timeout_worker_healthcheck if running_uvicorn else None
-            ),
+        running_uvicorn = (
+            run_gunicorn is False and run_hypercorn is False and run_granian is False
         )
-        # Optional: recycle uvicorn workers after N requests
-        if max_requests_before_restart is not None:
-            uvicorn_args["limit_max_requests"] = max_requests_before_restart
-        if run_gunicorn is False and run_hypercorn is False and run_granian is False:
+        if running_uvicorn:
+            # Only build uvicorn args (which imports uvicorn) when actually
+            # running uvicorn — Granian/Hypercorn/Gunicorn users may not have
+            # uvicorn installed.
+            uvicorn_args = ProxyInitializationHelpers._get_default_unvicorn_init_args(
+                host=host,
+                port=port,
+                log_config=log_config,
+                keepalive_timeout=keepalive_timeout,
+                timeout_worker_healthcheck=timeout_worker_healthcheck,
+            )
+            # Optional: recycle uvicorn workers after N requests
+            if max_requests_before_restart is not None:
+                uvicorn_args["limit_max_requests"] = max_requests_before_restart
             if ssl_certfile_path is not None and ssl_keyfile_path is not None:
                 print(  # noqa
                     f"\033[1;32mLiteLLM Proxy: Using SSL with certfile: {ssl_certfile_path} and keyfile: {ssl_keyfile_path}\033[0m\n"  # noqa

diff --git a/tests/local_testing/test_amazing_vertex_completion.py b/tests/local_testing/test_amazing_vertex_completion.py
--- a/tests/local_testing/test_amazing_vertex_completion.py
+++ b/tests/local_testing/test_amazing_vertex_completion.py
@@ -38,7 +38,6 @@
 )
 from litellm.llms.vertex_ai.vertex_llm_base import VertexBase
 
-
 litellm.num_retries = 3
 litellm.cache = None
 user_message = "Write a short poem about the sky"
@@ -1104,9 +1103,7 @@
             {
                 "content": {
                     "role": "model",
-                    "parts": [
-                        {
-                            "text": """{
+                    "parts": [{"text": """{
                             "recipes": [
                                 {"recipe_name": "Chocolate Chip Cookies"},
                                 {"recipe_name": "Oatmeal Raisin Cookies"},
@@ -1114,9 +1111,7 @@
                                 {"recipe_name": "Sugar Cookies"},
                                 {"recipe_name": "Snickerdoodles"}
                             ]
-                            }"""
-                        }
-                    ],
+                            }"""}],
                 },
                 "finishReason": "STOP",
                 "safetyRatings": [
@@ -4223,11 +4218,11 @@
             )
         print(f"Response: {response.model_dump_json(indent=4)}")
         assert response.choices[0].message.content is not None
-    except (litellm.RateLimitError, litellm.InternalServerError):
-        # Transient Vertex-side failures (rate limiting, 500 INTERNAL from the
-        # Google Maps grounding backend) are not LiteLLM bugs — don't fail CI.
+    except litellm.RateLimitError:
         pass
     except litellm.InternalServerError:
+        # Transient 500 INTERNAL from the Google Maps grounding backend is not a
+        # LiteLLM bug — skip (rather than pass) so CI still surfaces upstream flakes.
         pytest.skip(
             "Google Maps Platform returned a transient 500 (upstream flake); skipping."
         )


cursor · 2026-05-22T11:50:02Z

-    except litellm.RateLimitError:
+    except (litellm.RateLimitError, litellm.InternalServerError):
+        # Transient Vertex-side failures (rate limiting, 500 INTERNAL from the
+        # Google Maps grounding backend) are not LiteLLM bugs — don't fail CI.


Unreachable exception handler makes pytest.skip dead code

Medium Severity

The except (RateLimitError, InternalServerError) clause silently catches InternalServerError, making the subsequent except InternalServerError block unreachable. This means transient 500s are now silently ignored instead of causing a pytest.skip, reducing CI visibility for upstream flakes.
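The shadowing described in this finding can be reproduced in isolation. The sketch below uses two stand-in exception classes instead of the litellm ones and returns strings instead of calling `pytest.skip`; the function names are hypothetical.

```python
class RateLimitError(Exception):
    """Stand-in for litellm.RateLimitError."""


class InternalServerError(Exception):
    """Stand-in for litellm.InternalServerError."""


def handle_broad_first(exc: Exception) -> str:
    """Mirror the flagged code: the tuple clause comes first."""
    try:
        raise exc
    except (RateLimitError, InternalServerError):
        return "passed silently"
    except InternalServerError:
        # Never reached: the tuple clause above already matched this type.
        return "skipped"


def handle_dedicated(exc: Exception) -> str:
    """Mirror the fix: each exception type has its own clause."""
    try:
        raise exc
    except RateLimitError:
        return "passed silently"
    except InternalServerError:
        return "skipped"


broad_result = handle_broad_first(InternalServerError("500 INTERNAL"))
fixed_result = handle_dedicated(InternalServerError("500 INTERNAL"))
```

Python evaluates `except` clauses top to bottom and runs only the first match, so the order of the clauses decides whether the skip is ever reported.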


cursor · 2026-05-22T11:50:02Z

        if max_requests_before_restart is not None:
            uvicorn_args["limit_max_requests"] = max_requests_before_restart
-        if run_gunicorn is False and run_hypercorn is False:
+        if run_gunicorn is False and run_hypercorn is False and run_granian is False:


Granian path unconditionally calls uvicorn init helper

Medium Severity

running_uvicorn is computed as run_gunicorn is False and run_hypercorn is False without checking run_granian, so it's incorrectly True when using Granian. More importantly, _get_default_unvicorn_init_args() is called unconditionally (line 1195), which internally does import uvicorn. If a user installs only granian (as the error message on line 911 suggests is valid with pip install granian), this crashes with an ImportError before reaching the Granian branch.
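A minimal sketch of the guard this finding asks for: build the uvicorn arguments only when uvicorn is the selected server. `select_server_args` and both builder functions are hypothetical stand-ins, not the code in proxy_cli.py; the builder that raises `ImportError` simulates a Granian-only install.

```python
from typing import Any, Callable, Dict, Optional


def select_server_args(
    run_gunicorn: bool,
    run_hypercorn: bool,
    run_granian: bool,
    build_uvicorn_args: Callable[[], Dict[str, Any]],
) -> Optional[Dict[str, Any]]:
    """Return uvicorn args only when no alternative server was requested."""
    running_uvicorn = not (run_gunicorn or run_hypercorn or run_granian)
    if not running_uvicorn:
        # Skip the builder entirely so its uvicorn import never runs.
        return None
    return build_uvicorn_args()


def builder_without_uvicorn() -> Dict[str, Any]:
    # Simulates the helper failing on an install that lacks uvicorn.
    raise ImportError("No module named 'uvicorn'")


def builder_with_uvicorn() -> Dict[str, Any]:
    # Placeholder values for the example only.
    return {"host": "0.0.0.0", "port": 4000}


# Granian selected: the failing builder is never called, so nothing is raised.
granian_args = select_server_args(False, False, True, builder_without_uvicorn)
# Default server: the builder runs and its result is returned.
uvicorn_args = select_server_args(False, False, False, builder_with_uvicorn)
```

Passing the builder as a callable keeps the import lazy, which is the same effect the autofix achieves by moving the helper call inside the `running_uvicorn` branch.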


yassin-berriai and others added 9 commits May 21, 2026 15:49

build(deps-dev): bump black to 26.3.1 and apply formatting (#28525)

2a5dfcd

* build(deps-dev): bump black 24.10.0 -> 26.3.1
* style: apply black 26.3.1 formatting
* chore: authorize black 26.3.1 license in liccheck.ini

Sameerlite merged commit dbe13bb into litellm_fix_azure_priority_pricing_catalog May 22, 2026
121 of 179 checks passed

Sameerlite merged 9 commits into litellm_fix_azure_priority_pricing_catalog from litellm_internal_staging

6 participants
