chore(release): backport five staged fixes into stable/1.87.x and cut 1.87.1 by mateo-berri · Pull Request #29631 · BerriAI/litellm

mateo-berri · 2026-06-03T22:57:34Z

Relevant issues

Backports five fixes that already merged on litellm_internal_staging but never reached the 1.87.x line. Each one is cherry-picked from the squashed commit that landed for its PR. This PR closes that gap and cuts 1.87.1.

Linear ticket

N/A

What is included

Cherry-picked in merge order:

#28627 fix(azure): preserve AD token refresh in the v1 OpenAI client path
#29264 fix(proxy): map a stripped batch body.model back to the proxy alias so key access checks pass
#29545 fix(proxy): resolve managed video model ids through the router before auth, budget, and key checks
#29310 fix(key_generate): let team members create keys on org-scoped teams (regression since v1.84.0-rc.1)
#29585 fix(vertex): strip output_config.effort for Vertex Claude models that reject it, such as Haiku 4.5

The last two commits are the version bump (1.87.0 to 1.87.1) and the matching uv.lock refresh

#29598 (passthrough duplicate logs) is deliberately excluded from this line. The fix works by setting the logging object stream flag so the streaming dedup guard fires, but that guard (has_dispatched_final_stream_success and _is_assembled_stream_success in litellm_logging.py) does not exist on 1.87.x, which branched separately from 1.86.x and carries a different passthrough logging path. Cherry-picking it here would be a no-op at best, so it is held back rather than applied blindly. The other four lines that do carry the guard (1.84.x, 1.85.x, 1.86.x, and the 1.88 rc) receive it

Pre-Submission checklist

The cherry-picked PRs each carry their own tests
My PR passes all unit tests on make test-unit
Scope is limited to backporting already-merged fixes plus the release bump
Greptile review requested

CI (LiteLLM team)

Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:

Screenshots / Proof of Fix

These are backports of changes already merged, released, and validated on other lines; each linked PR carries its own proof and review

Type

Bug Fix
Infrastructure

Changes

See the commit list above. No new code beyond the cherry-picks, the version bump, and the lockfile refresh

* fix(azure): preserve AD token refresh in v1 OpenAI client path

The /openai/v1/ code path (api_version in {"v1", "latest", "preview"}) constructs a plain OpenAI/AsyncOpenAI client, but only forwarded `api_key` from `azure_client_params`. When `enable_azure_ad_token_refresh` is set (or any AD-only auth), `api_key` is None and the client constructor raised "The api_key client option must be set...", breaking every Azure call with a v1 api_version.

The OpenAI SDK (>=2.20.0) accepts a callable for `api_key` and re-invokes it on every request via `_refresh_api_key`, so we now forward `azure_ad_token_provider` directly. This preserves the per-request token refresh behavior of the regular AzureOpenAI client and avoids the expiry hole that resolving the token once at client-creation time would introduce. Static `azure_ad_token` strings fall through to `api_key`.

For the async path we wrap the sync provider returned by azure-identity in an async function, since AsyncOpenAI expects `Callable[[], Awaitable[str]]`.

Fixes #27945

https://claude.ai/code/session_01UnzrDSFUUgp5T2wRoPMxq5

* fix(azure): offload sync token provider to thread in v1 async wrapper

* fix(azure): include AD credential identity in v1 client cache key

---------

Co-authored-by: Claude <noreply@anthropic.com>

(cherry picked from commit 96a2e8b)
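The credential selection described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in litellm/llms/azure/common_utils.py: the helper name `pick_v1_api_key` and the precedence (explicit `api_key` first, then the provider, then the static token) are assumptions made for the example. It only builds the value that would be handed to the client as `api_key` and does not import the OpenAI SDK.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Union

ApiKey = Union[str, Callable[[], str], Callable[[], Awaitable[str]]]


def pick_v1_api_key(
    api_key: Optional[str],
    azure_ad_token: Optional[str],
    azure_ad_token_provider: Optional[Callable[[], str]],
    is_async: bool,
) -> Optional[ApiKey]:
    """Hypothetical helper: choose the value to pass as `api_key` on the v1 path."""
    if api_key is not None:
        # Assumed precedence: an explicit static key is used as-is.
        return api_key
    if azure_ad_token_provider is not None:
        if not is_async:
            # Sync client: hand over the callable so it can be re-invoked per request.
            return azure_ad_token_provider

        async def async_provider() -> str:
            # The azure-identity provider is synchronous; run it in a worker
            # thread so a slow token fetch does not block the event loop.
            return await asyncio.to_thread(azure_ad_token_provider)

        return async_provider
    # A static AD token string falls through to api_key (may be None).
    return azure_ad_token
```

Returning the callable instead of calling it once is the point of the fix: a token resolved at client-creation time would eventually expire inside a cached client.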

…9264)

* fix(proxy): map stripped batch body.model to proxy alias for auth

replace_model_in_jsonl rewrites JSONL body.model to the provider id before upload; batch file access checks must resolve that id back to model_name so keys granted the proxy alias are not rejected with 403.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(proxy): surface resolved proxy alias in batch file 403 detail

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: mateo-berri <277851410+mateo-berri@users.noreply.github.com>

(cherry picked from commit 70d2748)
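As a rough illustration of the mapping this commit describes, the sketch below resolves a provider id found in body.model back to a proxy alias before checking key access. The function names and the matching rule (exact match or match after dropping a provider prefix) are assumptions for the example; the deployment shape (`model_name` plus `litellm_params.model`) follows the usual router model list, and the real lookup goes through the router rather than a plain list.

```python
from typing import Any, Dict, List, Optional, Set


def resolve_proxy_alias(
    body_model: str, model_list: List[Dict[str, Any]]
) -> Optional[str]:
    """Hypothetical lookup: map a provider id back to the proxy model_name."""
    for deployment in model_list:
        provider_model = deployment.get("litellm_params", {}).get("model", "")
        if not provider_model:
            continue
        # Assumed rule: accept the full id or the id without its provider prefix.
        stripped = provider_model.split("/", 1)[-1]
        if body_model in (provider_model, stripped):
            return deployment.get("model_name")
    return None


def key_can_call_batch_model(
    body_model: str, allowed_models: Set[str], model_list: List[Dict[str, Any]]
) -> bool:
    """Allow the request if the key was granted the raw id or its proxy alias."""
    alias = resolve_proxy_alias(body_model, model_list)
    candidates = {body_model} if alias is None else {body_model, alias}
    return bool(candidates & allowed_models)
```

Without the alias lookup, a key granted only the proxy alias would be compared against the rewritten provider id and rejected with 403.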

* fix(proxy): resolve managed video model ids for auth

Co-authored-by: Cursor <cursoragent@cursor.com>

* test(proxy): cover character_id router model resolution

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>

(cherry picked from commit d45e9e4)

…ams (#29310)

* fix(key_generate): allow team members to create keys on org-scoped teams

When a virtual key is created for a team, enterprise logic inherits the team's organization_id onto the key (add_team_organization_id). Since the VERIA-55 org-IDOR fix, /key/generate then required the caller to be an explicit LiteLLM_OrganizationMembership member of that org, returning 403 "Caller is not a member of organization_id=<uuid>". Admins normally only add users to teams (not orgs), so self-serve key creation regressed for any user on an org-scoped team (regression since v1.84.0-rc.1).

Skip the org-membership check when organization_id was inherited from the key's team (organization_id == team_table.organization_id). Team-level authorization already gates this path, so team membership is sufficient. The membership check still runs when a caller assigns an organization_id that did not come from the key's team, preserving the IDOR protection.

Adds regression tests covering both the team-inherited (allowed) and foreign-org (still blocked) cases.

Co-authored-by: Cursor <cursoragent@cursor.com>

* test(key_generate): cover mismatched team org IDOR path on generate

Add test_generate_key_foreign_org_with_mismatched_team_still_enforces_membership for the case where a team is present but request organization_id differs from team_table.organization_id. Enterprise inheritance is no-op'd in the test so the guard is exercised directly; membership validation must still run.

Addresses Greptile review on #29310.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>

(cherry picked from commit b11833c)
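The guard described above can be sketched as follows. This is a simplified stand-in, not the code in key_management_endpoints.py: the `Team` dataclass, `NotOrgMember`, and `check_org_access` are hypothetical names, and the caller's memberships are modelled as a plain set of organization ids.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Team:
    """Hypothetical stand-in for the team_table row."""
    team_id: str
    organization_id: Optional[str] = None


class NotOrgMember(Exception):
    """Raised where the endpoint would return 403."""


def org_inherited_from_team(org_id: Optional[str], team: Optional[Team]) -> bool:
    # True only when the key's organization_id is exactly the team's own org.
    return team is not None and org_id is not None and team.organization_id == org_id


def check_org_access(
    org_id: Optional[str], team: Optional[Team], caller_org_ids: Set[str]
) -> None:
    if org_id is None:
        return  # nothing to validate
    if org_inherited_from_team(org_id, team):
        return  # team-level authorization already gates this path
    if org_id not in caller_org_ids:
        # A foreign organization_id still requires explicit membership (IDOR guard).
        raise NotOrgMember(f"Caller is not a member of organization_id={org_id}")
```

The bypass is keyed on equality with the team's organization_id rather than on the mere presence of a team, which is what keeps the mismatched-team case blocked.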

… reject it (Haiku 4.5) (#29585)

* fix(vertex): strip output_config.effort for models that reject it

Haiku 4.5 on Vertex AI does not support output_config.effort and 400s with "output_config.effort: Extra inputs are not permitted". PR #27074 emptied VERTEX_UNSUPPORTED_OUTPUT_CONFIG_KEYS so effort would forward for Opus/Sonnet 4.6+, but that made forwarding unconditional across every Vertex Anthropic model, including ones that don't support it. Claude Code injects effort into its default Messages payload, so `claude --model claude-haiku-4.5` started failing.

Make the sanitizer model-aware: drop output_config.effort for models that don't advertise output_config support (or any reasoning effort level) while forwarding it for those that do. The fix covers both the chat-completion and Messages pass-through transformation paths since they share the helper.

* chore(vertex): log at debug when dropping unsupported output_config.effort

Operators pointing at an unregistered Vertex Claude alias that does support effort would otherwise see it stripped with no signal. Debug level keeps it out of normal logs since Claude Code sends effort on every request.

(cherry picked from commit cc55662)
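A model-aware sanitizer along the lines described here might look like the sketch below. It is an illustration under assumptions: `sanitize_output_config` and the `supports_effort` callback are hypothetical (the real gating reads the model cost map), and removing an `output_config` that becomes empty is a choice made for this example.

```python
import copy
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)


def sanitize_output_config(
    request_body: Dict[str, Any],
    model: str,
    supports_effort: Callable[[str], bool],
) -> Dict[str, Any]:
    """Hypothetical helper: drop output_config.effort for models that reject it."""
    body = copy.deepcopy(request_body)  # leave the caller's payload untouched
    output_config = body.get("output_config")
    if (
        isinstance(output_config, dict)
        and "effort" in output_config
        and not supports_effort(model)
    ):
        # Debug level: clients may send effort on every request.
        logger.debug("Dropping output_config.effort for model %s", model)
        output_config.pop("effort")
        if not output_config:
            # Assumption: an empty output_config is removed entirely.
            body.pop("output_config")
    return body
```

Passing the capability check in as a callback keeps the sanitizer free of hardcoded model names, so both transformation paths can share it.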

codecov · 2026-06-03T23:00:11Z

Codecov Report

❌ Patch coverage is 28.26087% with 33 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
litellm/proxy/auth/auth_utils.py	10.00%	9 Missing ⚠️
...ai_partner_models/anthropic/output_params_utils.py	20.00%	8 Missing ⚠️
litellm/llms/azure/common_utils.py	56.25%	7 Missing ⚠️
litellm/proxy/hooks/batch_rate_limiter.py	0.00%	5 Missing ⚠️
...y/management_endpoints/key_management_endpoints.py	0.00%	2 Missing ⚠️
...rtex_ai_partner_models/anthropic/transformation.py	0.00%	1 Missing ⚠️
litellm/proxy/spend_tracking/budget_reservation.py	50.00%	1 Missing ⚠️


greptile-apps · 2026-06-03T23:25:09Z

Greptile Summary

This release backport cherry-picks five targeted bug fixes from litellm_internal_staging into the stable/1.87.x line and cuts version 1.87.1. All five fixes have already been validated upstream and each carries its own tests.

Azure (#28627): Preserves AD token refresh on the v1 OpenAI client path by passing azure_ad_token_provider (or azure_ad_token) as api_key; the async provider is wrapped in asyncio.to_thread to avoid blocking the event loop. Cache-key derivation is updated to produce a stable, serializable string for callables instead of relying on the raw function repr.
Proxy batch auth (#29264): Maps the stripped body.model field (a provider ID after replace_model_in_jsonl) back to the proxy model_name via the router before can_key_call_model runs, fixing false-deny errors for batch requests.
Managed video model auth (#29545): Threads llm_router through the managed-resource-ID extraction chain so video model IDs are resolved to proxy aliases before auth, budget, and key checks.
Key generation regression (#29310): Skips the org-membership validation when data.organization_id was inherited from the caller's team, unblocking team members from creating keys on org-scoped teams.
Vertex Haiku effort param (#29585): Drops output_config.effort for Vertex Claude models that advertise no effort-level support (e.g. Haiku 4.5) while forwarding it for those that do (Opus/Sonnet 4.6+); gating delegates to AnthropicConfig._model_supports_effort_param, which reads the model cost map rather than hardcoding model names.
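The cache-key point in the Azure item can be illustrated with a small sketch. Everything here is an assumption for illustration: `credential_cache_component` is a hypothetical name, and identifying a callable by module and qualified name is only one possible way to get a stable string (it would not distinguish two providers built from the same function, which the actual change may handle differently).

```python
import hashlib
from typing import Any


def credential_cache_component(value: Any) -> str:
    """Hypothetical helper: turn a credential into a stable, non-secret key part."""
    if value is None:
        return "none"
    if callable(value):
        # The default repr of a function embeds a memory address, which is not
        # stable across processes; use module and qualified name instead.
        module = getattr(value, "__module__", "")
        qualname = getattr(value, "__qualname__", type(value).__name__)
        return f"callable:{module}.{qualname}"
    # Static secrets are hashed so the raw token never appears in the key.
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return f"sha256:{digest[:16]}"
```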

Confidence Score: 4/5

All five cherry-picks are narrow, well-tested fixes with no structural changes; safe to merge as a patch release.

Each fix is a direct backport of an already-validated change. The auth and key-generation paths are touched, but the scope is limited — passing a router reference through existing call chains, and adding a single bypass condition tightly gated by team membership. Tests cover the happy path, the no-team case, and the mismatched-team-org case. The org-membership bypass in key_management_endpoints.py is logically sound and well-covered, sitting in a security-sensitive path but with no failing invariants found.

litellm/proxy/management_endpoints/key_management_endpoints.py — the new org-membership bypass; litellm/llms/azure/common_utils.py — the AD token provider cache-key construction and async wrapping.

Important Files Changed

Filename	Overview
litellm/llms/azure/common_utils.py	Preserves AD token refresh for v1 OpenAI client path; adds stable cache-key derivation for callables and hashes sensitive credential strings
litellm/llms/vertex_ai/vertex_ai_partner_models/anthropic/output_params_utils.py	Adds per-model gating for output_config.effort; delegates to AnthropicConfig._model_supports_effort_param which reads from model cost map, not hardcoded values
litellm/proxy/management_endpoints/key_management_endpoints.py	Fixes regression where team members on org-scoped teams could not generate keys; adds _org_inherited_from_team bypass for the org membership check
litellm/proxy/auth/auth_utils.py	Threads llm_router into managed resource ID resolution so video/character model IDs are resolved to proxy aliases before auth and budget checks
litellm/proxy/auth/user_api_key_auth.py	Passes llm_router to all _get_model_from_request_context call sites so model ID resolution is consistent throughout the auth builder
litellm/proxy/hooks/batch_rate_limiter.py	Maps stripped batch body.model back to proxy model_name via router before can_key_call_model check
litellm/proxy/spend_tracking/budget_reservation.py	Passes llm_router to get_model_from_request in both reserve_budget_for_request and estimate_request_max_cost
litellm/proxy/auth/auth_checks.py	Adds llm_router to common_checks call forwarded to _extract_model_candidates_from_request

_{Reviews (1): Last reviewed commit: "chore: update uv.lock for 1.87.1" | Re-trigger Greptile}

mateo-berri and others added 7 commits June 3, 2026 22:53

bump: version 1.87.0 → 1.87.1

5e0e841

chore: update uv.lock for 1.87.1

0cc6385

mateo-berri marked this pull request as ready for review June 3, 2026 23:20

mateo-berri requested review from a team and tin-berri June 3, 2026 23:20

mateo-berri enabled auto-merge June 3, 2026 23:25

tin-berri approved these changes Jun 3, 2026


mateo-berri merged commit b8a9e47 into stable/1.87.x Jun 3, 2026
69 of 75 checks passed

mateo-berri deleted the litellm_cherrypick_1_87_1 branch June 3, 2026 23:28

mateo-berri mentioned this pull request Jun 4, 2026

chore(release): backport #29612 (session-token budget-ceiling exemption) into stable/1.87.x and cut 1.87.2 #29636

Merged

3 tasks


