Fix/azure image edit auth header by justalittleadam · Pull Request #27863 · BerriAI/litellm

justalittleadam · 2026-05-13T19:29:23Z

Verified end-to-end through LiteLLM Proxy → Azure APIM → Azure OpenAI against gpt-image-1.5 for /v1/images/edits. Returns a real edited image (~300 KB JSON payload).

Type

🐛 Bug Fix

Changes

`litellm/llms/azure/image_edit/transformation.py`

AzureImageEditConfig.validate_environment now delegates to BaseAzureLLM._base_validate_azure_environment — the canonical helper already used by every other Azure provider in the codebase (videos, vector_stores, responses, containers, assistants, batches, files, fine_tuning, …). This:

Prefers the Azure-native api-key: <key> header when an API key is available (env vars, litellm.api_key, litellm.azure_key, litellm_params.api_key, or the api_key kwarg).
Falls back to Authorization: Bearer <azure_ad_token> only when AAD auth is configured (azure_ad_token / azure_ad_token_provider in litellm_params).
Removes the previous unconditional Authorization: Bearer <api_key> header that was incorrect for Azure-style deployments.
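
The resolution order listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not LiteLLM's implementation: `resolve_azure_auth_headers` and its parameters are hypothetical names, and the real `BaseAzureLLM._base_validate_azure_environment` also consults environment variables and module-level settings that are left out here.

```python
from typing import Callable, Dict, Optional


def resolve_azure_auth_headers(
    api_key: Optional[str] = None,
    azure_ad_token: Optional[str] = None,
    azure_ad_token_provider: Optional[Callable[[], str]] = None,
) -> Dict[str, str]:
    """Hypothetical sketch of the header choice described in the PR text."""
    headers: Dict[str, str] = {}
    if api_key:
        # Subscription-key auth: Azure expects the key in an `api-key` header.
        headers["api-key"] = api_key
        return headers
    # No API key: fall back to AAD / Entra ID bearer auth if it is configured.
    token = azure_ad_token
    if token is None and azure_ad_token_provider is not None:
        token = azure_ad_token_provider()
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

With both an API key and an AAD token supplied, this sketch emits only `api-key`, which mirrors the "prefer the key, fall back to Bearer" order stated above.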

Why delegate instead of just renaming the header

Naively swapping Authorization: Bearer → api-key: would fix subscription-key auth but silently break AAD/Entra ID auth, which legitimately needs Authorization: Bearer <token>. BaseAzureLLM._base_validate_azure_environment already encodes the correct branching and is the source of truth for Azure auth across the rest of the provider surface — image-edit was the only outlier hand-rolling its own header logic. This change brings it back in line and means future improvements to Azure auth (e.g., new token providers) automatically apply to image-edit too.

`tests/test_litellm/llms/azure/image_edit/test_azure_image_edit_transformation.py`

Three new regression tests:

test_validate_environment_uses_api_key_header_from_kwarg — passing api_key= results in an api-key header and no Authorization: Bearer leak.
test_validate_environment_uses_api_key_header_from_litellm_params — litellm_params={"api_key": ...} resolves to the api-key header.
test_validate_environment_falls_back_to_aad_bearer — when only azure_ad_token is supplied, the call falls back to Authorization: Bearer <token>, so AAD users are unaffected.
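
The shape of these three tests might look roughly like the sketch below. To keep it self-contained it runs against `fake_validate_environment`, a hypothetical stand-in written for this example; the real tests call `AzureImageEditConfig.validate_environment`, whose exact signature is not shown in this PR description.

```python
from typing import Dict, Optional


def fake_validate_environment(
    headers: Dict[str, str],
    api_key: Optional[str] = None,
    litellm_params: Optional[dict] = None,
) -> Dict[str, str]:
    """Stand-in for the method under test (assumed behaviour, not the real code)."""
    params = dict(litellm_params or {})
    key = params.get("api_key") or api_key
    out = dict(headers)
    if key:
        out["api-key"] = key
    elif params.get("azure_ad_token"):
        out["Authorization"] = f"Bearer {params['azure_ad_token']}"
    return out


def test_api_key_header_from_kwarg():
    headers = fake_validate_environment({}, api_key="sk-test")
    assert headers["api-key"] == "sk-test"
    # The key must not leak into an OpenAI-style bearer header.
    assert "Authorization" not in headers


def test_api_key_header_from_litellm_params():
    headers = fake_validate_environment({}, litellm_params={"api_key": "sk-params"})
    assert headers["api-key"] == "sk-params"


def test_falls_back_to_aad_bearer():
    headers = fake_validate_environment({}, litellm_params={"azure_ad_token": "aad-token"})
    assert headers == {"Authorization": "Bearer aad-token"}
```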

Note

Medium Risk
Changes request authentication for Azure image-edit calls, which can affect connectivity for different Azure deployments (API key vs AAD) if any edge-case header precedence differs from prior behavior.

Overview
Fixes Azure /images/edits authentication by replacing the image-edit-specific header logic with BaseAzureLLM._base_validate_azure_environment, so requests use api-key when an API key is available and only fall back to Authorization: Bearer <AAD token> when configured.

Adds regression tests covering subscription-key header usage, litellm_params["api_key"] precedence over the positional api_key, and the AAD bearer fallback path.

Reviewed by Cursor Bugbot for commit 7e32c65. Bugbot is set up for automated code reviews on this repo.

[Infra] Promote Internal Staging to main

[Infra] Promote internal staging to main

…arer

Delegate `AzureImageEditConfig.validate_environment` to `BaseAzureLLM._base_validate_azure_environment` so the image-edit route follows the same auth resolution as every other Azure provider:

- prefer the Azure-native `api-key` header when an API key is available
- fall back to `Authorization: Bearer <azure_ad_token>` only for AAD auth

The previous implementation unconditionally set `Authorization: Bearer <api_key>`, which is the OpenAI-direct convention and is rejected by Azure OpenAI / APIM-fronted deployments with `401 Access denied due to missing subscription key`.

Adds regression tests covering api_key kwarg, litellm_params.api_key, and the AAD-token fallback path.

Co-authored-by: Cursor <cursoragent@cursor.com>

CLAassistant · 2026-05-13T19:29:31Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ yuneng-berri
✅ ryan-crabbe-berri
❌ adamkirsteindisney

greptile-apps · 2026-05-13T19:32:49Z

Greptile Summary

This PR fixes a bug in AzureImageEditConfig.validate_environment that was unconditionally setting Authorization: Bearer <api_key> (OpenAI-style) instead of the api-key header expected by Azure OpenAI and API Management gateways. The fix delegates to the shared BaseAzureLLM._base_validate_azure_environment helper, aligning the image-edit route with every other Azure provider in the codebase.

litellm/llms/azure/image_edit/transformation.py: Replaced the hardcoded Authorization: Bearer logic with a call to BaseAzureLLM._base_validate_azure_environment, which prefers the api-key header for API-key auth and falls back to Authorization: Bearer <aad_token> for AAD auth. The api_key positional argument is merged into GenericLiteLLMParams before delegation, with litellm_params.api_key taking precedence over the positional api_key when both are supplied.
tests/…/test_azure_image_edit_transformation.py: Three new unit tests cover the subscription-key header path, litellm_params API-key precedence, and the AAD-token fallback; all use mocks and make no real network calls.

Confidence Score: 4/5

The fix correctly aligns the image-edit auth path with every other Azure provider and is safe to merge; the precedence inversion for callers that supply both the positional key argument and a key inside litellm_params is worth documenting.

The core change is a targeted, well-justified one-line delegation to a shared helper already proven across other Azure providers. Three new mock-only tests cover the primary auth paths. The only noteworthy subtlety is that the new merge logic silently gives litellm_params priority over the positional api_key argument when both are non-None, which is an inversion of the old behavior and could surprise callers who pass both — but no existing call site appears to do so.

No files require special attention; both changed files are small and self-contained.

Important Files Changed

Filename	Overview
litellm/llms/azure/image_edit/transformation.py	Replaces incorrect Bearer-token header with proper Azure `api-key` auth by delegating to the shared BaseAzureLLM helper; logic is consistent with other Azure providers.
tests/test_litellm/llms/azure/image_edit/test_azure_image_edit_transformation.py	Adds three well-isolated mock-only unit tests covering the fixed auth-header paths; no real network calls.

Reviews (1): Last reviewed commit: "fix(azure/image_edit): use api-key heade..."

codecov · 2026-05-13T19:32:50Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


oss-pr-review-agent-shin · 2026-05-13T19:38:08Z

🤖 litellm-agent: This PR is currently BLOCKED from merge.

Score: 4/5 ❌

Why blocked:

2 unresolved reviewer concerns (greptile, cla-assistant[bot]) (unresolved_concern, -1 pts)

Details: Score docked for: 2 unresolved reviewer concerns (greptile, cla-assistant[bot]).

Fix the issues above and push an update — the bot will re-review automatically.


…sion test

Address review feedback that the move to ``BaseAzureLLM._base_validate_azure_environment`` changed the relative priority of the positional ``api_key`` kwarg vs. ``litellm_params["api_key"]``.

The new behavior (``litellm_params["api_key"]`` wins, positional only fills in when ``litellm_params["api_key"]`` is empty) is intentional and matches every other Azure ``validate_environment``: ``AzureVideosConfig`` uses the exact same merge logic, while ``AzureVectorStoresConfig`` and ``AzureResponsesAPIConfig`` don't accept a positional ``api_key`` at all. The old ``or`` chain (positional wins) was the outlier and was part of the same OpenAI-vs-Azure convention drift that produced the original ``Authorization: Bearer`` bug.

The only production caller (``llm_http_handler.image_edit``) sources both values from the same ``litellm_params.api_key``, so this change is behaviorally a no-op there. Document the precedence in the docstring and lock it in with an explicit test so future refactors can't quietly re-invert it.

Co-authored-by: Cursor <cursoragent@cursor.com>
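
The precedence change described in this commit message can be illustrated with a small sketch. Both function names are hypothetical and plain dicts stand in for `GenericLiteLLMParams`; the point is only to contrast the old `or` chain with the merge that lets `litellm_params["api_key"]` win.

```python
from typing import Any, Dict, Optional


def old_precedence(litellm_params: Dict[str, Any], api_key: Optional[str]) -> Optional[str]:
    """Old behaviour as described above: the positional api_key wins."""
    return api_key or litellm_params.get("api_key")


def merge_api_key(litellm_params: Dict[str, Any], api_key: Optional[str]) -> Dict[str, Any]:
    """New behaviour as described above: the positional api_key only fills a gap."""
    merged = dict(litellm_params)  # copy so the caller's dict is not mutated
    if not merged.get("api_key") and api_key:
        merged["api_key"] = api_key
    return merged


params = {"api_key": "from-params"}
print(old_precedence(params, "positional"))            # positional
print(merge_api_key(params, "positional")["api_key"])  # from-params
```

When only one of the two values is set, both versions resolve to the same key, which is consistent with the commit's note that the change is a no-op for a caller that sources both from the same place.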

ishaan-berri · 2026-05-13T20:55:48Z

bugbot review

cursor

✅ Bugbot reviewed your changes and found no new issues!


Reviewed by Cursor Bugbot for commit 7e32c65.

…n Bearer

PR #27863 fixed Azure image edit to use the Azure-native api-key header instead of OpenAI's Authorization: Bearer convention, but did not update test_azure_image_edit_litellm_sdk to match. The test still asserted 'Authorization' in headers, which now fails since the new code routes through BaseAzureLLM._base_validate_azure_environment and emits api-key when an api_key is provided.

Update the assertion to pin the correct Azure behavior: api-key header present with the resolved key, and no Authorization header.

* fix: strip Gemini thought-signature from tool_use.id in non-streaming path; example websearch config (#27873)

- adapters/transformation.py: mirror the streaming path and strip the `__thought__<b64>` suffix off `tool_call.id` before building the AnthropicResponseContentBlockToolUse. Base64's `+ / =` characters violate Anthropic's `^[a-zA-Z0-9_-]+$` tool_use.id pattern, so when a conversation that flowed through Gemini is later replayed to an Anthropic-native provider (Bedrock or Anthropic API) the request 400s.
- example_config_yaml/websearch_interception_config.yaml: register the interceptor under `callbacks:` not `success_callback:`. `success_callback` does not run pre-request hooks, so the tool-conversion step never fires on `/v1/messages` and the raw `web_search_20250305` tool is forwarded to Bedrock, which 400s.
- adds a unit test pinning the non-streaming strip behavior and the surviving `^[a-zA-Z0-9_-]+$` shape of the resulting id.

Co-authored-by: oss-agent-shin <279349115+oss-agent-shin@users.noreply.github.com>

* Fix/azure image edit auth header (#27863)

* fix(azure/image_edit): use api-key header instead of Authorization Bearer

Delegate `AzureImageEditConfig.validate_environment` to `BaseAzureLLM._base_validate_azure_environment` so the image-edit route follows the same auth resolution as every other Azure provider:

- prefer the Azure-native `api-key` header when an API key is available
- fall back to `Authorization: Bearer <azure_ad_token>` only for AAD auth

The previous implementation unconditionally set `Authorization: Bearer <api_key>`, which is the OpenAI-direct convention and is rejected by Azure OpenAI / APIM-fronted deployments with `401 Access denied due to missing subscription key`.

Adds regression tests covering api_key kwarg, litellm_params.api_key, and the AAD-token fallback path.

Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(azure/image_edit): pin api-key precedence semantics + add regression test

Address review feedback that the move to ``BaseAzureLLM._base_validate_azure_environment`` changed the relative priority of the positional ``api_key`` kwarg vs. ``litellm_params["api_key"]``.

The new behavior (``litellm_params["api_key"]`` wins, positional only fills in when ``litellm_params["api_key"]`` is empty) is intentional and matches every other Azure ``validate_environment``: ``AzureVideosConfig`` uses the exact same merge logic, while ``AzureVectorStoresConfig`` and ``AzureResponsesAPIConfig`` don't accept a positional ``api_key`` at all. The old ``or`` chain (positional wins) was the outlier and was part of the same OpenAI-vs-Azure convention drift that produced the original ``Authorization: Bearer`` bug.

The only production caller (``llm_http_handler.image_edit``) sources both values from the same ``litellm_params.api_key``, so this change is behaviorally a no-op there. Document the precedence in the docstring and lock it in with an explicit test so future refactors can't quietly re-invert it.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: yuneng-jiang <yuneng@berri.ai>
Co-authored-by: ryan-crabbe-berri <ryan@berri.ai>
Co-authored-by: Adam Kirstein <adam.kirstein@disney.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

* test(azure/image_edit): expect api-key header instead of Authorization Bearer

PR #27863 fixed Azure image edit to use the Azure-native api-key header instead of OpenAI's Authorization: Bearer convention, but did not update test_azure_image_edit_litellm_sdk to match. The test still asserted 'Authorization' in headers, which now fails since the new code routes through BaseAzureLLM._base_validate_azure_environment and emits api-key when an api_key is provided.

Update the assertion to pin the correct Azure behavior: api-key header present with the resolved key, and no Authorization header.

---------

Co-authored-by: oss-agent-shin <ext-agent-shin@berri.ai>
Co-authored-by: oss-agent-shin <279349115+oss-agent-shin@users.noreply.github.com>
Co-authored-by: Adam Kirstein <107421694+justalittleadam@users.noreply.github.com>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>
Co-authored-by: ryan-crabbe-berri <ryan@berri.ai>
Co-authored-by: Adam Kirstein <adam.kirstein@disney.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaffer <ishaanjaffer0324@gmail.com>
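
The thought-signature stripping described in the first entry of the squashed message above (#27873) can be sketched like this. The `__thought__` marker and the `^[a-zA-Z0-9_-]+$` pattern come from the commit text; the function name and the sample id are made up for illustration, and the real adapter code may differ.

```python
import re

THOUGHT_MARKER = "__thought__"  # suffix marker named in the commit message
TOOL_USE_ID_PATTERN = re.compile(r"[a-zA-Z0-9_-]+")  # used with fullmatch below


def strip_thought_signature(tool_call_id: str) -> str:
    """Drop the `__thought__<b64>` suffix, if present, from a tool call id."""
    return tool_call_id.split(THOUGHT_MARKER, 1)[0]


raw_id = "call_abc123__thought__QUJD+/8="  # hypothetical id with a base64 suffix
clean_id = strip_thought_signature(raw_id)

# The base64 characters + / = make the raw id fail the pattern; the stripped id passes.
print(TOOL_USE_ID_PATTERN.fullmatch(raw_id) is not None)    # False
print(TOOL_USE_ID_PATTERN.fullmatch(clean_id) is not None)  # True
```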

* fix(proxy): always merge caller-supplied tags into request metadata

Caller-supplied tags (`x-litellm-tags` header, body `tags`, `metadata.tags`) were silently dropped unless the key/team had `metadata.allow_client_tags: true` set. Restore the documented behavior: tags from the request always flow into `metadata.tags` and union with any admin-configured static tags from key/team/project metadata.

Removes the `allow_client_tags` opt-in flag from the pre-call pipeline. The flag was only ever read here; it has no schema or endpoint footprint, so leftover values in existing key metadata are inert.

Test cleanup mirrors the simplification: drop the three tests that verified the strip-when-not-opted-in path, drop the `allow_client_tags` fixture lines from the merge/union tests.

* docs(proxy): refresh stale comments referencing removed tag strip

The tag-strip block was removed in the parent commit but two surrounding comments still referenced "tags without opt-in" and "runs AFTER the strip". Update them to describe the remaining user_api_key_* and _pipeline_managed_guardrails strip that the snapshot/merge ordering actually protects against.

* fix(tests): swap dall-e to gpt-image-1 after openai deprecation

DALL-E 2 and DALL-E 3 were removed from the OpenAI API on 2026-05-12, causing e2e image-generation tests to fail with "model does not exist". Swap all live-API DALL-E references in proxy-backed tests to gpt-image-1 and update the dall-e-2 alias in proxy_server_config.yaml to point at openai/gpt-image-1 (preserves any historical dall-e-2 callers).

* fix(tests): drop dall-e-only test classes; route live image tests via gpt-image-1

Second wave of failures from the 2026-05-12 DALL-E shutdown:

- tests/image_gen_tests/test_image_edits.py::TestOpenAIImageEditDallE2 and tests/image_gen_tests/test_image_generation.py::TestOpenAIDalle3 are explicitly named for the deprecated models and can't pass; remove. gpt-image-1 coverage already exists in sibling classes.
- tests/local_testing/test_router.py image gen tests use dall-e-3 only as a routing example; swap to gpt-image-1.
- tests/local_testing/test_custom_callback_input.py image_generation success/failure paths swapped to gpt-image-1.

* chore: reject bare str at file-input sinks to prevent local-file read (#27762)

* chore: reject bare str at file-input sinks to prevent local-file read (#27667)

Squash-merged by litellm-agent from stuxf's PR.

* fix: use os.PathLike in ocr sink and check truthy reasoningSummary for bridge

- ocr/main.py: widen Path check to os.PathLike for consistency with other sinks
- main.py: bridge condition checks truthiness of reasoning_summary, not just None

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix: remove unused pathlib.Path import in ocr/main.py

---------

Co-authored-by: yuneng-jiang <yuneng@berri.ai>
Co-authored-by: ryan-crabbe-berri <ryan@berri.ai>
Co-authored-by: stuxf <70670632+stuxf@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat(ui): add Vertex AI Search as vector store provider (#27790)

* feat(ui): add Vertex AI Search as vector store provider

Adds a "Vertex AI Search" entry to the provider dropdown (custom_llm_provider=vertex_ai/search_api) with fields for project, location (global/us/eu select), and optional collection ID. Extends VectorStoreFieldConfig with `options` so select fields can be data-driven instead of falling through to the embedding-model list.

* fix(ui): clarify vertex_collection_id placeholder copy

Placeholder previously displayed "default_collection" (the literal fallback value), which invited users to type it instead of leaving the field blank. Switch to an example placeholder and tighten the tooltip.
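
The tag-union behavior described in the `fix(proxy): always merge caller-supplied tags` entry above can be sketched as below. `merge_tags` is a hypothetical helper, and preserving first-seen order is an assumption made for readability; the commit message only says that request tags union with the admin-configured static tags.

```python
from typing import Iterable, List, Optional


def merge_tags(*sources: Optional[Iterable[str]]) -> List[str]:
    """Union several tag lists, dropping duplicates and keeping first-seen order."""
    merged: List[str] = []
    seen = set()
    for source in sources:
        for tag in source or []:  # a missing source (None) contributes nothing
            if tag not in seen:
                seen.add(tag)
                merged.append(tag)
    return merged


static_tags = ["team-a"]          # e.g. admin-configured key/team metadata
header_tags = ["prod", "team-a"]  # e.g. parsed from an x-litellm-tags header
body_tags = None                  # caller sent no body-level tags

print(merge_tags(static_tags, header_tags, body_tags))  # ['team-a', 'prod']
```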
* Litellm key rotation bug (#27756)

* fix(proxy): resolve cache handling issues in _lookup_deprecated_key

- Updated the in-memory cache for deprecated key lookups to store a 3-tuple (active_token_id, cache_expires_at_ts, revoke_at_ts) instead of a 2-tuple, ensuring proper unpacking and backward compatibility.
- Removed duplicate cache reads and added logic to handle legacy cache entries gracefully.
- Enhanced unit tests to cover scenarios for cache hits, DB misses, and respect for revoke_at timestamps, ensuring robust handling of the grace-period key-rotation feature.

* refactor(proxy): streamline cache handling in _lookup_deprecated_key

- Simplified the cache retrieval logic by directly unpacking the 3-tuple cache entries, removing the need for backward compatibility checks for 2-tuple entries.
- Updated unit tests to ensure that pre-warmed 3-tuple cache entries are served correctly without unnecessary database lookups.

* chore(ci): add new unit test for deprecated key grace period

- Included `test_deprecated_key_grace_period.py` in the CI workflow to enhance coverage for deprecated key handling scenarios.

* fix(proxy): remove unnecessary check for revoke_at in _lookup_deprecated_key

- Eliminated the redundant check for None on revoke_at, streamlining the logic for handling deprecated keys in the cache. This change enhances the efficiency of the key lookup process.

* test(proxy): add end-to-end tests for deprecated key lookup behavior

- Introduced a new test class `TestDeprecatedKeyLookupDbE2E` to validate the behavior of deprecated key lookups against a real Prisma-backed database.
- The test ensures that old key hashes resolve correctly and that repeated lookups utilize the in-memory cache without errors.
- Cleaned up the `_lookup_deprecated_key` function by removing an unnecessary check for `revoke_at`, enhancing the efficiency of the key lookup process.

* chore(proxy): close /key/regenerate ownership-rebind + premium-gate bypass

A non-admin caller could rebind their own key's ``user_id`` via ``/key/regenerate``. ``_execute_virtual_key_regeneration`` had org/team guards but no ``user_id`` guard, and ``prepare_key_update_data`` did not strip the field, so it survived ``model_dump(exclude_unset=True)`` into the Prisma update. On the next request, ``_return_user_api_key_auth_obj`` resolved the rebound ``user_id`` against ``litellm_usertable`` and returned ``PROXY_ADMIN`` whenever the target row's ``user_role`` was admin (e.g. the default ``user_id="default_user_id"`` created on first password-UI login).

``/key/update`` had the equivalent guard inline at ``_validate_update_key_data``; extract it to a shared helper ``_validate_caller_can_change_key_ownership`` and call from both ``/key/update`` and ``_execute_virtual_key_regeneration``. Future regenerate-style endpoints inherit the guard for free.

Also tighten the premium gate that allowed the master-key rotation branch to skip the enterprise check. The previous predicate was ``data.new_master_key is not None``, a field-presence test, not an identity check. Any non-premium caller could send any value in that field and the premium check would no-op. Verify the caller actually holds the master key via ``_is_master_key`` before allowing the non-premium path.

Tests:

- ``test_regenerate_user_id_rebind_guard``: parametrized table over cross-user rebind (blocked), empty-string removal (blocked), and same-user no-op rebind (allowed).
- ``test_regenerate_premium_gate_requires_actual_master_key`` / ``test_regenerate_premium_gate_allows_actual_master_key_holder``: ensure the premium check requires the caller actually present the master key, and that legitimate master-key rotation still works.
* test(vcr): classify cache verdicts, detect live calls, surface cost leaks

Convert the per-test VCR verdict line from a single 'NOOP / HIT / MISS / PARTIAL' tag into a classified outcome that distinguishes the cases that silently bill the live API on every CI run from the ones that don't:

    HIT                  pure replay
    PARTIAL              mixed replay + new recordings
    MISS:RECORDED        new cassette saved to Redis (cached next run)
    MISS:OVERFLOW        cassette > MAX_EPISODES_PER_CASSETTE; persister refused to save; re-bills every run
    MISS:NOT_PERSISTED   test failed; save_cassette skipped; re-bills
    NOOP                 VCR-marked but no HTTP traffic (mocked elsewhere)
    UNMARKED:LIVE_CALL   test bypassed VCR AND opened a TCP connection to a known LLM provider host -> wasted spend
    UNMARKED:NO_TRAFFIC  test bypassed VCR but didn't call out

The UNMARKED:LIVE_CALL signal is what converts 'this test probably hits live' into 'this test connected to api.openai.com'. We install a socket.connect / socket.create_connection wrapper for the duration of each non-VCR-marked test and record any outbound TCP to a known LLM provider hostname. The probe sits below the httpx layer so vcrpy and respx (which both patch above the socket) are unaffected.

Replace the file-level _RESPX_CONFLICTING_FILES blacklists in the llm_translation and local_testing conftests with per-item respx detection in apply_vcr_auto_marker_to_items. A test now skips VCR when it actually carries @pytest.mark.respx or has respx_mock in its fixture chain - not just because some other test in the same file imports MockRouter. Items skipped by skip_files are split into respx_conflict (real conflict, the module wires up respx) vs file_opt_out (dead skip-list entry whose module never touches respx) so the session summary makes pruning obvious.

Stabilize the AWS SigV4 fingerprint: the Authorization header on Bedrock requests rotates its Credential date and Signature on every call, which previously pushed every Bedrock test past the 50-episode overflow threshold. Extract the access-key id only ('aws-sigv4:AKIA...') so two requests with the same identity match.

Always emit verdict logging when VCR is active (set LITELLM_VCR_VERBOSE=0 to opt back into the legacy quiet mode). Add a session-end classification summary that lists overflow tests, unmarked live-call tests, and the skip-reason breakdown.

Wire the live-call probe + summary hook into every test directory that already uses the Redis-backed VCR cache (audio_tests, guardrails_tests, image_gen_tests, litellm_utils_tests, llm_responses_api_testing, llm_translation, local_testing, logging_callback_tests, ocr_tests, pass_through_unit_tests, router_unit_tests, search_tests, unified_google_tests).

Add tests/llm_translation/test_vcr_classification.py covering the verdict classifier, skip-reason tagging, AWS SigV4 fingerprint stability, live-host classification, and session summary rendering.

Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>

* test(vcr): drop dead 'from respx import MockRouter' imports

These seven test files were on _RESPX_CONFLICTING_FILES, which made the auto-marker skip them entirely. Inspecting the source shows the only respx artifact is a top-level 'from respx import MockRouter' that no test ever uses - no @pytest.mark.respx, no respx_mock fixture, no respx.mock context manager. The import is dead code left over from a previous mocking pattern.

Now that apply_vcr_auto_marker_to_items detects respx per-item via the marker / fixture chain (b637d9f64a), the file-level skip is no longer needed for these files - they were the reason the OpenAI tests (test_o3_reasoning_effort, test_streaming_response[o1/o3-mini], TestOpenAIO1::test_streaming, TestOpenAIChatCompletion::test_web_search, TestOpenAIO3::test_web_search, etc.) ran live every CI build despite the cassette cache being healthy.

Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>

* test(image_edits): regenerate fixtures per call instead of holding open module-level file handles

Module-level

    TEST_IMAGES = [
        open(os.path.join(pwd, 'ishaan_github.png'), 'rb'),
        open(os.path.join(pwd, 'litellm_site.png'), 'rb'),
    ]
    SINGLE_TEST_IMAGE = open(...)

opens the file once at import. After the first multipart upload, the file pointer is at EOF, so every subsequent test in the same xdist worker sends an empty multipart body. That non-determinism (a) blows the recorded cassette past MAX_EPISODES_PER_CASSETTE (50) so _RedisPersister.save_cassette refuses to save it, and (b) re-bills the live image edit endpoint on every CI run.

Recent CI runs confirm the leak: tests/image_gen_tests/test_image_edits.py shows six tests parking at 51-52 cassette entries (TestOpenAIImageEditGPTImage1::test_openai_image_edit_litellm_sdk[False], TestOpenAIImageEditDallE2::..., test_openai_image_edit_with_bytesio, test_openai_image_edit_litellm_router, test_multiple_vs_single_image_edit[False], test_multiple_image_edit_with_different_formats).

Replace the module-level file handles with _make_test_images() / _make_single_test_image() factories that return fresh _RewindableImage (BytesIO subclass) objects whose pointer always starts at 0. The image bytes are read once at import into module-level constants (_ISHAAN_GITHUB_BYTES, _LITELLM_SITE_BYTES), so disk I/O cost is unchanged.

Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>

* chore(proxy): clarify ownership-rebind error message (actor vs target)

Previous wording read "User=<new_owner> is not allowed to update the key to belong to user=<current_owner>", which is easy to misread as "caller wants to keep the key on its current owner". Reframe as "Non-admin caller is not allowed to rebind the key from user=<existing> to user=<incoming>" so the direction of the failed operation is unambiguous.

Same shape preserved (HTTPException 403); only the ``detail`` string changes. Regression test substring updated.

* fix(vcr): match real Bedrock hostnames in live-call probe

The suffix '.bedrock-runtime.amazonaws.com' never matched real Bedrock endpoints, which use the format 'bedrock-runtime[-fips].{region}.amazonaws.com' (region between 'bedrock-runtime' and 'amazonaws.com'). Add an explicit host check for that pattern so Bedrock live calls are visible to the probe, and update the unit test accordingly. Also drop the unused '_LIVE_CALL_PROBE_INSTALLED' module variable.

* test(proxy): drop allow_client_tags opt-in gate and add credential rename cascade tests

Removes the allow_client_tags metadata check from apply_client_tag_policy_pre_auth so x-litellm-tags headers are always merged into request metadata, matching the post-auth behavior in add_litellm_data_to_request. Updates pre-call tests accordingly and adds a new test suite covering cascading credential renames into model rows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(proxy): block explicit-null user_id in ownership rebind guard

``model_dump(exclude_unset=True)`` in ``prepare_key_update_data`` includes any field the caller explicitly set, even when the value is ``None``. The previous guard short-circuited on ``getattr(data, 'user_id', None) is None``, which conflated "field omitted" (safe) with "field explicitly set to null" (writes NULL to the token row, detaching the key from its user and bypassing user-row role checks).

Switch the omitted-vs-set distinction to ``data.model_fields_set``; treat explicit-null and explicit-empty-string identically as a removal attempt, both 403-rejected for non-admin callers. Parametrized regression adds ``explicit_null_blocked`` alongside the existing ``rebind_blocked`` / ``empty_blocked`` / ``same_user_id_allowed`` cases.
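
The hostname mismatch described in the `fix(vcr): match real Bedrock hostnames in live-call probe` entry above can be illustrated with a short sketch. The regular expression and function name are assumptions written for this example, based only on the 'bedrock-runtime[-fips].{region}.amazonaws.com' format quoted in the commit message; the probe's actual check may be implemented differently.

```python
import re

# Region sits between "bedrock-runtime" (optionally "-fips") and "amazonaws.com".
BEDROCK_RUNTIME_HOST = re.compile(r"bedrock-runtime(-fips)?\.[a-z0-9-]+\.amazonaws\.com")


def is_bedrock_runtime_host(host: str) -> bool:
    """Return True for hosts shaped like real Bedrock runtime endpoints."""
    return BEDROCK_RUNTIME_HOST.fullmatch(host.lower()) is not None


# The old suffix test never matches a real endpoint, because the region
# comes after "bedrock-runtime", not before it.
real_host = "bedrock-runtime.us-east-1.amazonaws.com"
print(real_host.endswith(".bedrock-runtime.amazonaws.com"))  # False
print(is_bedrock_runtime_host(real_host))                    # True
```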
* fix(vcr): cover full RFC1918 172.16.0.0/12 range in local prefixes

* fix(image_edits): drop _RewindableImage to prevent infinite multipart upload

The _RewindableImage(BytesIO) wrapper auto-rewound on every read after EOF, which made the OpenAI SDK's multipart upload writer read the same bytes forever instead of seeing EOF. Workers OOM'd / SIGKILL'd:

    [gw0] node down: Not properly terminated
    replacing crashed worker gw0
    ...
    worker 'gw1' crashed while running 'tests/image_gen_tests/test_image_edits.py::TestOpenAIImageEditGPTImage1::test_openai_image_edit_litellm_sdk[False]'

The auto-rewind was added defensively for parametrized + flaky-retried tests, but BaseLLMImageEditTest::test_openai_image_edit_litellm_sdk already calls get_base_image_edit_call_args() once per invocation and that helper now constructs fresh streams via _make_test_images(), so rewinding inside the stream is unnecessary. Replace with plain BytesIO seeded with the cached image bytes.

Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>

* chore(proxy): refuse remote-URL instance-fn loads outside config-file path

``get_instance_fn`` previously routed any ``s3://`` / ``gcs://`` value into ``_load_instance_from_remote_storage`` regardless of how the value got there. The function ultimately calls ``spec.loader.exec_module(module)`` (Python in the proxy process). On admin-callable endpoints that accept a ``target`` / ``custom_handler`` field from the request body (e.g. ``/config/pass_through_endpoint``, custom-callback registration), that is a one-step admin-to-RCE primitive: any future privilege-escalation bug becomes immediate code execution.

The documented operator flow for remote-module loading is ``litellm_settings.callbacks: ["s3://bucket/module.instance"]`` in ``config.yaml``. That path always carries the YAML's ``config_file_path`` through to ``get_instance_fn``. Use the presence of ``config_file_path`` as the discriminator: refuse remote URLs when it is absent (the request-body path) unless the operator explicitly opts back in via ``LITELLM_ALLOW_REMOTE_INSTANCE_FN_FROM_API=true``.

The three success/failure/audit-log callback-loop call sites in ``proxy_server.py:load_config`` were already running inside the startup config-file load but had stopped threading ``config_file_path`` through. Pass it through so the documented ``s3://`` callback flow continues to work unchanged.

Tests cover: remote URL without ``config_file_path`` raises; remote URL with the opt-in env reaches the loader; remote URL with ``config_file_path`` passes (documented startup flow); local dotted-name imports unaffected.

* fix(proxy): parse string metadata before pre-auth tag merge

`apply_client_tag_policy_pre_auth` overwrote string-typed metadata with `{}` before merging header tags, dropping any tags inside. A caller could send `metadata='{"tags":["over-budget"]}'` plus `x-litellm-tags: within-budget` and bypass `_tag_max_budget_check` on the body tag. Parse the string via `safe_json_loads` first so existing tags survive the merge.

Also drop the empty `tests/test_litellm/proxy/credential_endpoints/` directory: the cascade-rename tests it held imported a function that was never implemented (out of scope for this PR).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(tests): thread config_file_path through s3/gcs custom-logger tests

The pre-existing s3:// / gcs:// custom-logger tests called ``get_instance_fn`` without ``config_file_path``, which means the new runtime gate (refuse remote URLs unless invoked from a config-file load) now raises ``ValueError`` before reaching the mocked download paths. Each test was exercising the documented startup config-file load scenario; pass ``config_file_path="/any/path"`` to make that intent explicit and route past the gate.

Affected: test_s3_download_success, test_gcs_download_success, test_invalid_url_format, test_download_failure_handling, test_file_cleanup.

* test(vcr): mark Bedrock prompt-caching cross-call tests VCR-incompatible

The pass_through prompt-caching tests (test_prompt_caching_returns_cache_read_tokens_on_second_call, test_prompt_caching_streaming_second_call_returns_cache_read) make a warm-up call and then assert the *second* call sees a non-zero cache_read_input_tokens count from the upstream's prompt-cache. VCR replay can't model cross-call provider state: both calls match the same cassette episode, so the second call returns the first call's pre-warmup response and the assertion fails:

    AssertionError: Expected cache_read_input_tokens > 0 on second call, but got 0. Full usage: {'input_tokens': 4986, 'cache_creation_input_tokens': 4974, 'cache_read_input_tokens': 0}

This started biting after the AWS SigV4 fingerprint stabilization (b637d9f64a): Bedrock requests now produce a stable per-access-key fingerprint instead of a per-request signature, so cassettes successfully replay where they previously always missed and re-recorded live.

Opt these tests out via skip_nodeid_suffixes so they run live and match the existing pattern in tests/llm_translation/conftest.py (::test_prompt_caching).

Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>

* Fix 3 OpenTelemetry tracing bugs in proxy integration (#27757)

1. Missing litellm_request child span when proxy parent in metadata: _get_span_context now returns (ctx, None) for the metadata-injected proxy parent so the primary span is always emitted as a child of ctx. Proxy span lifecycle managed by new _end_proxy_span_from_kwargs.

2. open_telemetry_logger overwrite by later handlers: _init_otel_logger_on_litellm_proxy now uses first-registered-wins, only assigning proxy_server.open_telemetry_logger when currently None.

3. 
Duplicate litellm_request success spans in streaming paths: Added _mark_success_span_once with per-handler dedupe key stored in kwargs metadata, suppressing the second span when both sync and async success callbacks fire for the same request. Co-authored-by: Yassin Kortam <yassinkortam@g.ucla.edu> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: update Next.js build artifacts (2026-05-13 01:42 UTC, node v20.20.2) * test(vcr): tighten OVERFLOW classification and switch respx detection to AST Address two greptile P2 review concerns on PR #27795: 1. MISS:OVERFLOW was firing whenever total > MAX_EPISODES_PER_CASSETTE regardless of cassette state. A cassette that grew past the cap historically but this run only *replayed* (dirty=False) is healthy — the persister never tries to save, so the cache state is stable and the next run will replay too. Only flag OVERFLOW when dirty=True (new episodes were recorded that the persister would refuse to save). Add a regression test covering the dirty=False + large-total case. 2. _module_uses_respx did substring matching on the module source, which false-positives on comments / docstrings / string literals. A comment like # Previously tried respx.mock but switched to vcrpy would keep a file pinned on the opt-out list, defeating the dead-import pruning goal of this PR. Replace the substring scan with an ast.NodeVisitor (_RespxUsageVisitor) that only counts: - @pytest.mark.respx / @respx.mock decorators - with respx.mock(): ... (sync + async) context managers - respx.mock(...) calls outside a with/decorator - function parameters / fixture names equal to respx_mock Add tests for the comment / docstring / string-literal cases plus each real-usage pattern. 
Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com>
* fix(types_utils): drop opt-in env from remote-module runtime gate
The runtime gate on s3:// / gcs:// loading in get_instance_fn previously allowed an opt-in via LITELLM_ALLOW_REMOTE_INSTANCE_FN_FROM_API. That env var is admin-flippable at runtime (DB-overlay environment_variables flow into os.environ), which defeats the gate's purpose, and it isn't needed for the documented operator flow: config.yaml callbacks always pass config_file_path through to the loader. Remove the helper, raise unconditionally when config_file_path is None, and drop the corresponding test for the opt-in branch.
* fix(proxy): thread config_file_path into pass-through and MCP-tool YAML loaders
The previous commit's gate broke two legitimate startup paths for operators using s3:// / gcs:// remote module loading from their config.yaml:
- general_settings.pass_through_endpoints[].custom_handler
- mcp_tools[].handler
Both call sites called get_instance_fn without a config_file_path, so the new gate rejected them at startup. Thread config_file_path through:
- create_pass_through_route accepts config_file_path and forwards it to get_instance_fn. add_exact_path_route, add_subpath_route, _register_pass_through_endpoint, and initialize_pass_through_endpoints accept and propagate it.
- The YAML-load call site in proxy_server.load_config now passes config_file_path; the DB-overlay call site in _update_general_settings leaves it as the default None so the gate still fires on admin-written s3:// values.
- MCPToolRegistry.load_tools_from_config accepts config_file_path and threads it into get_instance_fn; _init_non_llm_configs forwards it from load_config.
Adds two regression tests verifying that the YAML-source callers thread the path through to get_instance_fn.
* Strip SERVER_ROOT_PATH before lazy-feature prefix match
LazyFeatureMiddleware compared the raw scope path against registered prefixes (e.g. 
/policies), so requests under a server root path like /api/v1/policies/... never matched, the feature never loaded, and the endpoint returned 404. Strip the configured root path before matching, normalizing trailing slashes and enforcing a component boundary so /api does not falsely match /apiv2. * Cache normalized SERVER_ROOT_PATH at middleware init SERVER_ROOT_PATH is a process-startup env var. Read it once in __init__ instead of calling get_server_root_path() + rstrip on every request that arrives before all lazy features have loaded. * test: replace dall-e-3 with gpt-image-1 in health check and router tests (#27813) OpenAI returns 'The model dall-e-3 does not exist' for the test account, breaking test_openai_img_gen_health_check and test_image_generation. Switch to gpt-image-1, matching the existing TestOpenAIGPTImage1 pattern. * fix(gemini): normalize response_schema on native generateContent (#27775) * fix(gemini): normalize response_schema on native generateContent The /v1beta/models/{model}:generateContent passthrough forwarded generationConfig.response_schema verbatim, so schemas containing $defs, $ref, anyOf-with-null, default, or title were rejected by Gemini even though /chat/completions already handles them. GoogleGenAIConfig.transform_generate_content_request now calls a new _normalize_response_schema helper that mirrors the chat/completions path: Gemini 2.0+ models get the schema promoted to responseJsonSchema via _build_json_schema (preserving $defs/$ref natively), older models keep responseSchema but the schema is flattened with _build_vertex_schema. VertexAIGoogleGenAIConfig (which overrides the transform entirely) calls the same helper before building the request. 
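For the SERVER_ROOT_PATH change to LazyFeatureMiddleware described earlier in this log, the stripping rule (normalize trailing slashes, enforce a component boundary) might look roughly like this. `strip_root_path` is a hypothetical helper written for illustration, not the middleware's actual method.

```python
def strip_root_path(path: str, root_path: str) -> str:
    """Remove a configured root path from a request path.

    The root only matches on a path-component boundary, so a root of
    "/api" strips "/api/policies" but leaves "/apiv2/policies" alone.
    """
    root = root_path.rstrip("/")  # treat "/api/" and "/api" the same
    if not root:
        return path  # no root path configured
    if path == root:
        return "/"
    if path.startswith(root + "/"):
        return path[len(root):]
    return path
```

With a root of `/api/v1`, a request for `/api/v1/policies/...` is reduced to `/policies/...` before the prefix comparison, which is what lets the registered `/policies` prefix match.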
* fix(gemini): preserve caller-supplied responseJsonSchema when responseSchema co-present Previously, when both responseJsonSchema and responseSchema were present on Gemini 2.0+, _normalize_response_schema processed responseJsonSchema first (no-op normalization) then unconditionally promoted responseSchema to responseJsonSchema, clobbering the caller-supplied value. Now skip the promotion (and drop the redundant responseSchema) when the caller already supplied responseJsonSchema. Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com> * chore: strip restating comments from response-schema normalize Drop the docstring on _normalize_response_schema and the two inline comments that just restated what the surrounding code/asserts already say. Function name + variable names carry the intent; PR description covers the why-it-exists context. * perf(gemini): drop redundant deepcopy on responseJsonSchema normalize _build_json_schema is a no-op (returns its argument unchanged), so the deepcopy + round-trip on the responseJsonSchema branch allocated a full schema copy on every request with no observable effect. Forward the caller's value as-is, and just move the popped responseSchema value when promoting on Gemini 2.0+. Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com> * style: remove unneeded comment * fix(gemini): drop unsupported responseJsonSchema for older models * test(gemini): add parity test between native and chat schema normalization Per @Sameerlite review: lock the two Gemini schema-normalization paths together. If either GoogleGenAIConfig._normalize_response_schema (native generateContent) or VertexGeminiConfig.apply_response_schema_transformation (/chat/completions) drifts, the parity test fails — forcing both to be updated together. 
* fix(google_genai): preserve key naming convention in _normalize_response_schema When the input schema key is snake_case (response_schema), the promoted JSON schema key should also be snake_case (response_json_schema) instead of mixing in camelCase (responseJsonSchema). This matters for the Vertex AI google_genai path which converts all keys to snake_case before calling _normalize_response_schema. --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com> * fix(vcr): aggregate worker stats on the controller so the session summary actually renders under xdist `_session_stats` is a module-level dict mutated inside `_vcr_outcome_gate` — which runs in each xdist worker process. The controller's `pytest_terminal_summary` then reads its own empty `_session_stats` and bails on `if not counts: return`, so the OVERFLOW / LIVE_CALL sections the rest of this PR adds never make it into CI logs in the dist mode CI actually uses. Ship a structured `vcr_outcome` payload via `user_properties` (which xdist round-trips) and add `aggregate_report_outcome` on the controller to fold worker outcomes into `_session_stats`. The recording process tags `vcr_recorded_by` with `PYTEST_XDIST_WORKER` so the controller can tell "single-process — already counted locally" apart from "produced by a worker — needs aggregation here", and not double-count when there's no xdist. Covered by 9 new unit tests in test_vcr_classification.py including the end-to-end summary render path. 
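The controller-side folding step from the xdist aggregation commit above can be sketched as follows. Only the names `vcr_outcome`, `vcr_recorded_by`, and `aggregate_report_outcome` come from the commit message; the payload shape (a dict with a `classification` key) and the function signature are assumptions made for this illustration.

```python
from collections import Counter
from typing import Any, Iterable, Tuple


def aggregate_report_outcome(
    user_properties: Iterable[Tuple[str, Any]],
    session_stats: Counter,
) -> None:
    """Fold one test report's VCR outcome into the controller's stats."""
    props = dict(user_properties)
    outcome = props.get("vcr_outcome")
    if outcome is None:
        return  # test did not go through the VCR gate
    if not props.get("vcr_recorded_by"):
        # No worker id: the report came from a single-process run and
        # was already counted locally, so counting again would double it.
        return
    # Assumed payload shape: {"classification": "<label>"}
    session_stats[outcome["classification"]] += 1
```

In a real conftest this would be called from a report hook with `report.user_properties`, which is the channel the commit relies on because xdist round-trips it from workers to the controller.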
* fix(responses): register cooldowns on failure + fail fast on stale encrypted_content (#27820) * feat(proxy): skip disable_background_health_check models on GET /health when flag set (#27716) * feat(proxy): skip disable_background_health_check models on GET /health when flag set Co-authored-by: Cursor <cursoragent@cursor.com> * fix comment * fix greptile comments * Fix health check fallback kwargs * Format health endpoint * Harden direct health check kwargs compatibility for monkeypatched perform_health_check Replace substring-based TypeError detection with unexpected-keyword checks and a short retry chain (full kwargs, instrumentation only, filter only, minimal) so partial stubs work regardless of which optional kwarg fails first. Add proxy unit tests for legacy three-arg stubs and single-kwarg variants. Co-authored-by: Sameer Kankute <Sameerlite@users.noreply.github.com> * fix black --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Sameer Kankute <Sameerlite@users.noreply.github.com> * fix(bedrock-converse): drop blank-text fallback for empty thinking blocks (#27850) * fix(bedrock-converse): drop blank-text fallback for empty thinking blocks Claude Code with extended thinking replays prior assistant turns that include an empty thinking block (`thinking=""`, `signature=""`) alongside tool_use blocks. The unsigned-reasoning fallback in `add_thinking_blocks_to_assistant_content` was emitting `BedrockContentBlock(text="")`, which Bedrock Converse rejects with: "The text field in the ContentBlock object at messages.X.content.0 is blank." Guard the fallback with a strip() check, matching the existing empty-text guards elsewhere in `_bedrock_converse_messages_pt`. 
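The strip() guard on the unsigned-reasoning fallback amounts to something like the sketch below. A plain dict stands in for `BedrockContentBlock`, and `fallback_text_block` is a hypothetical helper name; the real logic lives inside `add_thinking_blocks_to_assistant_content`.

```python
from typing import Dict, Optional


def fallback_text_block(thinking_text: Optional[str]) -> Optional[Dict[str, str]]:
    """Build the text fallback for an unsigned thinking block.

    Returns None for empty or whitespace-only text so the caller can
    skip the block instead of sending a blank text field to Bedrock.
    """
    if not thinking_text or not thinking_text.strip():
        return None
    return {"text": thinking_text}
```

A replayed assistant turn with `thinking=""` then contributes no content block at all, and the accompanying tool_use blocks go through unchanged.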
* style: remove unneeded comments * fix(proxy): thread config_file_path through LiteLLM_JWTAuth.custom_validate LiteLLM_JWTAuth.__init__ calls get_instance_fn(custom_validate) without config_file_path, so an operator who configures custom_validate: s3://bucket/module.fn in their YAML JWT auth section would hit the runtime gate on startup and break their deployment. Accept config_file_path as a non-field kwarg (popped before the invalid-keys check), thread it into get_instance_fn, and pass it from the startup-load callsite via the existing user_config_file_path module-level path. Admin-API JWT config writes leave the kwarg at None and still hit the gate. * fix(mcp): surface upstream 401 for token-forwarding MCP servers (#27847) * fix(mcp): surface upstream 401 for token-forwarding MCP servers For MCP servers configured with extra_headers: [Authorization], the gateway forwards the client token directly to the upstream. When that token is rejected (expired or invalid) the upstream returns 401, but the MCP SDK starts the SSE stream with 200 OK before calling handlers, so the 401 can't be returned mid-stream. Fix: add a pre-flight httpx probe in handle_streamable_http_mcp — before the SDK opens the session — so the gateway can still return HTTP 401 with WWW-Authenticate: Bearer authorization_uri=<gateway-discovery-url> when the upstream rejects the token. The probe fails-open (returns 200) on network errors so a transient hiccup does not block valid requests. 
Co-authored-by: Cursor <cursoragent@cursor.com> * fix(mcp): parallelize pre-flight auth probes and use HEAD to avoid side effects - Extract forwarded_auth outside the pass-through server loop (was called N times for the same scope value) - Gather all upstream auth probes concurrently with asyncio.gather instead of sequentially; eliminates N×5 s worst-case latency - Switch probe from POST+initialize JSON-RPC body to HEAD request; HEAD carries the Authorization header so the upstream rejects invalid tokens with 401 but never allocates a session or writes an audit entry Co-authored-by: Cursor <cursoragent@cursor.com> * fix(mcp): use get_async_httpx_client in _probe_upstream_auth Replaces bare httpx.AsyncClient with the project-standard get_async_httpx_client(httpxSpecialProvider.MCP) to satisfy the ensure_async_clients_test code coverage check and avoid the +500 ms per-request overhead of creating a new client on every probe call. Co-authored-by: Cursor <cursoragent@cursor.com> * refactor(mcp): extract pre-flight probe into _check_passthrough_upstream_auth Moves the parallel upstream auth probe logic out of handle_streamable_http_mcp into a dedicated helper to satisfy Ruff PLR0915 (Too many statements > 50). Co-authored-by: Cursor <cursoragent@cursor.com> * fix(mcp): gate pre-flight probes on authorized server set to prevent bypass _check_passthrough_upstream_auth was resolving user-supplied server names directly before authorization ran, letting any permitted LiteLLM key trigger an upstream HEAD probe to a server it was not allowed to use. Changes: - Call _get_allowed_mcp_servers inside the helper so only servers the caller's key is authorized for are probed. - Move the call site to after toolset scoping so the auth context is fully resolved before the probe list is built. - Thread user_api_key_auth into the helper signature (replaces the raw mcp_servers name list). 
Co-authored-by: Cursor <cursoragent@cursor.com> * Add async HTTP HEAD support Co-authored-by: Yassin Kortam <yassin@berri.ai> * fix(mcp): use Scope type annotation in _get_forwarded_auth_from_scope Co-authored-by: Cursor <cursoragent@cursor.com> * Fix MCP upstream auth probe method Co-authored-by: Yassin Kortam <yassin@berri.ai> * Remove unused AsyncHTTPHandler head method Co-authored-by: Yassin Kortam <yassin@berri.ai> * fix(mcp): exclude has_client_credentials servers from pre-flight auth probe _prepare_mcp_server_headers skips caller Authorization when the server uses OAuth client-credentials (M2M), but the pre-flight probe was still selecting those servers and forwarding the caller's raw token in the HEAD request. Exclude servers with has_client_credentials from the probe list to match the actual downstream header-preparation logic. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(mcp): propagate upstream 403 as 403, not 401 with WWW-Authenticate Per RFC 9110, 401 means "go get new credentials." Mapping an upstream 403 to a gateway 401 causes OAuth clients to restart the authorization flow, obtain a fresh token with identical scopes, hit 403 again, and loop indefinitely. 401 from upstream → gateway 401 + WWW-Authenticate (re-authorize) 403 from upstream → gateway 403 (no WWW-Authenticate hint) Co-authored-by: Cursor <cursoragent@cursor.com> * fix(mcp): skip auth probe when Authorization may be the LiteLLM proxy key The pre-flight upstream probe must not forward the caller's Authorization header when it could itself be the LiteLLM proxy API key. Restrict the probe to requests that supply x-litellm-api-key explicitly — only then is the Authorization header unambiguously the upstream OAuth token the caller wants forwarded. 
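The 401 vs 403 propagation rule from the commits above reduces to a small mapping. This is a sketch under assumptions: `gateway_response_for_upstream` is a hypothetical helper, and the header value follows the `Bearer authorization_uri=<gateway-discovery-url>` shape quoted in the earlier commit message.

```python
from typing import Dict, Optional, Tuple


def gateway_response_for_upstream(
    upstream_status: int, discovery_url: str
) -> Optional[Tuple[int, Dict[str, str]]]:
    """Map an upstream probe status to the gateway's (status, headers).

    Returns None when the probe result should not block the request.
    """
    if upstream_status == 401:
        # Token rejected: ask the client to re-authorize.
        return 401, {
            "WWW-Authenticate": f"Bearer authorization_uri={discovery_url}"
        }
    if upstream_status == 403:
        # Token valid but not permitted: no re-auth hint, otherwise an
        # OAuth client would fetch an identical token and loop.
        return 403, {}
    return None
```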
* Fix MCP ASGI HTTPException propagation Co-authored-by: Yassin Kortam <yassin@berri.ai> * fix(mcp): use public AsyncHTTPHandler.post() in auth probe Use AsyncHTTPHandler.post() and catch httpx.HTTPStatusError explicitly so the 401/403 we want to surface is not silently swallowed by the broad fail-open except Exception block. Avoids reaching into the handler's private client attribute, which would silently regress to fail-open if AsyncHTTPHandler is ever refactored. * Fix MCP auth probe tests Co-authored-by: Yassin Kortam <yassin@berri.ai> * test(mcp): add coverage for httpx.HTTPStatusError path in auth probe AsyncHTTPHandler.post() calls raise_for_status() internally, so a real upstream 401/403 lands as httpx.HTTPStatusError. Add a test that exercises that specific exception path so a regression that swallows the error in the broad fail-open except Exception would be caught. --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Yassin Kortam <yassin@berri.ai> Co-authored-by: claude-bot <claude-bot@anthropic.com> * fix(cost): align vertex_ai/gemini-embedding-2-preview with Vertex multimodal pricing (#27848) * fix(cost): align vertex_ai/gemini-embedding-2-preview with Vertex multimodal pricing Co-authored-by: Cursor <cursoragent@cursor.com> * fix(cost): align vertex_ai/gemini-embedding-2 GA source URL with preview Per Greptile review on #27848: GA entry referenced ai.google.dev while the preview entry was updated to the canonical Vertex AI pricing page. Both share identical pricing values; sync the source URL for consistency. https://claude.ai/code/session_01W8jRwstnmduadGw8Z8egxe --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Claude <noreply@anthropic.com> * feat(mcp): add delegate_auth_to_upstream flag for PKCE passthrough (#27834) * feat(mcp): add delegate_auth_to_upstream flag for PKCE passthrough Adds an opt-in per-server flag that lets clients (e.g. 
VS Code) complete PKCE directly with an upstream OAuth2 MCP server, instead of LiteLLM double-gating with its own API-key/SSO check. Only honored when auth_type=oauth2 and the operator explicitly sets the flag; mixed-target or non-oauth2 requests fail closed. - Adds the field to Pydantic models, Prisma schema, and a migration - New MCPRequestHandler._target_servers_delegate_auth_to_upstream gate that runs only when no x-litellm-api-key is present, so authenticated users still get user_id resolution + stored-credential lookup - Anonymous callers now see delegate servers in get_allowed_mcp_servers (scoped to delegate servers only; the upstream still enforces auth) - mcp_management_endpoints: allow anonymous /authorize and /token for delegate servers so VS Code can complete PKCE without a LiteLLM session - UI toggle (shown only for oauth2) + payload/view wiring - Tests covering: oauth2 on/off, non-oauth2 with flag, mixed targets, no resolvable target, explicit key precedence, and 401 emission Co-authored-by: Cursor <cursoragent@cursor.com> * Enforce oauth2 for delegated MCP auth bypass Co-authored-by: Yassin Kortam <yassin@berri.ai> * fix(mcp): close secondary Authorization bypass for delegate servers The delegate-auth bypass gated only on the primary `x-litellm-api-key` header, so a LiteLLM key sent via `Authorization: Bearer sk-...` (the secondary header) was silently dropped — skipping spend tracking and rate limiting. Gate on the resolved litellm_api_key (which considers both headers) so the bypass fires only when neither is present. Also update the existing "Authorization header present" test to reflect that an upstream OAuth token now flows through the existing oauth2 fallback (LiteLLM auth attempt → fail → anonymous), not via the delegate branch. 
Co-authored-by: Cursor <cursoragent@cursor.com> * Avoid duplicate MCP OAuth credential lookup Co-authored-by: Yassin Kortam <yassin@berri.ai> * fix(mcp): block delegate bypass for M2M and internal-only servers Two security issues flagged in code review: 1. High – client_credentials (M2M) servers must not be delegatable: LiteLLM auto-fetches the upstream token using stored credentials, so allowing anonymous bypass would let any external caller invoke tools authenticated as LiteLLM's service account. Fix: check `server.has_client_credentials` in `_target_servers_delegate_auth_to_upstream`, the anonymous allow-list in `get_allowed_mcp_servers`, and `_mcp_oauth_user_api_key_auth`. 2. Medium – internal-only servers exposed to public internet: The anonymous delegate allow-list was not filtering by `available_on_public_internet`, so external callers with an upstream OAuth token could invoke tools on servers marked internal-only. Fix: add `available_on_public_internet` guard to the anonymous delegate server list in `get_allowed_mcp_servers`. Tests added for both cases. Co-authored-by: Cursor <cursoragent@cursor.com> * Require public MCP delegate auth servers Co-authored-by: Yassin Kortam <yassin@berri.ai> * fix(mcp): align delegate auth path parsing with downstream routing `_extract_target_server_names_from_path` used a naive segments-based split while `server.py::_get_mcp_servers_in_path` uses a regex that allows server names with one embedded slash and comma-separated lists. With the old parser, a request to `/mcp/<delegated>/<garbage>` was parsed as targeting `<delegated>` by the auth gate (bypassing LiteLLM auth) while the routing layer parsed it as `<delegated>/<garbage>` — when that name did not resolve, the request fell back to the anonymous allow-list, which can include `allow_all_keys` servers that normally require a LiteLLM key. 
Replace the parser with the same regex logic as `_get_mcp_servers_in_path` so auth gating sees the exact target name(s) downstream routing sees. Add regression tests covering parser parity and the specific extra-path-segment bypass attempt. https://claude.ai/code/session_01SjyPmwfmrq8fveFgw9iHW9 * fix(mcp): close header/path TOCTOU in MCP delegate auth gate `_target_servers_delegate_auth_to_upstream` and `_target_servers_use_oauth2` trusted the `x-mcp-servers` header when present, but `server.py::extract_mcp_auth_context` overrides that header with the path-derived list for `/mcp/...` routes. An attacker could set `x-mcp-servers: <delegated>` while pointing the URL path at a non-delegate server, flipping the auth gate without changing the target downstream routing actually uses. Extract a shared `_resolve_target_server_names` helper that mirrors the downstream override (path-derived names for `/mcp/...` routes, header value otherwise). Add regression tests covering the TOCTOU attempt and the helper's path-vs-header precedence. https://claude.ai/code/session_01SjyPmwfmrq8fveFgw9iHW9 * Fix delegated MCP OAuth test mock Co-authored-by: Yassin Kortam <yassin@berri.ai> * fix(mcp): drop unreachable /{server}/mcp branch in auth path parser `_extract_target_server_names_from_path` also matched the ``/{server_name}/mcp`` form, but the downstream parser ``_get_mcp_servers_in_path`` only handles ``/mcp/...`` — and ``dynamic_mcp_route`` in ``proxy_server`` rewrites ``/{name}/mcp`` to ``/mcp/{name}`` on the scope before the MCP handler runs. Parsing the un-rewritten form on the auth side was therefore unreachable in production, and contradicted the docstring's claim of mirroring the downstream parser — exactly the kind of mismatch that risks a future header/path TOCTOU if any new entry point skips the rewrite. Drop the branch; the canonical ``/mcp/...`` path matches both parsers. Update the regression test to assert the new behavior. 
https://claude.ai/code/session_01SjyPmwfmrq8fveFgw9iHW9 * Fix MCP path auth target resolution Co-authored-by: Yassin Kortam <yassin@berri.ai> * fix(mcp): require auth for refresh_token grants on delegate-auth servers `_mcp_oauth_user_api_key_auth` gates the unauthenticated PKCE flow for ``delegate_auth_to_upstream`` servers, but the bypass applied to BOTH ``/authorize`` and ``/token`` regardless of grant type. ``mcp_token`` accepts ``grant_type=refresh_token`` as well as ``authorization_code``, and ``exchange_token_with_server`` attaches the server's stored ``client_secret`` to whatever is forwarded upstream. An unauthenticated caller holding a refresh token issued to that OAuth client could mint fresh upstream access tokens through LiteLLM. Limit the anonymous bypass on ``/token`` to ``grant_type=authorization_code`` (the only grant PKCE actually protects via ``code_verifier``); fall through to normal LiteLLM auth for ``refresh_token`` and any other grant. ``/authorize`` continues to allow anonymous PKCE redirects. https://claude.ai/code/session_01SjyPmwfmrq8fveFgw9iHW9 * fix(ui): clear delegate_auth_to_upstream when switching off oauth2 The ``delegate_auth_to_upstream`` form field is rendered inside an ``isOAuth2 && (...)`` conditional, so the Form.Item unmounts when the user changes ``auth_type`` away from ``oauth2``. The follow-up ``form.setFieldValue("delegate_auth_to_upstream", false)`` runs after the field has already deregistered, so ``onFinish`` receives ``undefined`` and the fallback ``?? mcpServer.delegate_auth_to_upstream`` preserved the old ``true``. The flag then persisted in the database for a non-oauth2 server and silently re-activated if ``auth_type`` was later switched back to ``oauth2``. In the edit payload, force the flag to ``false`` whenever ``auth_type !== oauth2``; only trust the form value (and the existing DB fallback) when the server is actually oauth2. 
Backend defense-in-depth already ignores the flag for non-oauth2 servers, but the DB state should stay clean too. https://claude.ai/code/session_01SjyPmwfmrq8fveFgw9iHW9 * Fix MCP delegate auth reset on edit Co-authored-by: Yassin Kortam <yassin@berri.ai> --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Yassin Kortam <yassin@berri.ai> Co-authored-by: Claude <claude@anthropic.com> * fix(responses): preserve cache_control in Responses API -> Chat Completion transformation (#27727) * fix(responses): preserve cache_control in Responses API -> Chat Completion transformation cache_control injected by AnthropicCacheControlHook was silently dropped when _transform_responses_api_content_to_chat_completion_content rebuilt content blocks with only {type, text}. Now copies cache_control through so Anthropic prompt caching works correctly when using client.responses.create with cache_control_injection_points. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(responses): preserve cache_control for input_image and input_file blocks Extends the cache_control fix to image and file content blocks, which were also silently dropping cache_control during the Responses API -> Chat Completion transformation. Adds tests for all three content block types. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Claude Babysitter <claude@anthropic.com> * fix(proxy): expose db status on public /health/readiness External readiness probes consumed the legacy detailed payload's `db` field to drive alerting and pod-rotation decisions. Stripping the body to `{"status": "healthy"}` broke those probes silently — the HTTP code still flipped to 503, but probes checking `body.db == "connected"` treated the response as healthy. Add `db` back to the unauthenticated payload. 
Keep the rest of the diagnostic fields (litellm_version, callbacks, cache, log_level) gated behind /health/readiness/details so the recon-leak gate from #26912 holds. Values match the legacy contract: "connected", "disconnected", "Not connected". * docs(budget_manager): add docstring to BudgetManager.reset_cost (#27867) Co-authored-by: oss-agent-shin <279349115+oss-agent-shin@users.noreply.github.com> * docs: add class docstring to _LoopWrapper (#27870) Document the purpose of the daemon thread that backs the sync branch of the timeout decorator. Co-authored-by: oss-agent-shin <279349115+oss-agent-shin@users.noreply.github.com> * fix: Fix Redis Sentinel client handling to solve authentication error… (#26302) * fix: Fix Redis Sentinel client handling to solve authentication error with password protected sentinel (#25625) * fix Redis Sentinel authentication handling * test: cover Redis Sentinel auth routing * refactor: align Redis Sentinel kwargs threading * fix: avoid duplicate Redis Sentinel socket timeouts * Address review comments * refactor(_redis): return set from _get_redis_kwargs for O(1) lookup Align _get_redis_kwargs() with the cluster helper by returning a set instead of a list, so the sentinel connection-kwargs filter uses O(1) membership tests. Addresses Greptile review feedback on PR #26302. * fix(_redis): restore Azure-specific kwargs in cluster kwargs set The set-literal refactor of _get_redis_cluster_kwargs dropped four LiteLLM-custom Azure keys (azure_redis_ad_token, azure_client_id, azure_tenant_id, azure_client_secret) that the prior list form had explicitly appended. Because they are not in RedisCluster's argspec, they were silently stripped, breaking Azure IAM auth on cluster clients. Re-add them to the explicit include set. 
--------- Co-authored-by: Kristin Cowalcijk <kristincowalcijk@gmail.com> Co-authored-by: Sameer Kankute <sameer@berri.ai> Co-authored-by: krrish-berri-2 <krrish-berri-2@users.noreply.github.com> Co-authored-by: claude <claude@anthropic.com> * Litellm agent oss staging 05 11 2026 (#27733) * fix(ollama): Include provider in model list for ollama (#26135) * Include provider in model names for ollama * Fix unit tests * fix(ollama): process both thinking and content in same streaming chunk (#26098) * fix(health_check): skip max_tokens for image_generation mode (#26417) * fix(health_check): skip max_tokens for image_generation mode `_update_litellm_params_for_health_check` injected `max_tokens` for every deployment. OpenAI `/v1/images/generations` strictly rejects unknown fields, so health checks for dall-e-* and gpt-image-1 always failed with `400 "Unknown parameter: 'max_tokens'"` even though the actual image endpoint calls succeed. Skip the `max_tokens` injection when `model_info.mode == "image_generation"`. `messages` still gets injected (downstream `_filter_model_params` already strips it for non-chat handlers). * Switch to allow-list with per-deployment override Per @krrishdholakia review: deny-listing image_generation only re-introduces the same bug for every other non-chat mode (embedding, audio_*, rerank, video_generation, ocr, search, moderation, ...). Replace the single image_generation skip with `_MAX_TOKEN_SUPPORT_MODES = {chat, completion, responses}`. Missing `mode` is treated as chat for backward compatibility. New modes are safe by default. Add `model_info.health_check_supports_max_tokens` as an operator escape hatch — True forces injection on a non-listed deployment (operator wants to bound probe tokens), False suppresses it on a chat-style deployment behind a strict-schema provider. Tests: parametrize over 3 chat-style + 10 non-chat modes, plus override on/off and the no-mode legacy path. 
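The allow-list decision described in the health-check commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper name `should_inject_max_tokens` is hypothetical, while `_MAX_TOKEN_SUPPORT_MODES` and `health_check_supports_max_tokens` are the names given in the commit message.

```python
from typing import Any, Dict, Optional

# Modes listed in the commit message as accepting max_tokens.
_MAX_TOKEN_SUPPORT_MODES = {"chat", "completion", "responses"}


def should_inject_max_tokens(model_info: Optional[Dict[str, Any]]) -> bool:
    """Hypothetical helper: decide whether a health check sends max_tokens."""
    info = model_info or {}

    # An explicit per-deployment override wins in either direction.
    override = info.get("health_check_supports_max_tokens")
    if override is not None:
        return bool(override)

    # A missing mode is treated as chat for backward compatibility.
    mode = info.get("mode") or "chat"
    return mode in _MAX_TOKEN_SUPPORT_MODES
```

With this shape, a mode added later is excluded unless it is explicitly listed, which is the "safe by default" property the commit message describes.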
* fix(http_handler): handle RequestNotRead in MaskedHTTPStatusError for multipart uploads (#26718) Squash-merged by litellm-agent from dawidkulpa's PR. * fix(ollama): guard against double 'ollama/' prefix in live model listing Greptile flagged that Ollama servers can return names that already start with 'ollama/'. Check the prefix before prepending so we don't produce 'ollama/ollama/...'. Adds a regression test. * Fix Ollama empty reasoning stream chunks Co-authored-by: Yassin Kortam <yassin@berri.ai> --------- Co-authored-by: James Myatt <james@jamesmyatt.co.uk> Co-authored-by: VHash <225398745+vhash0@users.noreply.github.com> Co-authored-by: hayden <sewhan.kim+@a-bly.com> Co-authored-by: dawidkulpa <84176950+dawidkulpa@users.noreply.github.com> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Yassin Kortam <yassin@berri.ai> * Ishaan - May 13th Staging LiteLLM (#27877) * fix: strip Gemini thought-signature from tool_use.id in non-streaming path; example websearch config (#27873) - adapters/transformation.py: mirror the streaming path and strip the `__thought__<b64>` suffix off `tool_call.id` before building the AnthropicResponseContentBlockToolUse. Base64's `+ / =` characters violate Anthropic's `^[a-zA-Z0-9_-]+$` tool_use.id pattern, so when a conversation that flowed through Gemini is later replayed to an Anthropic-native provider (Bedrock or Anthropic API) the request 400s. - example_config_yaml/websearch_interception_config.yaml: register the interceptor under `callbacks:` not `success_callback:`. `success_callback` does not run pre-request hooks, so the tool-conversion step never fires on `/v1/messages` and the raw `web_search_20250305` tool is forwarded to Bedrock, which 400s. - adds a unit test pinning the non-streaming strip behavior and the surviving `^[a-zA-Z0-9_-]+$` shape of the resulting id. 
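As a rough illustration of the thought-signature fix above, the strip can be expressed as a split on the `__thought__` separator followed by a check against the tool_use.id pattern quoted in the commit message. The function and constant names here are hypothetical; the adapter code in the PR may be structured differently.

```python
import re

# Separator and pattern as quoted in the commit message.
THOUGHT_SEPARATOR = "__thought__"
ANTHROPIC_TOOL_ID_PATTERN = re.compile(r"^[a-zA-Z0-9_-]+$")


def strip_thought_signature(tool_call_id: str) -> str:
    """Hypothetical helper: drop the `__thought__<b64>` suffix, if present."""
    return tool_call_id.split(THOUGHT_SEPARATOR, 1)[0]


def is_valid_tool_use_id(tool_call_id: str) -> bool:
    """Check an id against the Anthropic tool_use.id pattern."""
    return ANTHROPIC_TOOL_ID_PATTERN.match(tool_call_id) is not None
```

An id without the separator passes through unchanged, so the helper is safe to apply to ids that never went through Gemini.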
Co-authored-by: oss-agent-shin <279349115+oss-agent-shin@users.noreply.github.com>

* Fix/azure image edit auth header (#27863)

* fix(azure/image_edit): use api-key header instead of Authorization Bearer

Delegate `AzureImageEditConfig.validate_environment` to `BaseAzureLLM._base_validate_azure_environment` so the image-edit route follows the same auth resolution as every other Azure provider:

- prefer the Azure-native `api-key` header when an API key is available
- fall back to `Authorization: Bearer <azure_ad_token>` only for AAD auth

The previous implementation unconditionally set `Authorization: Bearer <api_key>`, which is the OpenAI-direct convention and is rejected by Azure OpenAI / APIM-fronted deployments with `401 Access denied due to missing subscription key`.

Adds regression tests covering api_key kwarg, litellm_params.api_key, and the AAD-token fallback path.

Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(azure/image_edit): pin api-key precedence semantics + add regression test

Address review feedback that the move to ``BaseAzureLLM._base_validate_azure_environment`` changed the relative priority of the positional ``api_key`` kwarg vs. ``litellm_params["api_key"]``.

The new behavior (``litellm_params["api_key"]`` wins, positional only fills in when ``litellm_params["api_key"]`` is empty) is intentional and matches every other Azure ``validate_environment``: ``AzureVideosConfig`` uses the exact same merge logic, while ``AzureVectorStoresConfig`` and ``AzureResponsesAPIConfig`` don't accept a positional ``api_key`` at all. The old ``or`` chain (positional wins) was the outlier and was part of the same OpenAI-vs-Azure convention drift that produced the original ``Authorization: Bearer`` bug.

The only production caller (``llm_http_handler.image_edit``) sources both values from the same ``litellm_params.api_key``, so this change is behaviorally a no-op there.

Document the precedence in the docstring and lock it in with an explicit test so future refactors can't quietly re-invert it.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: yuneng-jiang <yuneng@berri.ai>
Co-authored-by: ryan-crabbe-berri <ryan@berri.ai>
Co-authored-by: Adam Kirstein <adam.kirstein@disney.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

* test(azure/image_edit): expect api-key header instead of Authorization Bearer

PR #27863 fixed Azure image edit to use the Azure-native api-key header instead of OpenAI's Authorization: Bearer convention, but did not update test_azure_image_edit_litellm_sdk to match. The test still asserted 'Authorization' in headers, which now fails since the new code routes through BaseAzureLLM._base_validate_azure_environment and emits api-key when an api_key is provided.

Update the assertion to pin the correct Azure behavior: api-key header present with the resolved key, and no Authorization header.

---------

Co-authored-by: oss-agent-shin <ext-agent-shin@berri.ai>
Co-authored-by: oss-agent-shin <279349115+oss-agent-shin@users.noreply.github.com>
Co-authored-by: Adam Kirstein <107421694+justalittleadam@users.noreply.github.com>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>
Co-authored-by: ryan-crabbe-berri <ryan@berri.ai>
Co-authored-by: Adam Kirstein <adam.kirstein@disney.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Ishaan Jaffer <ishaanjaffer0324@gmail.com>

* fix(fireworks_ai): strip `thinking_blocks` from chat messages before Fireworks API call (#27881)

* fix(fireworks_ai): strip thinking_blocks from chat messages before API call

Fireworks OpenAI-compatible ChatMessage schema uses additionalProperties:false and rejects Anthropic-style messages[].thinking_blocks (e.g. Claude Code replays), returning invalid_request_error. Remove the field in _transform_messages_helper alongside provider_specific_fields.

Adds unit test test_transform_messages_helper_strips_thinking_blocks.
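The Fireworks sanitization described above amounts to dropping two keys from each chat message before the request is sent. A minimal sketch follows, assuming messages are plain dicts; the helper name `strip_unsupported_fields` is hypothetical, and the real `_transform_messages_helper` may mutate messages in place rather than copy them.

```python
from typing import Any, Dict, List

# Fields the commit message says are removed before calling Fireworks.
_UNSUPPORTED_MESSAGE_FIELDS = ("provider_specific_fields", "thinking_blocks")


def strip_unsupported_fields(
    messages: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Hypothetical helper: return copies of messages without rejected keys."""
    return [
        {k: v for k, v in message.items() if k not in _UNSUPPORTED_MESSAGE_FIELDS}
        for message in messages
    ]
```

Returning new dicts keeps the caller's message history intact, which matters if the same conversation is later replayed to a provider that does accept `thinking_blocks`.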
Co-authored-by: Cursor <cursoragent@cursor.com> * chore(fireworks_ai): drop inline comments from message sanitization Co-authored-by: Cursor <cursoragent@cursor.com> * docs(fireworks_ai): explain why provider_specific_fields and thinking_blocks are stripped Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com> * fix: block client-side pricing injection via request body Authenticated clients could supply CustomPricingLiteLLMParams fields (input_cost_per_token, output_cost_per_token, etc.) in the request body. These were forwarded to register_model() in main.py, permanently mutating the shared global litellm.model_cost dict for all users on the instance. Adds all CustomPricingLiteLLMParams fields to _BANNED_REQUEST_BODY_PARAMS so is_request_body_safe() rejects them before they reach completion(). New pricing fields added to CustomPricingLiteLLMParams are auto-covered. Admin opt-in via allow_client_side_credentials or configurable_clientside_auth_params still works as before. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> * chore(proxy): scrub remote-URL module loads from DB-overlay config When ``ProxyConfig`` merges DB-persisted ``litellm_settings`` / ``general_settings`` on top of the YAML config, the merged dict is later iterated by ``load_config`` which threads ``config_file_path`` (the YAML path) into ``get_instance_fn``. The runtime gate that refuses ``s3://`` / ``gcs://`` modules when ``config_file_path`` is ``None`` therefore can't distinguish a YAML-sourced value from a DB-sourced one: both look the same to ``get_instance_fn``. 
Strip ``s3://`` / ``gcs://`` entries from the DB-overlay value for every field whose contents reach ``get_instance_fn`` during config load: - litellm_settings: ``callbacks``, ``success_callback``, ``failure_callback``, ``audit_log_callbacks``, ``post_call_rules``, ``custom_provider_map[].custom_handler`` - general_settings: ``custom_auth``, ``custom_key_generate``, ``custom_key_update``, ``custom_sso``, ``custom_ui_sso_sign_in_handler``, ``litellm_jwtauth.custom_validate`` The YAML config-file load path is unchanged — the documented operator flow (``callbacks: ["s3://bucket/module.instance"]`` in ``config.yaml``) still works. Only DB-overlay writes (e.g. via ``/config/update``) are stripped. Adds 16 regression tests covering the scrub matrix. * chore(proxy): also scrub pass_through_endpoints[].target from DB overlay A pass-through endpoint's ``target`` field is passed through ``create_pass_through_route`` into ``get_instance_fn`` during config load. A PROXY_ADMIN persisting ``target: "s3://attacker/m.i"`` via the DB-overlay ``pass_through_endpoints`` write path was not covered by the previous scrub matrix, so the remote module load would still reach the loader because the YAML-load chain has ``config_file_path`` set. Walk each entry in ``general_settings.pass_through_endpoints`` and null out any ``target`` that starts with ``s3://`` or ``gcs://``. The entry itself is preserved so the path-registration helper can choose how to handle a missing target (the existing code skips the route when ``target is None``). Adds two regression tests. * fix(prometheus): emit `litellm_remaining_tokens_metric` for Bedrock and Vertex (#27705) * fix(prometheus): emit remaining_tokens/requests gauges for bedrock + vertex (LIT-2719) Bedrock and Vertex AI never return x-ratelimit-remaining-* response headers, so litellm_remaining_tokens_metric / litellm_remaining_requests_metric only fired for OpenAI / Azure / Anthropic deployments even when tpm/rpm was configured on the router. 
Add a provider-agnostic fallback in PrometheusLogger.async_log_success_event that asks Router.get_remaining_model_group_usage() for the same model_group and emits the gauges with configured_limit - current_usage when the upstream provider didn't populate the headers itself. Existing OpenAI / Azure / Anthropic flows are unchanged because the fallback short-circuits when both header values are already present. Tests: 8 new tests covering bedrock + vertex emission, header short-circuit, partial-header fill, llm_router=None, missing model_group, empty router result, and router exception swallowing. Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com> * fix(prometheus): narrow except to ImportError, log router lookup failures via verbose_logger.exception Address greptile review: - The optional 'from litellm.proxy.proxy_server import llm_router' should guard against ImportError specifically, not all exceptions, so that unexpected errors (e.g. AttributeError from partially-initialized state) stay visible. - get_remaining_model_group_usage failures are now logged via verbose_logger.exception (with traceback) instead of debug, matching the PR description's intent and avoiding silent loss of router-cache errors in production. Co-authored-by: Mateo Wang <mateo-berri@users.noreply.github.com> * fix(prometheus): subtract in-flight delta in router-remaining fallback The router's TPM/RPM counter is incremented by Router.deployment_callback_on_success, which f…

yuneng-berri and others added 4 commits May 7, 2026 18:05

Merge pull request BerriAI#27436 from BerriAI/litellm_internal_staging

fa81017

[Infra] Promote Internal Staging to main

Merge pull request BerriAI#27559 from BerriAI/litellm_internal_staging

e182a5e

[Infra] Promote Internal Staging to main

Merge pull request BerriAI#27815 from BerriAI/litellm_internal_staging

7af0f05

[Infra] Promote internal staging to main

greptile-apps Bot reviewed May 13, 2026

Comment thread litellm/llms/azure/image_edit/transformation.py

cursor Bot reviewed May 13, 2026

ishaan-berri changed the base branch from litellm_internal_staging to litellm_ishaan_may13 May 13, 2026 22:52

ishaan-berri merged commit fa632b0 into BerriAI:litellm_ishaan_may13 May 13, 2026
43 checks passed
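For readers skimming the timeline, the header resolution that this PR delegates to can be sketched as below. This is a simplified illustration under stated assumptions, not the body of `BaseAzureLLM._base_validate_azure_environment`: the function name is hypothetical, and the environment-variable, `litellm.api_key` / `litellm.azure_key`, and `azure_ad_token_provider` lookups mentioned in the PR description are omitted.

```python
from typing import Any, Dict, Optional


def resolve_azure_auth_headers(
    headers: Dict[str, str],
    api_key: Optional[str] = None,
    litellm_params: Optional[Dict[str, Any]] = None,
) -> Dict[str, str]:
    """Hypothetical sketch of the Azure auth-header precedence."""
    params = litellm_params or {}
    resolved = dict(headers)

    # litellm_params["api_key"] wins; the positional kwarg only fills in
    # when it is empty (the precedence pinned by the docs commit).
    key = params.get("api_key") or api_key
    if key:
        # Azure-native subscription-key header, no Authorization header.
        resolved["api-key"] = key
        return resolved

    # Only without an API key fall back to AAD / Entra ID bearer auth.
    token = params.get("azure_ad_token")
    if token:
        resolved["Authorization"] = f"Bearer {token}"
    return resolved
```

The early return after setting `api-key` is what keeps a subscription key from ever being sent as a bearer token, which was the original bug.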


ishaan-berri merged 5 commits into BerriAI:litellm_ishaan_may13 from justalittleadam:fix/azure-image-edit-auth-header

justalittleadam commented May 13, 2026 • edited by cursor Bot

CLAassistant commented May 13, 2026 • edited

greptile-apps Bot commented May 13, 2026

Greptile Summary

Confidence Score: 4/5

codecov Bot commented May 13, 2026

Codecov Report

oss-pr-review-agent-shin Bot commented May 13, 2026

ishaan-berri commented May 13, 2026

cursor Bot left a comment

6 participants
