fix(guardrails): return 400 not 500 when AIM blocks a request #30573

Merged: ryan-crabbe-berri merged 6 commits into litellm_internal_staging from litellm_lit3751_aim_block_status_passthrough on Jun 17, 2026

@ryan-crabbe-berri (Collaborator) commented Jun 16, 2026

Linear ticket

Resolves LIT-3751

Summary

AIM guardrail rejections now return HTTP 400 with a well-formed OpenAI error body. Previously a block raised a bare HTTPException whose type and param serialized as the literal string "None", which broke OpenAI-SDK error parsing for downstream consumers (e.g. Google ADK).

Making AIM raise a ProxyException fixed the body but exposed a second bug: the shared error funnel (_handle_llm_api_exception) re-derived the HTTP status from a nonexistent status_code attribute (ProxyException carries its status in .code) and downgraded the 400 to a 500.

The funnel now honors an already-normalized ProxyException instead of rebuilding it, and every AIM rejection path returns a conformant body. The error-classification helpers in post_call_failure_hook also treat ProxyException correctly: it is excluded from llm_exceptions paging (a policy block is not an infra failure) but still recorded by proxy-only failure logging.

Type

🐛 Bug Fix

Changes

  • litellm/proxy/guardrails/guardrail_hooks/aim/aim.py: all three AIM rejection paths (input block, output block, and the multimodal anonymize rejection) raise a ProxyException via a shared _rejection helper instead of a bare HTTPException, so type/param are no longer the literal string "None". The two real blocks carry openai_code="content_policy_violation"; the multimodal rejection stays a plain invalid_request_error because it is a usage error, not a policy violation
  • litellm/proxy/common_request_processing.py: _handle_llm_api_exception re-raises an already-normalized ProxyException (merging response headers) instead of re-deriving its status from status_code and clobbering it to 500. This honors the deliberate status for every hook that raises a ProxyException, not just AIM
  • litellm/proxy/utils.py: in post_call_failure_hook, ProxyException is excluded from the High-severity llm_exceptions alert alongside HTTPException (user-facing errors, not infra failures), and _is_proxy_only_llm_api_error now classifies ProxyException as a proxy-only error so guardrail blocks are still recorded by the configured failure loggers (restoring the pre-change HTTPException behavior). These two checks are independent
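
The status downgrade described in the second bullet can be illustrated with a minimal sketch. `ProxyExceptionSketch`, `derive_status_buggy`, and `handle_fixed` are hypothetical stand-ins, not LiteLLM's actual code. The only details taken from this description are that the status lives in `.code`, that no `status_code` attribute exists, and that response headers are merged before re-raising; the merge order shown is an assumption.

```python
class ProxyExceptionSketch(Exception):
    """Hypothetical stand-in for an already-normalized proxy error."""

    def __init__(self, message, error_type, param, code, headers=None):
        super().__init__(message)
        self.message = message
        self.type = error_type
        self.param = param
        self.code = code  # HTTP status is stored here, not in `status_code`
        self.headers = headers or {}


def derive_status_buggy(exc):
    # Looks up an attribute the exception never defines, so the default wins.
    return getattr(exc, "status_code", 500)


def handle_fixed(exc, response_headers):
    # Honor an already-normalized exception: merge headers, re-raise unchanged.
    if isinstance(exc, ProxyExceptionSketch):
        exc.headers = {**exc.headers, **response_headers}
        raise exc
    # Anything else is wrapped as a generic server error in this sketch.
    raise ProxyExceptionSketch(str(exc), "internal_server_error", None, 500)


block = ProxyExceptionSketch("blocked by guardrail", "invalid_request_error", None, 400)
buggy_status = derive_status_buggy(block)

try:
    handle_fixed(block, {"x-request-id": "abc"})
except ProxyExceptionSketch as raised:
    fixed_status = raised.code
    fixed_headers = raised.headers
```

In this sketch the buggy lookup yields 500 for a deliberate 400, while the early re-raise keeps the original status and only adds headers.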

Deferred on purpose: serializing openai_code onto the wire so error.code reads content_policy_violation instead of 400. openai_code is currently write-only, and changing ProxyException.to_dict() would shift provider error codes at the re-wrap sites and desync the hand-built streaming error frame, with no consumer reading it today. Worth its own change if a client needs to branch on the semantic code.
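
The deferred item can be pictured with a small sketch of a write-only field. `ErrorSketch` and its `to_dict` are hypothetical and only mirror the wire shape reported in this PR: `code` carries the stringified status, while `openai_code` is stored but never serialized.

```python
import json


class ErrorSketch(Exception):
    """Hypothetical error type with a write-only `openai_code` field."""

    def __init__(self, message, error_type, param, code, openai_code=None):
        super().__init__(message)
        self.message = message
        self.type = error_type
        self.param = param
        self.code = code
        self.openai_code = openai_code  # stored, but not read by to_dict()

    def to_dict(self):
        # Only the status-derived code reaches the wire in this sketch.
        return {
            "message": self.message,
            "type": self.type,
            "param": self.param,
            "code": str(self.code),
        }


err = ErrorSketch(
    "blocked by guardrail",
    "invalid_request_error",
    None,
    400,
    openai_code="content_policy_violation",
)
body = json.dumps({"error": err.to_dict()})
print(body)
```

A client that wants to branch on the semantic code would need `to_dict()` to emit `openai_code`, which is the change this PR leaves for later.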

Proof of Fix

Local proxy with the AIM guardrail pointed at a mock Aim analyze endpoint (real Aim is enterprise-gated); the clean request hits real groq.

Blocked request (prompt contains the configured trigger), now HTTP 400 with a conformant body:

```
$ curl -s -w "\nHTTP %{http_code}\n" -X POST http://localhost:4000/v1/chat/completions \
    -H "Authorization: Bearer $LITELLM_MASTER_KEY" -H "content-type: application/json" \
    -d '{"model":"groq-llama","messages":[{"role":"user","content":"My name is Leroy Jenkins"}]}'

{"error":{"message":"\"Leroy Jenkins\" detected as name","type":"invalid_request_error","param":null,"code":"400"}}
HTTP 400
```

Clean request still succeeds against the real upstream:

```
$ curl -s -w "\nHTTP %{http_code}\n" -X POST http://localhost:4000/v1/chat/completions \
    -H "Authorization: Bearer $LITELLM_MASTER_KEY" -H "content-type: application/json" \
    -d '{"model":"groq-llama","messages":[{"role":"user","content":"Say hi in 3 words"}]}'

{"id":"chatcmpl-...","model":"groq-llama","object":"chat.completion","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello to you","role":"assistant"}}],...}
HTTP 200
```

Before this change the same blocked request returned HTTP 500 (status downgraded by the funnel), and on main the body carried "type":"None","param":"None".
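
To make the difference between the two body shapes concrete, the sketch below parses a conformant body and a malformed one with the standard library. The `is_conformant` check is a hypothetical illustration, not how the OpenAI SDK validates errors, and the message and code values in the malformed body are placeholders. It only shows that the string `"None"` survives JSON parsing as a non-empty string rather than as `null`.

```python
import json

conformant = '{"error":{"message":"blocked","type":"invalid_request_error","param":null,"code":"400"}}'
malformed = '{"error":{"message":"blocked","type":"None","param":"None","code":"400"}}'


def is_conformant(raw):
    err = json.loads(raw)["error"]
    # "None" is a real string on the wire, so a client cannot tell it
    # apart from a genuine type name or parameter name.
    return err["type"] != "None" and err["param"] != "None"


good = is_conformant(conformant)
bad = is_conformant(malformed)
parsed_param = json.loads(malformed)["error"]["param"]
```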

Test plan

  • `tests/test_litellm/proxy/test_common_request_processing.py::TestHandleLLMApiExceptionDictDetail::test_already_normalized_proxy_exception_is_honored` - funnel re-raises a ProxyException with its 400 intact and asserts the wire body via to_dict(); fails (500) without the funnel fix
  • `tests/test_litellm/proxy/test_proxy_utils.py::TestPostCallFailureHookLLMExceptionAlerting` - ProxyException and HTTPException do not page; a genuine Exception still does
  • `tests/test_litellm/proxy/test_proxy_utils.py::TestPostCallFailureHookProxyExceptionLogging` - a ProxyException on an LLM route still drives proxy-only failure logging; a raw Exception does not; fails without the classifier fix
  • `tests/local_testing/test_aim_guardrails.py::test_block_callback` - input block raises a ProxyException with the correct type/param/code/openai_code
  • `tests/local_testing/test_aim_guardrails.py::test_output_block_raises_proxy_exception` - output-side block raises a conformant ProxyException with content_policy_violation
  • `tests/local_testing/test_aim_guardrails.py::test_anonymize_multimodal_rejection_raises_proxy_exception` - multimodal anonymize rejection raises invalid_request_error (400) and is NOT labeled a policy violation
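
The three AIM tests assert a common shape, which can be sketched as follows. `RejectionSketch` and `rejection` are hypothetical stand-ins for the real exception and the `_rejection` helper, whose actual signature is not shown in this description. Only the asserted field values come from the bullets above, and the messages are placeholders.

```python
class RejectionSketch(Exception):
    """Hypothetical stand-in for the exception raised on an AIM rejection."""

    def __init__(self, message, openai_code=None):
        super().__init__(message)
        self.message = message
        self.type = "invalid_request_error"
        self.param = None
        self.code = 400
        self.openai_code = openai_code


def rejection(message, openai_code=None):
    # One helper for every rejection path, so all of them share one shape.
    return RejectionSketch(message, openai_code)


input_block = rejection("input blocked", openai_code="content_policy_violation")
output_block = rejection("output blocked", openai_code="content_policy_violation")
multimodal = rejection("multimodal content cannot be anonymized")

# Every path returns a 400 with real values instead of the string "None".
for exc in (input_block, output_block, multimodal):
    assert exc.code == 400
    assert exc.type == "invalid_request_error"
    assert exc.param is None
```

Only the two real blocks carry a semantic code in this sketch; the multimodal rejection leaves it unset because it is a usage error.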

codecov Bot commented Jun 16, 2026

Codecov Report

❌ Patch coverage is 63.63636% with 4 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...itellm/proxy/guardrails/guardrail_hooks/aim/aim.py | 42.85% | 4 Missing ⚠️ |

greptile-apps Bot (Contributor) commented Jun 16, 2026

Greptile Summary

This PR fixes two compounding bugs that caused AIM guardrail blocks to return HTTP 500 with a malformed OpenAI error body ("type":"None","param":"None") instead of a well-formed HTTP 400 response.

  • aim.py: All three rejection paths now raise ProxyException via the shared _rejection helper instead of a bare HTTPException, producing correct type/param fields; input and output blocks carry openai_code="content_policy_violation" while the multimodal-anonymize path stays a plain invalid_request_error.
  • common_request_processing.py: _handle_llm_api_exception now re-raises an already-normalized ProxyException (merging response headers) instead of re-deriving the HTTP status from a nonexistent status_code attribute and defaulting to 500.
  • utils.py: ProxyException is excluded from the high-severity llm_exceptions alert (it is a policy decision, not an infra failure) while _is_proxy_only_llm_api_error now includes it so blocked requests are still recorded by proxy-only failure loggers.
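
The alerting and logging behavior in the `utils.py` bullet can be sketched as a pair of predicates. All names below (`HTTPExceptionSketch`, `ProxyExceptionSketch`, `should_page`, `is_proxy_only_error`) are hypothetical stand-ins, and the real helpers take more context, such as the route, than this sketch models.

```python
class HTTPExceptionSketch(Exception):
    """Hypothetical stand-in for a framework HTTP error."""


class ProxyExceptionSketch(Exception):
    """Hypothetical stand-in for a normalized proxy error."""


USER_FACING = (HTTPExceptionSketch, ProxyExceptionSketch)


def should_page(exc):
    # User-facing errors are deliberate responses, not infrastructure failures.
    return not isinstance(exc, USER_FACING)


def is_proxy_only_error(exc):
    # Errors raised by the proxy itself are still sent to failure loggers.
    return isinstance(exc, USER_FACING)


results = {
    type(exc).__name__: (should_page(exc), is_proxy_only_error(exc))
    for exc in (
        ProxyExceptionSketch("blocked"),
        HTTPExceptionSketch("bad request"),
        RuntimeError("boom"),
    )
}
```

In this toy model the two predicates happen to be complements. In the proxy they are separate checks, which is why a guardrail block can be both silent for on-call and visible in failure logs.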

Confidence Score: 5/5

Safe to merge — the change is a targeted fix to two tightly scoped bugs (wrong exception type in guardrail hooks, wrong status re-derivation in the error funnel) with no impact on the happy path.

All three AIM rejection paths are covered by dedicated regression tests, the funnel fix has its own unit test asserting the exact wire body, and the alerting/logging classifier changes are verified by two new test classes. The ProxyException early-exit branch in the funnel is safe: ProxyException.headers is always a dict (initialized as headers or {} in __init__), so the dict-merge can't raise. No backwards-incompatible changes to existing callers — the new branch only activates for exceptions that previously fell through to a broken re-derivation path.

No files require special attention.

Important Files Changed

| Filename | Overview |
|---|---|
| litellm/proxy/guardrails/guardrail_hooks/aim/aim.py | All three rejection paths (input block, output block, multimodal anonymize) now raise ProxyException via the shared _rejection helper instead of bare HTTPException, producing conformant type/param fields on the wire. The previous P2 about the multimodal path is resolved in this PR. |
| litellm/proxy/common_request_processing.py | Adds an early-exit branch in _handle_llm_api_exception that merges response headers into an already-normalized ProxyException and re-raises it verbatim, preventing the status from being re-derived from a nonexistent status_code attribute and defaulted to 500. |
| litellm/proxy/utils.py | Extends both post_call_failure_hook (LLM-exceptions alert guard) and _is_proxy_only_llm_api_error to include ProxyException alongside HTTPException, correctly classifying guardrail blocks as non-infra failures that still drive proxy-only logging. |
| tests/local_testing/test_aim_guardrails.py | Adds three new regression tests (input block, output block, multimodal anonymize) covering all changed rejection paths with proper mocking; updates test_block_callback to assert stronger ProxyException attributes instead of just catching HTTPException. |
| tests/test_litellm/proxy/test_common_request_processing.py | New test test_already_normalized_proxy_exception_is_honored verifies the funnel re-raises a ProxyException with its 400 intact and validates the wire body via to_dict(). |
| tests/test_litellm/proxy/test_proxy_utils.py | Two new test classes verify that ProxyException is excluded from high-severity LLM-exception alerts and is still recorded by proxy-only failure logging, matching the pre-fix HTTPException behavior. |

Reviews (2): Last reviewed commit: "Merge remote-tracking branch 'origin/lit..."

veria-ai Bot (Contributor) commented Jun 16, 2026

PR overview

All previously flagged issues have been addressed. No open security concerns remain on this pull request.

Security review

No open security issues remain on this pull request.

Fixed/addressed: 1 · PR risk: 0/10

@ryan-crabbe-berri (Collaborator, Author) commented

@greptileai re review

@ryan-crabbe-berri merged commit b5fcd85 into litellm_internal_staging Jun 17, 2026
120 of 122 checks passed
@ryan-crabbe-berri deleted the litellm_lit3751_aim_block_status_passthrough branch June 17, 2026 01:56
koladefaj pushed a commit to koladefaj/litellm that referenced this pull request Jun 17, 2026
…I#30573)

* fix(guardrails): return 400 not 500 when AIM blocks a request

AIM guardrail blocks raised a bare HTTPException whose type and param
serialized as the literal string "None", which broke OpenAI-SDK error
parsing for downstream consumers. Switching AIM to raise a ProxyException
surfaced a second bug: the shared error funnel re-derived the HTTP status
from a nonexistent status_code attribute and downgraded the 400 to a 500.
The funnel now honors an already-normalized ProxyException rather than
rebuilding it, and ProxyException is excluded from llm_exceptions alerting
so a content-policy block no longer pages on-call as an LLM API failure

Resolves LIT-3751

* fix(guardrails): route all AIM rejection paths through ProxyException

The block-action fix left two AIM rejection paths raising a bare
HTTPException: the multimodal anonymize rejection and the output-side
block. Both serialized type and param as the literal string "None", the
same malformed shape the block fix removed. Funnel all three through a
shared _rejection helper so they return a conformant OpenAI error body.
The output block carries content_policy_violation; the multimodal
rejection stays a plain invalid_request_error because it is a usage
error, not a policy violation

Resolves LIT-3751

* fix(guardrails): record AIM ProxyException blocks in failure logs

Switching AIM blocks from HTTPException to ProxyException made
_is_proxy_only_llm_api_error return False for them, so
_handle_logging_proxy_only_error was skipped and the blocked prompt was
dropped from the configured failure loggers. Classify ProxyException as a
proxy-only error alongside HTTPException so guardrail blocks are recorded
again, matching the prior behavior. The llm_exceptions alert suppression
is a separate check and stays in place

Resolves LIT-3751

* style(guardrails): use str | None over Optional[str] in AIM _rejection

* style(guardrails): collapse AIM _rejection signature per black

@balcsida mentioned this pull request Jun 18, 2026 (13 tasks)