Skip to content

fix(guardrails): read CrowdStrike AIDR identity from both metadata bags#29991

Merged
yuneng-berri merged 1 commit into
litellm_internal_stagingfrom
litellm_/kind-leakey-e57eca
Jun 9, 2026
Merged

fix(guardrails): read CrowdStrike AIDR identity from both metadata bags#29991
yuneng-berri merged 1 commit into
litellm_internal_stagingfrom
litellm_/kind-leakey-e57eca

Conversation

@yuneng-berri

Copy link
Copy Markdown
Collaborator

Relevant issues

Follow-up to #29517, which added user and model metadata capture to the CrowdStrike AIDR guardrail. That change resolved the metadata from a single bag with request_data.get("litellm_metadata", request_data.get("metadata")). Because dict.get returns the stored value when the key is present, a present litellm_metadata (an explicit null, or a caller-supplied dict used for tracing) shadows metadata, which is where /chat/completions actually places the authenticated identity. The result is that user_id and extra_info.user_name were silently dropped from the outbound guard payload for any request that carried a litellm_metadata field. The common case (no litellm_metadata in the body) still worked, so the original tests passed

Linear ticket

N/A

Pre-Submission checklist

  • I have added meaningful tests
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible; it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Screenshots / Proof of Fix

Verified end to end against a live proxy. The chat completion hits real Anthropic; a local HTTP server stands in for the CrowdStrike AIDR endpoint (we have no CrowdStrike account) and records the exact JSON the proxy sends, which is the only thing this change affects. A virtual key is bound to a user that has an email so user_api_key_user_id and user_api_key_user_email are populated.

Three request shapes, run once before the fix and once after:

KEY=...   # key bound to a user with email alice@example.com
for BODY in \
  '{"model":"e2e-haiku","messages":[{"role":"user","content":"hello"}]}' \
  '{"model":"e2e-haiku","messages":[{"role":"user","content":"hello"}],"litellm_metadata":null}' \
  '{"model":"e2e-haiku","messages":[{"role":"user","content":"hello"}],"litellm_metadata":{"trace_id":"t1"}}'; do
  curl -s http://localhost:4001/v1/chat/completions \
    -H "Authorization: Bearer $KEY" -H "Content-Type: application/json" -d "$BODY" -o /dev/null
done
# then read the request-side payloads the proxy POSTed to the CrowdStrike endpoint

What the proxy sent to CrowdStrike (request-side guard call), before the fix:

no litellm_metadata        -> user_id=cs-e2e-user  extra_info={"user_name":"alice@example.com"}
litellm_metadata: null     -> user_id=null         extra_info absent        # identity dropped
litellm_metadata: {trace}  -> user_id=null         extra_info={}            # identity dropped

After the fix:

no litellm_metadata        -> user_id=cs-e2e-user  extra_info={"user_name":"alice@example.com"}
litellm_metadata: null     -> user_id=cs-e2e-user  extra_info={"user_name":"alice@example.com"}
litellm_metadata: {trace}  -> user_id=cs-e2e-user  extra_info={"user_name":"alice@example.com"}

Unit tests on the touched file:

$ poetry run pytest -q tests/test_litellm/proxy/guardrails/guardrail_hooks/test_crowdstrike_aidr.py
17 passed

The new parametrized test test_apply_guardrail_reads_identity_from_either_metadata_bag fails on the pre-fix code (KeyError on user_id) for the null, user-dict, and non-mapping litellm_metadata cases, and passes after

Type

🐛 Bug Fix

Changes

The guardrail now reads identity from both metadata bags instead of choosing one by presence. It merges metadata and litellm_metadata (each only when it is a Mapping) and reads user_api_key_user_id and user_api_key_user_email from the merged view, so the identity is found regardless of which bag the proxy wrote it to. This is safe against caller forgery because the proxy strips every user_api_key_* key from caller-supplied metadata before guardrails run and writes the authenticated values into exactly one bag, so there is no collision and no way for a caller to inject an identity. Behavior is unchanged when neither bag is present (no user_id, model, or extra_info added) and when only one bag carries the identity

@greptile-apps

greptile-apps Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes a silent identity-drop bug in the CrowdStrike AIDR guardrail where user_id and extra_info.user_name were not sent to the guard endpoint whenever the caller included a litellm_metadata key in the request body (even as null), because dict.get("litellm_metadata", fallback) returns the stored None rather than the fallback.

  • The fix replaces the single dict.get with a loop that iterates both metadata and litellm_metadata bags, merging each into a fresh dict only when it is a Mapping, then reads identity from the merged view.
  • A new parametrized test covers the four problematic shapes (null, trace dict, non-mapping, and identity-in-litellm_metadata) and is verified to fail on the pre-fix code.

Confidence Score: 5/5

Safe to merge; the change is a minimal, targeted fix to a single helper that builds a guard payload, and the proxy's pre-call stripping of user_api_key_* keys from both bags ensures no forged identity can surface through the merge.

The fix is straightforward and self-contained. The proxy already strips user_api_key_* from caller-supplied metadata before guardrails run (litellm_pre_call_utils.py lines 1516–1531), so the new merge-then-read pattern cannot be exploited to forge an identity. The merge order (metadata first, litellm_metadata second) is correct for both the /chat/completions path (identity in metadata) and the assistant/thread path (identity in litellm_metadata). The new parametrized test is well-constructed and directly targets the regression.

No files require special attention.

Important Files Changed

Filename Overview
litellm/proxy/guardrails/guardrail_hooks/crowdstrike_aidr/crowdstrike_aidr.py Replaces single-bag identity resolution with a two-bag merge loop; the merge is safe because litellm_pre_call_utils already strips user_api_key_* keys from both bags before guardrails run.
tests/test_litellm/proxy/guardrails/guardrail_hooks/test_crowdstrike_aidr.py Adds a parametrized test covering four metadata-bag combinations (litellm_metadata null, user dict, non-mapping, and identity-in-litellm_metadata); no existing tests modified, mocks remain unchanged.

Reviews (1): Last reviewed commit: "fix(guardrails): read CrowdStrike AIDR i..." | Re-trigger Greptile

@codecov

codecov Bot commented Jun 9, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first
@yuneng-berri yuneng-berri force-pushed the litellm_/kind-leakey-e57eca branch from 3f34d4f to 92ca38d Compare June 9, 2026 00:30
@yuneng-berri yuneng-berri enabled auto-merge (squash) June 9, 2026 00:42
@yuneng-berri yuneng-berri merged commit 1bbaf1c into litellm_internal_staging Jun 9, 2026
118 checks passed
@yuneng-berri yuneng-berri deleted the litellm_/kind-leakey-e57eca branch June 9, 2026 00:46
yuneng-berri added a commit that referenced this pull request Jun 11, 2026
…gs (#29991)

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first

(cherry picked from commit 1bbaf1c)
yuneng-berri added a commit that referenced this pull request Jun 11, 2026
yuneng-berri added a commit that referenced this pull request Jun 11, 2026
…gs (#29991)

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first

(cherry picked from commit 1bbaf1c)
yuneng-berri added a commit that referenced this pull request Jun 11, 2026
chore(release): backport #30064, #29991, #30009 to stable/1.86.x and cut 1.86.5
mateo-berri pushed a commit that referenced this pull request Jun 11, 2026
…gs (#29991)

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first

(cherry picked from commit 1bbaf1c)
mateo-berri pushed a commit that referenced this pull request Jun 11, 2026
…gs (#29991)

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first

(cherry picked from commit 1bbaf1c)
mateo-berri added a commit that referenced this pull request Jun 11, 2026
…AIDR, Mantle SigV4, NetApp streaming-cost fix, and team-scoped Datadog toward v1.89.0-rc.3 (#30179)

* fix(proxy): authorize batch files using upload target_model_names (LIT-3593) (#30009)

* fix(proxy): authorize batch files using upload target_model_names (LIT-3593)

After replace_model_in_jsonl, body.model is a stripped provider id. Reverse-mapping it via resolve_model_name_from_model_id is first-match on model_list and caused false 403s when multiple deployments share the same stripped name. Use target_model_names from the unified file id instead.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(proxy): restore resolve_model_name_from_model_id for JSONL fallback path (LIT-3593)

Restores the reverse-lookup for the JSONL body.model fallback path so that
legacy/pre-target_model_names managed files still map stripped provider IDs
back to proxy aliases before auth. Also cleans up redundant `or None`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Revert "fix(proxy): restore resolve_model_name_from_model_id for JSONL fallback path (LIT-3593)"

This reverts commit 30d2e96.

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit 2cd7e87)

* feat(guardrails): capture user and model metadata in CrowdStrike AIDR

(cherry picked from commit 6fc715c)

* fix(guardrails): read CrowdStrike AIDR identity from both metadata bags (#29991)

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first

(cherry picked from commit 1bbaf1c)

* feat(bedrock_mantle): add SigV4/IAM auth to Responses API route (#29788)

Applied as the squash diff of PR #29788 (head 9800b2f), which landed
upstream inside the litellm_oss_staging_080626 sync (32c88ca, #29932)
and has no standalone commit to cherry-pick. The rc line already carries
the prerequisite #29490 Responses route via the 040626 sync.

* fix: completion_cost AttributeError on streaming Anthropic web_search responses (#26153) (#27346)

Cherry-picked from staging squash 4a3860d.

The rc line predates the Usage.__init__ server_tool_use dict->ServerToolUse
coercion that staging carries (it landed via the squashed OSS sync #29932 /
32c88ca, not as a standalone commit). The calculate_usage
Usage(**returned_usage.model_dump()) round-trip re-serializes server_tool_use
to a plain dict, so without that coercion the rebuilt usage holds a dict and the
regression test asserting a ServerToolUse type fails. Restored the coercion in
litellm/types/utils.py to satisfy the prerequisite -- it matches #27346's own
first commit (coerce server_tool_use dict to ServerToolUse in Usage.__init__),
which was dropped from the squash only because staging already carried it.

* feat(datadog): add team-scoped Datadog callback support (#29947)

Cherry-picked from the PR head 9c049da (single-commit PR, merged to
litellm_oss_branch). Applied cleanly; no conflicts.

Note: black --check in this worktree flags pre-existing multi-line string
formatting in litellm_core_utils/litellm_logging.py (lines ~1006-1050) that is
already present on the patch/v1.89.0-rc.1 base and is untouched by this pick --
left as-is to avoid reformatting unrelated lines.

---------

Co-authored-by: Sameer Kankute <sameer@berri.ai>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Kenan Yildirim <kenan@kenany.me>
Co-authored-by: yuneng-jiang <yuneng@berri.ai>
Co-authored-by: Kent <kingdooo@gmail.com>
Co-authored-by: ishaan-berri <155045088+ishaan-berri@users.noreply.github.com>
Co-authored-by: aanchal22 <12680748+aanchal22@users.noreply.github.com>
mrgizmo212 pushed a commit to ttgaillc/ttginf that referenced this pull request Jun 16, 2026
…gs (BerriAI#29991)

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first

(cherry picked from commit 1bbaf1c)
michaelxer pushed a commit to michaelxer/litellm that referenced this pull request Jun 17, 2026
…gs (BerriAI#29991)

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first
michaelxer pushed a commit to michaelxer/litellm that referenced this pull request Jun 17, 2026
…gs (BerriAI#29991)

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first
koladefaj pushed a commit to koladefaj/litellm that referenced this pull request Jun 17, 2026
…gs (BerriAI#29991)

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first
factnn pushed a commit to factnn/litellm that referenced this pull request Jun 18, 2026
…gs (BerriAI#29991)

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first
marioleonardo pushed a commit to marioleonardo/litellm-mongodb-patched that referenced this pull request Jun 19, 2026
…gs (BerriAI#29991)

Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first

(cherry picked from commit 1bbaf1c)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants