fix(guardrails): read CrowdStrike AIDR identity from both metadata bags#29991
Conversation
Greptile SummaryThis PR fixes a silent identity-drop bug in the CrowdStrike AIDR guardrail where
Confidence Score: 5/5Safe to merge; the change is a minimal, targeted fix to a single helper that builds a guard payload, and the proxy's pre-call stripping of user_api_key_* keys from both bags ensures no forged identity can surface through the merge. The fix is straightforward and self-contained. The proxy already strips user_api_key_* from caller-supplied metadata before guardrails run (litellm_pre_call_utils.py lines 1516–1531), so the new merge-then-read pattern cannot be exploited to forge an identity. The merge order (metadata first, litellm_metadata second) is correct for both the /chat/completions path (identity in metadata) and the assistant/thread path (identity in litellm_metadata). The new parametrized test is well-constructed and directly targets the regression. No files require special attention.
|
| Filename | Overview |
|---|---|
| litellm/proxy/guardrails/guardrail_hooks/crowdstrike_aidr/crowdstrike_aidr.py | Replaces single-bag identity resolution with a two-bag merge loop; the merge is safe because litellm_pre_call_utils already strips user_api_key_* keys from both bags before guardrails run. |
| tests/test_litellm/proxy/guardrails/guardrail_hooks/test_crowdstrike_aidr.py | Adds a parametrized test covering four metadata-bag combinations (litellm_metadata null, user dict, non-mapping, and identity-in-litellm_metadata); no existing tests modified, mocks remain unchanged. |
Reviews (1): Last reviewed commit: "fix(guardrails): read CrowdStrike AIDR i..." | Re-trigger Greptile
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first
3f34d4f to
92ca38d
Compare
…gs (#29991) Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first (cherry picked from commit 1bbaf1c)
…gs (#29991) Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first (cherry picked from commit 1bbaf1c)
…gs (#29991) Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first (cherry picked from commit 1bbaf1c)
…gs (#29991) Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first (cherry picked from commit 1bbaf1c)
…AIDR, Mantle SigV4, NetApp streaming-cost fix, and team-scoped Datadog toward v1.89.0-rc.3 (#30179) * fix(proxy): authorize batch files using upload target_model_names (LIT-3593) (#30009) * fix(proxy): authorize batch files using upload target_model_names (LIT-3593) After replace_model_in_jsonl, body.model is a stripped provider id. Reverse-mapping it via resolve_model_name_from_model_id is first-match on model_list and caused false 403s when multiple deployments share the same stripped name. Use target_model_names from the unified file id instead. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(proxy): restore resolve_model_name_from_model_id for JSONL fallback path (LIT-3593) Restores the reverse-lookup for the JSONL body.model fallback path so that legacy/pre-target_model_names managed files still map stripped provider IDs back to proxy aliases before auth. Also cleans up redundant `or None`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Revert "fix(proxy): restore resolve_model_name_from_model_id for JSONL fallback path (LIT-3593)" This reverts commit 30d2e96. --------- Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> (cherry picked from commit 2cd7e87) * feat(guardrails): capture user and model metadata in CrowdStrike AIDR (cherry picked from commit 6fc715c) * fix(guardrails): read CrowdStrike AIDR identity from both metadata bags (#29991) Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first (cherry picked from commit 1bbaf1c) * feat(bedrock_mantle): add SigV4/IAM auth to Responses API route (#29788) Applied as the squash diff of PR #29788 (head 9800b2f), which landed upstream inside the litellm_oss_staging_080626 sync (32c88ca, #29932) and has no standalone commit to cherry-pick. The rc line already carries the prerequisite #29490 Responses route via the 040626 sync. * fix: completion_cost AttributeError on streaming Anthropic web_search responses (#26153) (#27346) Cherry-picked from staging squash 4a3860d. The rc line predates the Usage.__init__ server_tool_use dict->ServerToolUse coercion that staging carries (it landed via the squashed OSS sync #29932 / 32c88ca, not as a standalone commit). The calculate_usage Usage(**returned_usage.model_dump()) round-trip re-serializes server_tool_use to a plain dict, so without that coercion the rebuilt usage holds a dict and the regression test asserting a ServerToolUse type fails. Restored the coercion in litellm/types/utils.py to satisfy the prerequisite -- it matches #27346's own first commit (coerce server_tool_use dict to ServerToolUse in Usage.__init__), which was dropped from the squash only because staging already carried it. * feat(datadog): add team-scoped Datadog callback support (#29947) Cherry-picked from the PR head 9c049da (single-commit PR, merged to litellm_oss_branch). Applied cleanly; no conflicts. Note: black --check in this worktree flags pre-existing multi-line string formatting in litellm_core_utils/litellm_logging.py (lines ~1006-1050) that is already present on the patch/v1.89.0-rc.1 base and is untouched by this pick -- left as-is to avoid reformatting unrelated lines. --------- Co-authored-by: Sameer Kankute <sameer@berri.ai> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Kenan Yildirim <kenan@kenany.me> Co-authored-by: yuneng-jiang <yuneng@berri.ai> Co-authored-by: Kent <kingdooo@gmail.com> Co-authored-by: ishaan-berri <155045088+ishaan-berri@users.noreply.github.com> Co-authored-by: aanchal22 <12680748+aanchal22@users.noreply.github.com>
…gs (BerriAI#29991) Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first (cherry picked from commit 1bbaf1c)
…gs (BerriAI#29991) Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first
…gs (BerriAI#29991) Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first
…gs (BerriAI#29991) Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first
…gs (BerriAI#29991) Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first
…gs (BerriAI#29991) Capture user_id and extra_info from metadata or litellm_metadata. The single-bag read dropped identity whenever a request carried a present litellm_metadata field (null or a user-supplied dict), since /chat/completions routes the authenticated identity into metadata while the guardrail read litellm_metadata first (cherry picked from commit 1bbaf1c)
Relevant issues
Follow-up to #29517, which added user and model metadata capture to the CrowdStrike AIDR guardrail. That change resolved the metadata from a single bag with
request_data.get("litellm_metadata", request_data.get("metadata")). Becausedict.getreturns the stored value when the key is present, a presentlitellm_metadata(an explicitnull, or a caller-supplied dict used for tracing) shadowsmetadata, which is where/chat/completionsactually places the authenticated identity. The result is thatuser_idandextra_info.user_namewere silently dropped from the outbound guard payload for any request that carried alitellm_metadatafield. The common case (nolitellm_metadatain the body) still worked, so the original tests passedLinear ticket
N/A
Pre-Submission checklist
make test-unit@greptileaiand received a Confidence Score of at least 4/5 before requesting a maintainer reviewScreenshots / Proof of Fix
Verified end to end against a live proxy. The chat completion hits real Anthropic; a local HTTP server stands in for the CrowdStrike AIDR endpoint (we have no CrowdStrike account) and records the exact JSON the proxy sends, which is the only thing this change affects. A virtual key is bound to a user that has an email so
user_api_key_user_idanduser_api_key_user_emailare populated.Three request shapes, run once before the fix and once after:
What the proxy sent to CrowdStrike (request-side guard call), before the fix:
After the fix:
Unit tests on the touched file:
The new parametrized test
test_apply_guardrail_reads_identity_from_either_metadata_bagfails on the pre-fix code (KeyError onuser_id) for thenull, user-dict, and non-mappinglitellm_metadatacases, and passes afterType
🐛 Bug Fix
Changes
The guardrail now reads identity from both metadata bags instead of choosing one by presence. It merges
metadataandlitellm_metadata(each only when it is a Mapping) and readsuser_api_key_user_idanduser_api_key_user_emailfrom the merged view, so the identity is found regardless of which bag the proxy wrote it to. This is safe against caller forgery because the proxy strips everyuser_api_key_*key from caller-supplied metadata before guardrails run and writes the authenticated values into exactly one bag, so there is no collision and no way for a caller to inject an identity. Behavior is unchanged when neither bag is present (nouser_id,model, orextra_infoadded) and when only one bag carries the identity