fix(logging): redact provider_specific_fields and request-body snapshots when message logging is off #28611
…ots when message logging is off
Greptile Summary

This PR closes three prompt-leakage surfaces that survived the original
Confidence Score: 4/5

Safe to merge; the fix is purely additive and all new code is guarded by isinstance checks so it cannot affect the non-redaction path.

The implemented fixes are correct and consistent with existing in-place mutation patterns throughout perform_redaction. The two new body-snapshot helpers do not yet cover the system key (Anthropic/Bedrock native system prompt), which means a request with a top-level system prompt will still expose that content through proxy_server_request.body and complete_input_dict even after this patch. The PSF redaction uses a hardcoded key tuple that will silently miss any new reasoning-related fields future providers add to provider_specific_fields.

litellm/litellm_core_utils/redact_messages.py, specifically the two new body-snapshot helpers and the PSF key tuple.

| Filename | Overview |
|---|---|
| litellm/litellm_core_utils/redact_messages.py | Adds three targeted helpers to close three prompt-leakage paths previously missed by perform_redaction; system/system_prompt/instructions keys are not yet covered in the two new body-snapshot helpers. |
| tests/test_litellm/litellm_core_utils/test_redact_messages.py | Adds 8 new unit tests covering all three new helpers across object and dict paths, plus safe-when-absent edge cases; no real network calls, no mock weakening, all assertions are additive. |
```python
if "messages" in body:
    body["messages"] = [{"role": "user", "content": "redacted-by-litellm"}]
if "prompt" in body:
    body["prompt"] = ""
if "input" in body:
    body["input"] = ""
```
system key not redacted in proxy body snapshot
Both _redact_proxy_server_request_body and _redact_additional_args_complete_input_dict only clear messages, prompt, and input, leaving system (Anthropic/Bedrock native top-level system prompt), system_prompt, and instructions untouched. Any custom logger inspecting proxy_server_request.body.system or complete_input_dict.system will still receive the unredacted system prompt when turn_off_message_logging is enabled. The PR description calls this out as a known follow-up, but since these two helpers are the entry point for fixing the proxy-body and wire-request leaks, users who rely on this fix to prevent all prompt leakage will still be exposed through the system-prompt path.
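One way the gap described in this comment could be closed is sketched below. It is a minimal illustration, not the merged code: the function name `redact_body_snapshot` and the `_TEXT_KEYS` tuple are hypothetical, and the choice to blank the system-prompt keys with an empty string simply mirrors how the existing helper treats `prompt` and `input`.

```python
from typing import Any

REDACTED = "redacted-by-litellm"

# Hypothetical key list for illustration; the real helpers may cover a different set.
_TEXT_KEYS = ("prompt", "input", "system", "system_prompt", "instructions")


def redact_body_snapshot(body: Any) -> None:
    """Redact prompt-bearing keys of a request-body dict in place."""
    if not isinstance(body, dict):
        return  # nothing to do for None or non-dict snapshots
    if "messages" in body:
        body["messages"] = [{"role": "user", "content": REDACTED}]
    for key in _TEXT_KEYS:
        if key in body:
            body[key] = ""


body = {
    "model": "claude",
    "system": "secret system prompt",
    "messages": [{"role": "user", "content": "hi"}],
}
redact_body_snapshot(body)
```

Keys that are absent are never created, so a body without a `system` entry keeps its original shape.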
```python
_PROVIDER_SPECIFIC_REASONING_KEYS = (
    "reasoning_content",
    "thinking_blocks",
    "reasoningContentBlocks",
)


def _redact_provider_specific_fields(psf, redacted_str: str = "redacted-by-litellm"):
    """Scrub reasoning-content duplicates inside Message.provider_specific_fields."""
    if not isinstance(psf, dict):
        return
    for key in _PROVIDER_SPECIFIC_REASONING_KEYS:
        if key not in psf:
            continue
        if key == "reasoning_content":
            psf[key] = redacted_str
        else:
            psf[key] = None
```
Hardcoded key list won't cover new provider PSF fields
_PROVIDER_SPECIFIC_REASONING_KEYS names exactly three keys. If Anthropic or Bedrock later populate additional provider_specific_fields entries that carry raw reasoning or input content (e.g., a citations blob or future reasoning formats), those keys will pass through unredacted with no code change to catch them. Consider documenting in the module docstring that PSF redaction is intentionally key-allowlist-based so future provider additions know to update this tuple.
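The allowlist behaviour this comment describes can be seen in a small standalone reproduction. The sketch below copies the key tuple and loop from the diff into a local function (`redact_psf` is a stand-in name) and feeds it a dict with one extra key, `future_reasoning_trace`, which is invented purely for illustration.

```python
from typing import Any

# Same three keys as the tuple in the diff.
REASONING_KEYS = ("reasoning_content", "thinking_blocks", "reasoningContentBlocks")


def redact_psf(psf: Any, redacted_str: str = "redacted-by-litellm") -> None:
    """Stand-in for the helper in the diff: redact only allowlisted keys."""
    if not isinstance(psf, dict):
        return
    for key in REASONING_KEYS:
        if key not in psf:
            continue
        psf[key] = redacted_str if key == "reasoning_content" else None


psf = {
    "reasoning_content": "raw chain of thought",
    "thinking_blocks": [{"type": "thinking", "thinking": "raw"}],
    "future_reasoning_trace": "raw",  # hypothetical key a provider might add later
}
redact_psf(psf)
```

After the call, the two known keys are scrubbed while the hypothetical third key is left untouched, which is the case the comment asks to be documented.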
mateo-berri left a comment
Are the greptile P2's worth addressing?
PR overview

This pull request updates LiteLLM logging redaction behavior so provider-specific fields and request-body snapshots are suppressed when message logging is disabled. The touched redaction path also affects how proxy request bodies are prepared for spend logging. Most of the reported redaction gaps have been addressed, with 3 issues already fixed. One open issue remains where the spend-log payload path can still persist sensitive request-body fields because it does not apply the same request-body redaction logic. The remaining exposure is limited to data written into spend logs, but it can still be triggered by an authenticated caller supplying sensitive content in affected request fields.

Open issues (1)

Fixed/addressed: 3 · PR risk: 5/10
…CR keys in body snapshot
…message logging is off
…itellm_fix_redaction_psf_and_body_snapshots # Conflicts: # litellm/litellm_core_utils/redact_messages.py # tests/test_litellm/litellm_core_utils/test_redact_messages.py
```python
if not isinstance(body, dict):
    return

_redact_request_body_dict(body)
```
Medium: Request body fields still leak to spend logs
_get_proxy_server_request_for_spend_logs_payload calls perform_redaction(model_call_details=_request_body, result=None) with the request body itself, not with a nested litellm_params.proxy_server_request.body. That means an authenticated caller can put sensitive input in contents, query, documents, document, or system-prompt fields and have it written to spend logs despite message logging redaction being enabled; reuse _redact_request_body_dict in that spend-log path or make perform_redaction apply the same body-key redaction when it is called with a raw request body.
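A rough sketch of the second option suggested here, applying the same body-key redaction whether the dict is a nested snapshot or a raw request body, might look like the following. Everything in it is an assumption for illustration: `redact_request_body_dict` and `redact_call_details` are local stand-ins rather than the functions in `redact_messages.py`, and the key tuple is taken from the fields named in this comment.

```python
from typing import Any, Dict

REDACTED = "redacted-by-litellm"

# Illustrative key list taken from the review comment; not the merged code.
TEXT_BODY_KEYS = ("prompt", "input", "contents", "query", "documents", "document")


def redact_request_body_dict(body: Any) -> None:
    """Redact prompt-bearing keys of a request body in place (hypothetical)."""
    if not isinstance(body, dict):
        return
    if "messages" in body:
        body["messages"] = [{"role": "user", "content": REDACTED}]
    for key in TEXT_BODY_KEYS:
        if key in body:
            body[key] = ""  # note: list-valued fields are blanked to a string here


def redact_call_details(details: Dict[str, Any]) -> None:
    """Cover both shapes: a nested proxy snapshot and a raw request body."""
    litellm_params = details.get("litellm_params")
    if isinstance(litellm_params, dict):
        proxy_request = litellm_params.get("proxy_server_request")
        if isinstance(proxy_request, dict):
            redact_request_body_dict(proxy_request.get("body"))
    # The spend-log path passes the raw body itself, so redact the top level too.
    redact_request_body_dict(details)


raw_body = {"model": "rerank-model", "query": "secret query", "documents": ["secret doc"]}
redact_call_details(raw_body)

nested = {"litellm_params": {"proxy_server_request": {"body": {"contents": "secret"}}}}
redact_call_details(nested)
```

Whether redacting the top level is safe for every caller of `perform_redaction` would need checking against the real call sites; the sketch only shows that one shared helper can serve both shapes.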
Relevant issues
Related to closed issues #16336 (the proxy `body` snapshot leaking the prompt) and #15822 (the residual `provider_specific_fields` gap that survived the original flat-field fix). Both describe the bug class addressed here.

Linear ticket
Pre-Submission checklist
- `tests/test_litellm/` directory: 8 new tests on `TestPerformRedaction` in `tests/test_litellm/litellm_core_utils/test_redact_messages.py`.
- `make test-unit`: focused suite (`pytest tests/test_litellm/litellm_core_utils/test_redact_messages.py`) passes 27/27; full `make test-unit` not yet run locally, relying on CI.
- `perform_redaction`. The independent self-referencing snapshot bug surfaced during the same investigation is going up as a separate PR.
- `@greptileai` and received a Confidence Score of at least 4/5.

CI (LiteLLM team)
Screenshots / Proof of Fix
Verified end-to-end against a minimal `CustomLogger` registered as a `success_callback` on a local proxy. Three request shapes were inspected before and after the patch: no redaction header, redaction header + non-reasoning chat completion, redaction header + Anthropic extended-thinking. The relevant kwargs paths captured by the logger:

Before (redaction header set, Anthropic extended-thinking request):
```
kwargs.litellm_params.proxy_server_request.body.messages[0].content
-> 'CANARY_INPUT_should_be_redacted'
response_obj.choices[0].message.provider_specific_fields.thinking_blocks[0].thinking
-> ''
kwargs.additional_args.complete_input_dict.messages[0].content[0].text
-> 'CANARY_INPUT_should_be_redacted'
```
After:
```
kwargs.litellm_params.proxy_server_request.body.messages
-> [{"role": "user", "content": "redacted-by-litellm"}]
response_obj.choices[0].message.provider_specific_fields.thinking_blocks
-> null
kwargs.additional_args.complete_input_dict.messages
-> [{"role": "user", "content": "redacted-by-litellm"}]
```
Ground-truth canary-search across the full success-event kwargs returns zero hits for either input or reasoning canaries on the patched build.
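A canary search of this kind can be approximated with a short recursive walk. The sketch below is not the script used for the verification above; `find_canaries` is a hypothetical helper that works on plain dicts, lists, and objects with a `__dict__`, and it tracks visited object ids so that self-referencing structures do not recurse forever.

```python
from typing import Any, Iterator, Optional, Set


def find_canaries(
    obj: Any, canary: str, path: str = "kwargs", _seen: Optional[Set[int]] = None
) -> Iterator[str]:
    """Yield the path of every string value that contains the canary."""
    seen = set() if _seen is None else _seen
    if isinstance(obj, str):
        if canary in obj:
            yield path
        return
    if id(obj) in seen:
        return  # already visited; guards against reference cycles
    seen.add(id(obj))
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from find_canaries(value, canary, f"{path}.{key}", seen)
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            yield from find_canaries(value, canary, f"{path}[{index}]", seen)
    elif hasattr(obj, "__dict__"):
        yield from find_canaries(vars(obj), canary, path, seen)


before = {
    "litellm_params": {
        "proxy_server_request": {
            "body": {
                "messages": [
                    {"role": "user", "content": "CANARY_INPUT_should_be_redacted"}
                ]
            }
        }
    }
}
after = {
    "litellm_params": {
        "proxy_server_request": {
            "body": {"messages": [{"role": "user", "content": "redacted-by-litellm"}]}
        }
    }
}
hits_before = list(find_canaries(before, "CANARY_INPUT"))
hits_after = list(find_canaries(after, "CANARY_INPUT"))
```

The walk only inspects values, not dict keys, and an object that is shared between two branches is reported only at the first path it is reached by.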
Type
🐛 Bug Fix
Changes
`perform_redaction` was scrubbing only the top-level `messages` / `prompt` / `input` keys on `model_call_details` and the `standard_logging_object` payload when message logging was disabled. Three additional surfaces on the same custom-logger kwargs dict still carried the user prompt and reasoning content:
Fix is three small additive helpers in `litellm/litellm_core_utils/redact_messages.py`:
All three helpers are guarded on dict/type checks and only overwrite keys when present — no effect on the non-redaction path, no signature changes, no refactors elsewhere. Backwards-compatible.
Known follow-ups (out of scope here, intentionally narrowed):