fix(bedrock_guardrails): select latest user message by original role in apply_guardrail by michelligabriele · Pull Request #30482 · BerriAI/litellm

michelligabriele · 2026-06-15T20:03:44Z

Relevant issues

Supersedes #25355 — that PR guarded at content-build time, but the role information is already lost upstream at message-selection time, so it did not stop the leak. This fixes the selection itself.

Linear ticket

Pre-Submission checklist

I have added meaningful tests
My PR passes all unit tests on make test-unit — ran the affected enterprise guardrail suite locally (14 passed); full make test-unit deferred to CI
My PR's scope is as isolated as possible; it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Screenshots / Proof of Fix

With experimental_use_latest_role_message_only enabled, the unified apply_guardrail path scanned the latest message of any role as the Bedrock INPUT, rather than the latest user-role message. In tool-calling conversations ending in a tool/assistant message, that non-user content was sent to the ApplyGuardrail INPUT scan.

The new regression tests reproduce this on the pre-fix code and pass with the fix.

Before the fix (conversation ending in a tool message — the INPUT scan receives the tool result):

>       assert kwargs["messages"] == [data["messages"][1]]   # expected the user message
E       AssertionError: At index 0 diff:
E         {'role': 'user', 'content': 'TOOL secret output'}        # what was actually scanned
E         != {'role': 'user', 'content': 'my SSN is 123-45-6789'}  # the real latest user message

After the fix — the latest original-role user message is the only content scanned, and masked content is written back to that message's position only (system/assistant/tool untouched):

tests/enterprise/litellm_enterprise/proxy/guardrails/test_bedrock_apply_guardrail.py
..............                                                            [100%]
14 passed

Type

🐛 Bug Fix
✅ Test

Changes

apply_guardrail now selects the latest message whose original role is user from inputs["structured_messages"] (falling back to request_data["messages"]), instead of wrapping every flattened text as role="user" and taking the latest of any role.
Skips the INPUT scan entirely when there is no user-role message or the latest user message has no text content.
Writes masked content back to the correct slice of the flat texts list, keeping it aligned with the translation handler's positional message↔text mapping (no whole-list clobber when only a subset is scanned).
Tests: latest-user selection on a tool-ending conversation; skip when no user message; masked write-back to the correct position; and an end-to-end test through the real OpenAIChatCompletionsHandler.process_input_messages (only the network call mocked) asserting the mask lands on the user message and other roles are left unchanged.

…in apply_guardrail (#23476)

…ndler (#23476)

codecov · 2026-06-15T20:07:52Z

Codecov Report

❌ Patch coverage is 28.81356% with 42 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...y/guardrails/guardrail_hooks/bedrock_guardrails.py	28.81%	42 Missing ⚠️

📢 Thoughts on this report? Let us know!

greptile-apps · 2026-06-15T20:10:55Z

Greptile Summary

This PR fixes a message-selection bug in BedrockGuardrail.apply_guardrail: when experimental_use_latest_role_message_only is enabled, agentic conversations ending in a tool or assistant message were leaking that non-user content to the Bedrock INPUT scan because the old code wrapped every flat text as a role="user" mock and then picked the "latest user message."

Introduces _select_messages_for_apply_guardrail which resolves role from inputs["structured_messages"] (or falls back to request_data["messages"]), finds the last true user-role message, and skips the scan entirely when none exists or when the user message has no text content.
Adds _locate_message_texts_slice to map the selected message back to its positional slice in the flat texts list, so masked content is written to the right indices and other roles are left untouched.
Five regression tests (all mocked) cover the tool-ending conversation, no-user-message skip, positional write-back, and an end-to-end handler integration path.

Confidence Score: 4/5

Safe to merge for the core bug fix; the new role-selection and slice write-back logic is well-tested, the fallback paths preserve pre-existing behaviour, and no real network calls are made in tests.

The fix is tightly scoped and the five new regression tests cover all the main scenarios. Two minor gaps exist: the write-back guard is not exhaustive for the edge case where scanned_role_subset=True, scanned_slice=None, and lengths happen to equal; and the fallback to request_data messages can silently skip masking when skip-message flags are active for direct API callers. Neither is a regression introduced by this PR, but they are worth closing before the feature is promoted beyond experimental.

bedrock_guardrails.py — specifically the write-back block after line 1846 and the structured_messages fallback at line 455.

Important Files Changed

Filename	Overview
litellm/proxy/guardrails/guardrail_hooks/bedrock_guardrails.py	Adds `_select_messages_for_apply_guardrail` and `_locate_message_texts_slice` helpers to fix role-based message selection in `apply_guardrail`; includes one gap in the write-back guard logic for an unlikely edge case.
tests/enterprise/litellm_enterprise/proxy/guardrails/test_bedrock_apply_guardrail.py	Adds 5 new regression tests covering tool-ending conversations, no-user-message skip, positional write-back, and an end-to-end handler integration test; all mocked correctly with no real network calls.

_{Reviews (1): Last reviewed commit: "test(bedrock_guardrails): cover masking ..." | Re-trigger Greptile}

greptile-apps · 2026-06-15T20:10:59Z

+            elif scanned_role_subset and len(masked_texts) != len(texts):
+                # Scanned a role-selected subset but could not map it back to
+                # flat-text positions — keep the original texts rather than
+                # misapply masked content to the wrong message.
+                verbose_proxy_logger.warning(


Unguarded write-back when scanned_role_subset=True, scanned_slice=None, and lengths happen to match

The guard at line 1858 catches the case where a role-subset scan produced a differently-sized masked_texts list. However, there is no branch for when scanned_role_subset=True, scanned_slice=None (i.e. _locate_message_texts_slice failed), and len(masked_texts) == len(texts). In that state the single masked text returned by Bedrock would silently propagate as the sole element of inputs["texts"], clobbering everything downstream in the handler's positional write-back.

In practice this requires len(texts) == 1 simultaneously with a total != len(texts) mismatch in _locate_message_texts_slice, which is a very narrow window. Replacing the length check with a blanket scanned_role_subset and scanned_slice is None guard would close the gap entirely.

greptile-apps · 2026-06-15T20:11:00Z

+        structured_messages = cast(
+            Optional[List[AllMessageValues]],
+            inputs.get("structured_messages") or request_data.get("messages"),
+        )
+        if input_type != "request" or not structured_messages:
+            # No role information available (e.g. raw-text callers like
+            # /guardrails/apply_guardrail) — keep the legacy behavior of
+            # scanning the latest text only.
+            filter_result = self._prepare_guardrail_messages_for_role(
+                messages=mock_messages
+            )
+            return ApplyGuardrailMessageSelection(
+                filtered_messages=filter_result.payload_messages or mock_messages,
+                scanned_slice=None,
+                scanned_role_subset=False,
+            )
+
+        latest_user_index = self._find_latest_message_index(
+            structured_messages, target_role="user"
+        )
+        if latest_user_index is None:
+            verbose_proxy_logger.debug(
+                "Bedrock Guardrail: no user-role message in request, skipping INPUT scan"
+            )
+            return ApplyGuardrailMessageSelection(None, None, True, skip_scan=True)
+


Fallback to request_data["messages"] may desync structured_messages from texts when skip flags are active

When inputs["structured_messages"] is absent (e.g., a direct /guardrails/apply_guardrail caller), the code falls back to request_data["messages"] — the unfiltered message list. If skip_system_message_in_guardrail or skip_tool_message_in_guardrail is set, texts will have fewer entries than the unfiltered message list, causing _locate_message_texts_slice to detect total != len(texts) and return None. The code then falls into the warning-and-skip branch, which discards the masking.

This is not a regression, but it means the fix silently does nothing for that caller+flag combination. A comment at the fallback site would help future maintainers understand why request_data["messages"] is only safe here when no skip flags are active.

Sameerlite · 2026-06-16T02:56:52Z

@michelligabriele can you fix the greptile comments?

michelligabriele added 2 commits June 15, 2026 21:49

fix(bedrock_guardrails): select latest user message by original role …

b294ec5

…in apply_guardrail (#23476)

test(bedrock_guardrails): cover masking write-back through unified ha…

40cf9f3

…ndler (#23476)

greptile-apps Bot reviewed Jun 15, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(bedrock_guardrails): select latest user message by original role in apply_guardrail#30482

fix(bedrock_guardrails): select latest user message by original role in apply_guardrail#30482
michelligabriele wants to merge 2 commits into
litellm_internal_stagingfrom
litellm_fix_bedrock_guardrail_role_leak_v2

michelligabriele commented Jun 15, 2026

Uh oh!

codecov Bot commented Jun 15, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented Jun 15, 2026

Important Files Changed

Uh oh!

greptile-apps Bot Jun 15, 2026

Uh oh!

greptile-apps Bot Jun 15, 2026

Uh oh!

Sameerlite commented Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

michelligabriele commented Jun 15, 2026

Relevant issues

Linear ticket

Pre-Submission checklist

Screenshots / Proof of Fix

Type

Changes

Uh oh!

codecov Bot commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

greptile-apps Bot commented Jun 15, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Uh oh!

greptile-apps Bot Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

Sameerlite commented Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codecov Bot commented Jun 15, 2026 •

edited

Loading