Skip to content

fix(mcp): log + respond to unsupported server→client requests instead of dropping silently#28121

Closed
abdulmunimjemal wants to merge 2 commits into
BerriAI:litellm_internal_stagingfrom
abdulmunimjemal:fix/mcp-elicitation-sampling-no-silent-drop
Closed

fix(mcp): log + respond to unsupported server→client requests instead of dropping silently#28121
abdulmunimjemal wants to merge 2 commits into
BerriAI:litellm_internal_stagingfrom
abdulmunimjemal:fix/mcp-elicitation-sampling-no-silent-drop

Conversation

@abdulmunimjemal

Copy link
Copy Markdown

Relevant issues

Refs #23761.

Problem

Per the MCP spec, when an upstream MCP server sends sampling/createMessage or elicitation/create to a client that doesn't implement those capabilities, the client must respond with a JSON-RPC error so the server can fall back gracefully. Today, when an upstream server sends one of these requests through the LiteLLM MCP Gateway, the operator sees nothing: the MCP Python SDK auto-replies to the server with a generic INVALID_REQUEST error, but LiteLLM logs nothing about the event, so it looks like a silent drop from the proxy's point of view.

This is the smallest, defensive fix for the bug described in the issue title. It is intentionally scoped narrower than the full Mode A/B implementation discussed in the issue body (#27109 is in flight for that and may go a different way); the goal here is just to make the current behaviour observable and spec-compliant in a way that won't conflict with whatever the full implementation ends up looking like.

Change

litellm/experimental_mcp_client/client.py:

  • Add _LiteLLMMCPClientSession, a thin mcp.ClientSession subclass that overrides _received_request for CreateMessageRequest / ElicitRequest:
    • logs a WARNING that names the upstream server URL and the method (so the event is now visible in proxy logs);
    • responds with a JSON-RPC error using METHOD_NOT_FOUND (-32601), with a message that identifies LiteLLM as the responding peer and points at the tracking issue;
    • falls through to the SDK default for every other server→client request (ping, roots/list, task callbacks, …).
  • Use the subclass from MCPClient._execute_session_operation in place of the bare ClientSession.

tests/test_litellm/experimental_mcp_client/test_mcp_client.py:

  • Update existing @patch("...ClientSession") references to patch the new subclass symbol (existing behaviour unchanged).
  • Add TestUnsupportedServerToClientMethods covering:
    • the error message includes LiteLLM MCP Gateway, the method name, and the upstream server URL;
    • empty server_url (stdio) renders as stdio in the error message;
    • sampling/createMessage triggers a WARNING log and a METHOD_NOT_FOUND ErrorData response;
    • elicitation/create does the same for stdio servers;
    • unrelated server→client requests fall through to super()._received_request;
    • a regression guard that the subclass doesn't accidentally install custom sampling/elicitation callbacks (doing so would make the SDK declare these capabilities during initialize).

Approach (Option A, smaller scope)

The override is in _received_request rather than via the SDK's sampling_callback / elicitation_callback hooks because the SDK's identity check on those callbacks (callback is not _default_*_callback) is also what gates capability declaration during initialize. Installing custom callbacks would make the SDK advertise sampling and elicitation support to upstream servers — which we explicitly don't implement yet, and which would actively encourage servers to send more of these requests.

Why not the full Mode A/B implementation?

The issue describes a phased approach (sampling in Mode B → bidirectional relay in Mode A → elicitation in Mode B). That's a significantly larger change that touches the proxy server, the manager, and the gateway's session model. #27109 is already in flight along those lines but currently has merge conflicts and is awaiting maintainer direction.

This PR doesn't preclude any of those designs — it just upgrades the immediate behaviour from "silent at the LiteLLM layer" to "observable warning + spec-compliant JSON-RPC error", in a way that's independently useful and easy to extend later (the _LiteLLMMCPClientSession hook point is exactly where a real handler would slot in).

Test plan

  • pytest tests/test_litellm/experimental_mcp_client/ — 28 passed (13 existing + 6 new + 9 in test_tools.py).
  • Black formatting applied per CLAUDE.md.
  • Wider pytest tests/test_litellm/proxy/_experimental/mcp_server/ run: 690 passed; 1 unrelated failure in test_semantic_filter_basic_filtering due to the optional semantic_router dep not being installed in my local venv — not touched by this PR.

… of dropping silently

When an upstream MCP server sends sampling/createMessage or elicitation/create
to the LiteLLM MCP Gateway, the request used to disappear from the operator's
point of view: the MCP Python SDK auto-replied to the server with a generic
INVALID_REQUEST error, but LiteLLM logged nothing, so there was no signal that
an upstream server had attempted a capability we don't implement.

Add a _LiteLLMMCPClientSession subclass that overrides _received_request for
these two methods and:

* logs a WARNING that names the upstream server and the method, so the event
  is visible in proxy logs (no more silent drop at the LiteLLM layer);
* responds with a JSON-RPC METHOD_NOT_FOUND (-32601) error whose message
  identifies LiteLLM as the responding peer and links the tracking issue.

The override is deliberately done in _received_request rather than via the
SDK's sampling_callback / elicitation_callback hooks, because the SDK's
identity check on those callbacks would otherwise cause it to declare the
'sampling' and 'elicitation' capabilities to the upstream server during
initialize — which we don't actually implement yet.

Existing experimental_mcp_client tests are updated to patch the new symbol;
six new tests cover the unsupported-method path (logging + error response,
stdio vs http server URL, fall-through to the SDK default for unrelated
server→client requests, and a guard against accidentally declaring the
unsupported capabilities).

Refs #23761
@CLAassistant

CLAassistant commented May 17, 2026

Copy link
Copy Markdown

CLA assistant check
All committers have signed the CLA.

@greptile-apps

greptile-apps Bot commented May 17, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR upgrades the LiteLLM MCP Gateway's handling of unsupported server→client methods (sampling/createMessage, elicitation/create) from a silent pass-through to an observable, spec-compliant response: a METHOD_NOT_FOUND JSON-RPC error is returned and a WARNING is logged with server URL and method, while the full params payload stays at DEBUG level so PII never reaches production aggregators by default.

  • Introduces _LiteLLMMCPClientSession, a thin ClientSession subclass that overrides the private _received_request method, and _build_unsupported_method_error for constructing descriptive ErrorData responses; sampling/elicitation capabilities are deliberately not declared during initialize because the override bypasses the SDK's callback identity check.
  • Adds six new tests covering error content, stdio fallback labelling, WARNING-level PII non-leakage, fallthrough for unhandled request types, and a capability-declaration regression guard; existing tests are updated to patch the new subclass symbol rather than the bare ClientSession.

Confidence Score: 5/5

Safe to merge — the change is narrowly scoped to an unimplemented code path and cannot regress existing tool-call or transport behaviour.

The production change adds a new subclass that only activates when an upstream server sends sampling/createMessage or elicitation/create, two capabilities that LiteLLM never advertises and that previously triggered no LiteLLM-layer logging at all. The WARNING log has been confirmed to contain only server URL, method, and error code — no user-supplied content. The respond(ErrorData) call is valid per the SDK's generic contract. Existing tests are updated faithfully rather than weakened, and the six new tests directly exercise the changed code paths including a dedicated PII-non-leakage regression guard.

No files require special attention.

Important Files Changed

Filename Overview
litellm/experimental_mcp_client/client.py Adds _LiteLLMMCPClientSession subclass and _build_unsupported_method_error helper; warning log is correctly scoped to server URL + method only, with params gated at DEBUG level; respond(ErrorData) usage is valid per SDK contract.
tests/test_litellm/experimental_mcp_client/test_mcp_client.py Existing tests correctly updated to patch the new subclass; six new tests cover warning/error correctness, PII non-leakage, fallthrough, and capability declaration regression; one minor fragility from importing private SDK symbols in the capability test.

Reviews (2): Last reviewed commit: "fix(mcp): keep sampling params out of WA..." | Re-trigger Greptile

Comment thread litellm/experimental_mcp_client/client.py
Comment thread litellm/experimental_mcp_client/client.py
@codecov

codecov Bot commented May 17, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 92.00000% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
litellm/experimental_mcp_client/client.py 92.00% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

@oss-pr-review-agent-shin

Copy link
Copy Markdown
Contributor

🤖 litellm-agent: This PR is currently BLOCKED from merge.

Score: 2/5

Why blocked:

  • 1 PR-related CI failure (Greptile gate: score 3/5 below required 4/5 — request a Greptile review (@greptileai) and resolve its comments before maintainer review.) (pr_related_failures, -2 pts)
  • Greptile 3/5 (greptile_low, -1 pts)

Details: Score docked for: 1 PR-related CI failure (Greptile gate: score 3/5 below required 4/5 — request a Greptile review (@greptileai) and resolve its comments before maintainer review.); Greptile 3/5.

Fix the issues above and push an update — the bot will re-review automatically.

Note: This bot is still in beta and might not always work as expected. Please share any feedback via Slack.

The previous WARNING emitted by _LiteLLMMCPClientSession._received_request
included the raw 'params' object via %r. For sampling/createMessage that
payload can contain the full conversation history and system prompt, and
for elicitation/create it can contain server-supplied instructions —
both of which should not unconditionally land in production log
aggregators.

Split the log path:
  * WARNING: server URL + method + response code only (operator-visible).
  * DEBUG: full params payload, gated on isEnabledFor(DEBUG).

Add a regression test asserting WARNING records do not contain
sampling content even when the request carries a sensitive snippet.

Addresses Greptile review on PR #28121.
@abdulmunimjemal

Copy link
Copy Markdown
Author

@greptileai please re-review.

Addressed the security concern in the latest commit (53c9ec3):

  • WARNING-level log now contains only server URL + method + response code (no params).
  • DEBUG-level log carries the full params payload, gated on verbose_logger.isEnabledFor(logging.DEBUG) so it never lands in production log aggregators by default.
  • Added a regression test (test_sampling_warning_log_does_not_contain_message_content) that injects a sensitive snippet into a sampling/createMessage request and asserts no WARNING record contains it.

All 20 tests in tests/test_litellm/experimental_mcp_client/test_mcp_client.py pass locally. uv run black . applied.

@abdulmunimjemal

Copy link
Copy Markdown
Author

Closing in favour of #27109, which is attempting the full implementation of sampling/createMessage + elicitation/create handlers. This PR was a smaller defensive fix (log + JSON-RPC METHOD_NOT_FOUND for the unsupported methods) — happy to revive if maintainers want a smaller scope, but no point keeping two competing PRs on the same issue. Thanks for the Greptile review and CLA bot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants