fix(mcp): log + respond to unsupported server→client requests instead of dropping silently #28121
… of dropping silently

When an upstream MCP server sends sampling/createMessage or elicitation/create to the LiteLLM MCP Gateway, the request used to disappear from the operator's point of view: the MCP Python SDK auto-replied to the server with a generic INVALID_REQUEST error, but LiteLLM logged nothing, so there was no signal that an upstream server had attempted a capability we don't implement.

Add a _LiteLLMMCPClientSession subclass that overrides _received_request for these two methods and:

* logs a WARNING that names the upstream server and the method, so the event is visible in proxy logs (no more silent drop at the LiteLLM layer);
* responds with a JSON-RPC METHOD_NOT_FOUND (-32601) error whose message identifies LiteLLM as the responding peer and links the tracking issue.

The override is deliberately done in _received_request rather than via the SDK's sampling_callback / elicitation_callback hooks, because the SDK's identity check on those callbacks would otherwise cause it to declare the 'sampling' and 'elicitation' capabilities to the upstream server during initialize, which we don't actually implement yet.

Existing experimental_mcp_client tests are updated to patch the new symbol; six new tests cover the unsupported-method path (logging + error response, stdio vs http server URL, fall-through to the SDK default for unrelated server→client requests, and a guard against accidentally declaring the unsupported capabilities).

Refs #23761
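The interception pattern this commit describes can be sketched without the SDK. Everything below is a stand-in under stated assumptions: `Responder`, `BaseSession`, and `GatewaySession` are hypothetical names rather than the MCP Python SDK's classes, the error is a plain dict rather than the SDK's ErrorData, and the message wording is illustrative only.

```python
import asyncio
import logging

logger = logging.getLogger("gateway_sketch")

METHOD_NOT_FOUND = -32601  # JSON-RPC 2.0 "Method not found"
UNSUPPORTED = {"sampling/createMessage", "elicitation/create"}


class Responder:
    """Stand-in for an incoming server->client request awaiting a reply."""

    def __init__(self, method):
        self.method = method
        self.reply = None

    async def respond(self, payload):
        self.reply = payload


class BaseSession:
    """Stand-in for the SDK session; its default handling is not modelled."""

    async def _received_request(self, responder):
        await responder.respond({"handled_by": "base"})


class GatewaySession(BaseSession):
    def __init__(self, server_url=None):
        self.server_url = server_url

    async def _received_request(self, responder):
        if responder.method not in UNSUPPORTED:
            # Unrelated requests keep the base-class behaviour.
            await super()._received_request(responder)
            return
        server = self.server_url or "stdio"
        # Make the event visible to operators before replying.
        logger.warning("Upstream MCP server %s sent unsupported %s", server, responder.method)
        await responder.respond({
            "code": METHOD_NOT_FOUND,
            "message": f"Gateway does not support {responder.method} (server: {server})",
        })


def handle(method, server_url=None):
    """Drive one request through the session and return the reply."""
    responder = Responder(method)
    asyncio.run(GatewaySession(server_url)._received_request(responder))
    return responder.reply
```

Calling `handle("sampling/createMessage", "https://example.test/mcp")` returns the error dict, while `handle("ping")` falls through to the base class. Overriding the dispatch method, rather than registering callbacks, is what keeps the two unsupported capabilities from being advertised in the real SDK.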
Greptile Summary
This PR upgrades the LiteLLM MCP Gateway's handling of unsupported server→client methods (
Confidence Score: 5/5
Safe to merge: the change is narrowly scoped to an unimplemented code path and cannot regress existing tool-call or transport behaviour.
The production change adds a new subclass that only activates when an upstream server sends sampling/createMessage or elicitation/create, two capabilities that LiteLLM never advertises and that previously triggered no LiteLLM-layer logging at all. The WARNING log has been confirmed to contain only server URL, method, and error code, with no user-supplied content. The respond(ErrorData) call is valid per the SDK's generic contract. Existing tests are updated faithfully rather than weakened, and the six new tests directly exercise the changed code paths, including a dedicated PII-non-leakage regression guard.
No files require special attention.
|
| Filename | Overview |
|---|---|
| litellm/experimental_mcp_client/client.py | Adds _LiteLLMMCPClientSession subclass and _build_unsupported_method_error helper; warning log is correctly scoped to server URL + method only, with params gated at DEBUG level; respond(ErrorData) usage is valid per SDK contract. |
| tests/test_litellm/experimental_mcp_client/test_mcp_client.py | Existing tests correctly updated to patch the new subclass; six new tests cover warning/error correctness, PII non-leakage, fallthrough, and capability declaration regression; one minor fragility from importing private SDK symbols in the capability test. |
Reviews (2): Last reviewed commit: "fix(mcp): keep sampling params out of WA..."
Codecov Report❌ Patch coverage is
|
🤖 litellm-agent: This PR is currently BLOCKED from merge. Score: 2/5 ❌ Why blocked:
Details: Score docked for: 1 PR-related CI failure (Greptile gate: score 3/5 below required 4/5 — request a Greptile review ( Fix the issues above and push an update — the bot will re-review automatically.
|
The previous WARNING emitted by _LiteLLMMCPClientSession._received_request included the raw 'params' object via %r. For sampling/createMessage that payload can contain the full conversation history and system prompt, and for elicitation/create it can contain server-supplied instructions, neither of which should unconditionally land in production log aggregators.

Split the log path:

* WARNING: server URL + method + response code only (operator-visible).
* DEBUG: full params payload, gated on isEnabledFor(DEBUG).

Add a regression test asserting WARNING records do not contain sampling content even when the request carries a sensitive snippet.

Addresses Greptile review on PR #28121.
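A minimal sketch of the split log path described in this commit, using only the standard library. The function name `log_unsupported_request`, the logger name, and the message wording are assumptions for illustration, not the PR's actual code; `ListHandler` exists only so the output can be inspected.

```python
import logging

logger = logging.getLogger("gateway_sketch.logsplit")


def log_unsupported_request(server, method, params, code=-32601):
    # Operator-visible line: identifiers only, never request content.
    logger.warning(
        "Upstream MCP server %s sent unsupported %s; responding with code %d",
        server, method, code,
    )
    # The payload is logged only when DEBUG is enabled for this logger.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("Unsupported %s params from %s: %r", method, server, params)


class ListHandler(logging.Handler):
    """Collects (level, message) pairs for inspection."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append((record.levelno, record.getMessage()))
```

A regression guard in the spirit of the one this commit adds would attach `ListHandler`, send a request whose params carry a marker string, and assert that the marker never appears in any WARNING-level line.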
|
@greptileai please re-review. Addressed the security concern in the latest commit (53c9ec3):
All 20 tests in |
|
Closing in favour of #27109, which is attempting the full implementation of |
Relevant issues
Refs #23761.
Problem
Per the MCP spec, when an upstream MCP server sends sampling/createMessage or elicitation/create to a client that doesn't implement those capabilities, the client must respond with a JSON-RPC error so the server can fall back gracefully. Today, when an upstream server sends one of these requests through the LiteLLM MCP Gateway, the operator sees nothing: the MCP Python SDK auto-replies to the server with a generic INVALID_REQUEST error, but LiteLLM logs nothing about the event, so it looks like a silent drop from the proxy's point of view.

This is the smallest, defensive fix for the bug described in the issue title. It is intentionally scoped narrower than the full Mode A/B implementation discussed in the issue body (#27109 is in flight for that and may go a different way); the goal here is just to make the current behaviour observable and spec-compliant in a way that won't conflict with whatever the full implementation ends up looking like.
Change
litellm/experimental_mcp_client/client.py:

* Adds _LiteLLMMCPClientSession, a thin mcp.ClientSession subclass that overrides _received_request for CreateMessageRequest / ElicitRequest. It:
  * logs a WARNING that names the upstream server URL and the method (so the event is now visible in proxy logs);
  * responds with METHOD_NOT_FOUND (-32601), with a message that identifies LiteLLM as the responding peer and points at the tracking issue;
  * falls through to the SDK default for everything else (ping, roots/list, task callbacks, …).
* Uses it in MCPClient._execute_session_operation in place of the bare ClientSession.

tests/test_litellm/experimental_mcp_client/test_mcp_client.py:

* Updates the existing @patch("...ClientSession") references to patch the new subclass symbol (existing behaviour unchanged).
* Adds TestUnsupportedServerToClientMethods covering:
  * the error message contains LiteLLM MCP Gateway, the method name, and the upstream server URL;
  * an unset server_url (stdio) renders as stdio in the error message;
  * sampling/createMessage triggers a WARNING log and a METHOD_NOT_FOUND ErrorData response;
  * elicitation/create does the same for stdio servers;
  * unrelated server→client requests fall through to super()._received_request;
  * the unsupported capabilities are not accidentally declared (guard on initialize).

Approach (Option A, smaller scope)
The override is in _received_request rather than via the SDK's sampling_callback / elicitation_callback hooks because the SDK's identity check on those callbacks (callback is not _default_*_callback) is also what gates capability declaration during initialize. Installing custom callbacks would make the SDK advertise sampling and elicitation support to upstream servers, which we explicitly don't implement yet, and which would actively encourage servers to send more of these requests.

Why not the full Mode A/B implementation?
The issue describes a phased approach (sampling in Mode B → bidirectional relay in Mode A → elicitation in Mode B). That's a significantly larger change that touches the proxy server, the manager, and the gateway's session model. #27109 is already in flight along those lines but currently has merge conflicts and is awaiting maintainer direction.
This PR doesn't preclude any of those designs. It just upgrades the immediate behaviour from "silent at the LiteLLM layer" to "observable warning + spec-compliant JSON-RPC error", in a way that's independently useful and easy to extend later (the _LiteLLMMCPClientSession hook point is exactly where a real handler would slot in).

Test plan
* pytest tests/test_litellm/experimental_mcp_client/: 28 passed (13 existing + 6 new + 9 in test_tools.py).
* … CLAUDE.md.
* pytest tests/test_litellm/proxy/_experimental/mcp_server/ run: 690 passed; 1 unrelated failure in test_semantic_filter_basic_filtering due to the optional semantic_router dep not being installed in my local venv (not touched by this PR).
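As a rough illustration of the error-message checks in the new test class, the sketch below pairs a hypothetical helper with two pytest-style tests. `build_unsupported_method_error` is an assumed stand-in, not the PR's actual _build_unsupported_method_error (whose signature is not shown here); it returns a plain dict and omits the tracking-issue link that the real message carries.

```python
METHOD_NOT_FOUND = -32601  # JSON-RPC 2.0 "Method not found"


def build_unsupported_method_error(method, server_url=None):
    """Hypothetical helper; the real one returns the SDK's ErrorData."""
    # stdio servers have no URL, so fall back to a fixed label.
    server = server_url if server_url else "stdio"
    return {
        "code": METHOD_NOT_FOUND,
        "message": (
            f"LiteLLM MCP Gateway does not support '{method}' "
            f"(upstream server: {server})"
        ),
    }


def test_message_names_gateway_method_and_server():
    url = "https://example.test/mcp"
    err = build_unsupported_method_error("sampling/createMessage", url)
    assert err["code"] == METHOD_NOT_FOUND
    for fragment in ("LiteLLM MCP Gateway", "sampling/createMessage", url):
        assert fragment in err["message"]


def test_missing_server_url_renders_as_stdio():
    err = build_unsupported_method_error("elicitation/create")
    assert "stdio" in err["message"]
```

Keeping the message construction in a pure function like this lets the wording be tested without standing up a session or a transport.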