Skip to content

Parallel tool calls fail with Gemini 3 via Responses API: "Missing corresponding tool call for tool response message" #22578

Description

@jschomay

Bug Description

Parallel tool calls fail when using the Responses API (/v1/responses) with Gemini 3 models (e.g. gemini-3-pro-preview) in streaming mode. The second and subsequent tool results raise:

litellm.APIConnectionError: Missing corresponding tool call for tool response message.

Steps to Reproduce

  1. Send a request to /v1/responses with stream: true, a Gemini 3 model, and tools defined
  2. The model returns 2+ parallel tool calls (e.g. tool_a and tool_b)
  3. Execute the tools and send results back as function_call_output input items with the call_id values from the response

Expected Behavior

All tool results are matched to their corresponding tool calls and the conversation continues.

Actual Behavior

litellm.APIConnectionError: Missing corresponding tool call for tool response message.
Received - message={'role': 'tool', 'tool_call_id': 'call_AAA', 'content': '...'},
last_message_with_tool_calls={'role': 'assistant', 'tool_calls': [{'id': 'call_BBB', ...}]}

The first tool result (call_AAA) can't find its tool call because last_message_with_tool_calls only contains the second tool call (call_BBB).

Root Cause

The Responses API transformation layer (litellm/responses/litellm_completion_transformation/transformation.py) creates one separate assistant message per function_call input item, each containing a single tool call:

assistant msg 1: tool_calls = [{id: "call_AAA", name: "tool_a"}]
assistant msg 2: tool_calls = [{id: "call_BBB", name: "tool_b"}]
tool msg 1: tool_call_id = "call_AAA"
tool msg 2: tool_call_id = "call_BBB"

The Gemini format converter in litellm/llms/vertex_ai/gemini/transformation.py merges consecutive assistant messages into a single model turn, but on this line:

last_message_with_tool_calls = assistant_msg

...it overwrites on each iteration, so only the last assistant message's tool_calls survive. When convert_to_gemini_tool_call_result then tries to match tool results via exact ID comparison, the first tool result can't find its ID in the last assistant message.

Suggested Fix

Accumulate tool_calls from all consecutive assistant messages instead of overwriting:

_tool_calls = assistant_msg.get("tool_calls") or []
if _tool_calls:
    if last_message_with_tool_calls is None:
        last_message_with_tool_calls = {"tool_calls": list(_tool_calls)}
    else:
        last_message_with_tool_calls["tool_calls"].extend(_tool_calls)

Environment

  • LiteLLM version: 1.82.0 (also reproduced on 1.80.x)
  • Model: gemini/gemini-3-pro-preview
  • API endpoint: /v1/responses with stream: true
  • Using tool calling with 2+ parallel function calls

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions