fix(core): enforce adjacent tool results by he-yufeng · Pull Request #4622 · QwenLM/qwen-code

he-yufeng · 2026-05-29T08:34:17Z

Summary

make cleanOrphanedToolCalls() keep only tool responses from the contiguous tool block after the assistant message that declared the tool call
remove assistant tool_calls whose results were separated by another user/assistant turn
add a regression test for a history-repair sequence where call_b exists globally but is no longer adjacent to its assistant turn

Tests

npm run test --workspace=packages/core -- converter.test.ts
npm run typecheck --workspace=packages/core
npx eslint packages/core/src/core/openaiContentGenerator/converter.ts packages/core/src/core/openaiContentGenerator/converter.test.ts --max-warnings 0
npm run build --workspace=packages/core
git diff --check

wenshao · 2026-05-29T16:50:26Z

+          !('tool_call_id' in nextMessage) ||
+          !nextMessage.tool_call_id
+        ) {
+          break;


[Critical] The break here triggers on any non-tool message, including consecutive assistant messages. cleanOrphanedToolCalls runs at line 385 before mergeConsecutiveAssistantMessages at line 387, so consecutive model turns in Gemini history (common after compaction, rewind, or partial-push recovery — see geminiChat.ts:793-799) produce consecutive assistant messages that break this scan immediately.

Concrete trace:

[assistant(tool_calls=[A])] [assistant(content="...")] ← role !== 'tool' → break. A gets zero adjacent responses. [tool(A)] ← silently dropped

The same break also fires when splitToolMedia is enabled (default for mimo provider): the converter's own media user message lands between tool messages from different content items, breaking the scan and dropping valid parallel tool calls.

This is a regression from the old global-set matching. Fix options:

Swap the pipeline order: run mergeConsecutiveAssistantMessages before cleanOrphanedToolCalls.

Skip (don't break on) non-tool-call assistant messages and converter-generated media messages in the inner scan.

— qwen3.7-max via Qwen Code /review

wenshao · 2026-05-29T16:50:26Z

+        const nextMessage = messages[nextIndex];
+        if (
+          nextMessage.role !== 'tool' ||
+          !('tool_call_id' in nextMessage) ||


[Suggestion] !nextMessage.tool_call_id is truthy for empty string ''. createToolMessage (lines 776/784) sets tool_call_id: response.id || '', so when response.id is falsy, the tool message gets an empty-string ID. This causes the scan to break here, dropping all subsequent valid tool responses in the contiguous block.

Consider separating the missing-property check from the empty-string check:

Suggested change

!('tool_call_id' in nextMessage) ||

if (nextMessage.role !== 'tool' || !('tool_call_id' in nextMessage)) {

break;

}

if (!nextMessage.tool_call_id) {

continue;

}

— qwen3.7-max via Qwen Code /review

wenshao · 2026-05-29T16:50:26Z

@@ -1394,44 +1394,61 @@ function cleanOrphanedToolCalls(
  messages: OpenAI.Chat.ChatCompletionMessageParam[],


[Suggestion] The function drops tool calls and responses silently — no logging at all. The file already uses createDebugLogger('CONVERTER') elsewhere. Consider adding a debug log when calls or responses are dropped, to aid production debugging of orphan-related issues:

if (droppedIds.size > 0) { debugLog('Dropped non-adjacent tool calls: %o', [...droppedIds]); }

— qwen3.7-max via Qwen Code /review

wenshao · 2026-05-29T16:50:26Z

+        }
+
+        if (toolCallIds.has(nextMessage.tool_call_id)) {
+          adjacentToolResponseIds.add(nextMessage.tool_call_id);


[Suggestion] The loop intentionally continues past tool messages whose tool_call_id doesn't match the current assistant's calls (no else { break; }). This is correct — it tolerates orphaned tool responses from other calls within a contiguous block. However, this behavior is undocumented. A future maintainer who reads this loop and adds else { break; } would silently break parallel tool calls in histories where compaction left orphaned responses in the middle of a block.

Consider adding a comment:

// Non-matching tool responses are skipped (they may belong to other // assistants' calls); only non-tool messages terminate the scan.

— qwen3.7-max via Qwen Code /review

wenshao · 2026-05-29T16:50:26Z

      });
    });

+    it('should drop tool responses that are not adjacent to their assistant tool call', () => {


[Suggestion] The test covers partial trimming well, but several branches of the new code are untested:

Content-fallback branch: assistant with text content + all-orphaned tool_calls → kept as text-only (lines 1459-1470 in converter.ts)

Back-to-back assistant messages: consecutive model contents with tool_calls — directly related to the Critical adjacency-break bug

splitToolMedia interleaving: parallel tool responses across multiple content items with media

Adding test cases for these would help catch the Critical bug and lock in the intended behavior.

— qwen3.7-max via Qwen Code /review

wenshao · 2026-05-30T02:50:42Z

+        }
      } else if (
        typeof message.content === 'string' &&
        message.content.trim()


[Suggestion] Test gap: the content-fallback branch silently drops assistant messages that have only tool_calls (no text content, or content: '' from reasoning-only thinking-model turns) when all tool_calls are orphaned. The existing 5 new tests cover partial trimming (test 1), content-fallback with text (test 2), empty-id responses (test 3), merged consecutive assistants (test 4), and split media adjacency (test 5). None covers this silent-drop scenario.

Consider adding a test that constructs a model turn with only functionCall parts (no text), where none of the calls have adjacent responses. Assert the assistant message is absent from the output and no tool messages are emitted.

it('should drop assistant with only tool_calls when all are orphaned', () => { const request: GenerateContentParameters = { model: 'models/test', contents: [ { role: 'model', parts: [ { functionCall: { id: 'call_orphan', name: 'read_file', args: {} } }, ], }, { role: 'user', parts: [{ text: 'break adjacency' }] }, { role: 'user', parts: [{ text: 'continue' }] }, ], }; const messages = converter.convertGeminiRequestToOpenAI(request, requestContext); expect(messages.some((m) => m.role === 'assistant')).toBe(false); expect(messages.some((m) => m.role === 'tool')).toBe(false); });

— qwen3.7-max via Qwen Code /review

wenshao · 2026-05-30T02:50:42Z

+  const adjacentToolResponseIdsByAssistant = new Map<number, Set<string>>();
+  const validToolResponseIndexesByAssistant = new Map<number, number[]>();
+  const splitMediaIndexesByAssistant = new Map<number, number[]>();
+  const validToolResponseIndexes = new Set<number>();


[Suggestion] validToolResponseIndexes (Set) and validToolResponseIndexesByAssistant (Map) appear to be unnecessary. Any tool response added to validToolResponseIndexes during pass 1 was matched against some assistant's toolCallIds, so that assistant has validToolCalls.length > 0 and emits the response inline via emittedWithAssistant. Since assistants always precede their responses in the array, the assistant is processed first in the main loop — the response's index is already in emittedWithAssistant when the loop reaches it.

The if (validToolResponseIndexes.has(index)) true-branch at line 1538 is therefore unreachable. The else branch (drop with debug log) IS reachable for orphaned responses.

Suggested simplification: Remove both data structures. In the main loop's tool-response branch, just drop all tool responses that reach it (valid ones were already emitted with their assistant):

Suggested change

const validToolResponseIndexes = new Set<number>();

} else if (message.role === 'tool' && 'tool_call_id' in message) {

debugLogger.debug(

`cleanOrphanedToolCalls: dropping orphaned tool response ${message.tool_call_id || '<empty>'}`,

);

— qwen3.7-max via Qwen Code /review

wenshao · 2026-05-30T02:50:42Z

+          continue;
+        }
+
+        if (nextMessage.role === 'assistant' && !hasToolCalls(nextMessage)) {


[Suggestion] The continue past text-only assistants assumes consecutive assistants have already been merged by mergeConsecutiveAssistantMessages. Without that precondition, the scan would walk past an intermediate text-only assistant and attribute tool responses on the other side to the wrong declaring assistant.

Currently safe because merge runs first at line 384. But there's no assertion, JSDoc, or comment documenting this dependency. If this function is ever called from a new code path that doesn't pre-merge (e.g., a non-Gemini converter, a daemon replay path), the scan silently misattributes responses.

Consider adding a JSDoc @precondition note:

Suggested change

if (nextMessage.role === 'assistant' && !hasToolCalls(nextMessage)) {

/**

* Clean up orphaned tool calls from message history to prevent OpenAI API errors.

*

* @precondition Consecutive assistant messages must be merged first

* (via mergeConsecutiveAssistantMessages). Without this, the adjacency

* scan may attribute tool responses to the wrong declaring assistant.

*/

— qwen3.7-max via Qwen Code /review

he-yufeng · 2026-05-30T07:07:19Z

Updated the branch to address the latest review callouts.

Changes:

added coverage for the all-orphaned, tool-call-only assistant case
removed the unreachable valid-tool-response fallback branch so orphaned tool responses are consistently logged/dropped once they reach the cleanup pass
documented the cleanup precondition around pre-merged consecutive assistant turns

Validation:

npm run test --workspace=packages/core -- converter.test.ts
npm run typecheck --workspace=packages/core
npx eslint packages/core/src/core/openaiContentGenerator/converter.ts packages/core/src/core/openaiContentGenerator/converter.test.ts --max-warnings 0
npx prettier --check packages/core/src/core/openaiContentGenerator/converter.ts packages/core/src/core/openaiContentGenerator/converter.test.ts
npm run build --workspace=packages/core
git diff --check

wenshao · 2026-05-30T07:14:40Z

Verification Report — PR #4622

Commit: eddd7a26d (fix(core): enforce adjacent tool results)
Base: main
Tester: wenshao
Date: 2026-05-30

Test Results

Check	Result	Details
Unit tests (`converter.test.ts`)	✅ PASS	6 files, 208 tests passed
TypeScript (`tsc --noEmit`)	✅ PASS	0 errors
ESLint (`--max-warnings 0`)	✅ PASS	0 warnings, 0 errors
Build (`packages/core`)	✅ PASS	Successfully copied files
`git diff --check`	✅ PASS	No whitespace errors

Test File Breakdown

File	Tests	Time
`openaiContentGenerator/converter.test.ts`	108	24ms
`anthropicContentGenerator/converter.test.ts`	50	11ms
`claude-converter.test.ts`	19	397ms
`schemaConverter.test.ts`	12	4ms
`toml-to-markdown-converter.test.ts`	10	5ms
`gemini-converter.test.ts`	9	4ms

Verdict

✅ All 5 checks from the PR test plan pass. No regressions detected. Safe to merge.

wenshao · 2026-05-30T07:31:39Z

@@ -1390,152 +1390,164 @@ function mapGeminiFinishReasonToOpenAI(
 /**
 * Clean up orphaned tool calls from message history to prevent OpenAI API errors.
 */


[Suggestion] The JSDoc /** Clean up orphaned tool calls from message history to prevent OpenAI API errors. */ was originally above cleanOrphanedToolCalls, but when hasToolCalls() and isSplitToolMediaMessage() were extracted, the JSDoc got stranded above hasToolCalls. Now hasToolCalls carries documentation describing a completely different function, while cleanOrphanedToolCalls (line 1424) only has a terse line comment.

Suggested change

*/

/** Type guard: is this an assistant message with at least one tool call? */

function hasToolCalls(

Then move the original JSDoc to above cleanOrphanedToolCalls:

/** * Clean up orphaned tool calls from message history to prevent OpenAI API errors. * Assumes consecutive assistant messages have already been merged. */ function cleanOrphanedToolCalls(

— qwen3.7-max via Qwen Code /review

wenshao · 2026-05-30T07:31:39Z

+
+        if (isSplitToolMediaMessage(nextMessage)) {
+          splitMediaIndexes.push(nextIndex);
+          continue;


[Suggestion] The forward scan unconditionally collects every isSplitToolMediaMessage into splitMediaIndexes, regardless of which tool response the split media belongs to. If an orphaned tool response with images precedes a valid tool response in the same scan window, the orphaned response's images are emitted alongside the scanning assistant's valid tool calls — the model sees images from a tool it never called.

Concrete scenario with splitToolMedia: true:

[0] assistant: tool_calls [call_a] [1] tool: call_x (orphaned — no assistant declared call_x) [2] user: SPLIT_TOOL_MEDIA (images stripped from call_x's response) [3] tool: call_a (valid response for call_a)

The scan from [0] records [2] in splitMediaIndexes and [3] in toolResponseIndexes. Output: assistant(call_a), tool(call_a), SPLIT_TOOL_MEDIA(images from call_x).

Consider tracking which tool response each split media follows, and only emitting media for surviving tool calls. Or break the scan on split media that follows a non-matching tool response.

— qwen3.7-max via Qwen Code /review

wenshao · 2026-05-30T07:31:39Z

-        ).tool_calls = finalValidToolCalls;
-        finalCleaned.push(cleanedMessage);
+        }
      } else if (


[Suggestion] The text-content fallback typeof message.content === 'string' && message.content.trim() doesn't check reasoning_content. An assistant with content: '' and reasoning_content: '...' (produced by reasoning models like DeepSeek/QwQ when issuing tool calls with chain-of-thought) is silently dropped — no debug log is emitted either.

This loses chain-of-thought reasoning from the conversation history. Consider broadening the check:

Suggested change

} else if (

} else if (

(typeof message.content === 'string' &&

message.content.trim()) ||

(message as ExtendedChatCompletionAssistantMessageParam)

.reasoning_content

) {

Also consider adding a debug log for the implicit drop when neither condition is met:

} else { debugLogger.debug( `cleanOrphanedToolCalls: dropping assistant with ${message.tool_calls.length} orphaned tool call(s) and no text/reasoning content`, ); }

— qwen3.7-max via Qwen Code /review

he-yufeng · 2026-05-30T07:35:38Z

One CI follow-up: the Windows test job failed in packages/cli at src/ui/components/InputPrompt.test.tsx:375 (expected onSubmit('commit this'), 0 calls). That test is outside the touched converter path; the focused core tests, typecheck, lint, format check, and build all passed locally, and ubuntu/macos CI are green. I tried to rerun the failed job, but GitHub reports rerun requires repository admin rights.

he-yufeng · 2026-05-30T08:36:14Z

Thanks for the verification pass. The latest branch includes the requested follow-ups, but GitHub still shows the older changes-requested review state. Could you re-review or clear the stale review when convenient? The only remaining red CI item appears to be the unrelated Windows InputPrompt test noted above; I do not have permission to rerun that job.

wenshao · 2026-05-30T09:01:20Z

  processContents(request.contents, messages, requestContext);

-  // Clean up orphaned tool calls and merge consecutive assistant messages
+  messages = mergeConsecutiveAssistantMessages(messages);


[Suggestion] cleanOrphanedToolCalls can re-introduce consecutive assistant messages in the output after mergeConsecutiveAssistantMessages already eliminated them. When an orphaned tool response sits between two assistant messages, dropping it creates adjacent assistants:

Input: assistant(call_a), tool(a), assistant("text"), tool(orphan), assistant(call_b), tool(b) After merge: unchanged (no consecutive assistants) After clean: tool(orphan) dropped → assistant(call_a), tool(a), assistant("text"), assistant(call_b), tool(b) → 3 consecutive assistant messages

This is reachable whenever history compaction or rewind leaves an orphaned tool response between model turns. Some providers reject consecutive assistant messages, and downstream code that assumes no consecutive assistants after conversion will be surprised.

Suggested change

messages = mergeConsecutiveAssistantMessages(messages);

messages = mergeConsecutiveAssistantMessages(messages);

if (options.cleanOrphanToolCalls) {

messages = cleanOrphanedToolCalls(messages);

messages = mergeConsecutiveAssistantMessages(messages);

}

— qwen3.7-max via Qwen Code /review

he-yufeng · 2026-05-30T12:35:02Z

Pushed 3a96af9 with the review follow-up. It now re-merges assistant turns after orphan cleanup, keeps split media only when it follows a surviving adjacent tool response, preserves reasoning-only assistant content when orphaned tool calls are removed, and moves the cleanup docs back onto the right helper. I also closed duplicate #4638 so #4622 remains the single review thread for #4619. Validation: npm run test --workspace=packages/core -- converter.test.ts; npm run typecheck --workspace=packages/core; npx eslint on converter.ts and converter.test.ts; npm run build --workspace=packages/core; git diff --check.

BZ-D

LGTM. 这个 PR 改动量大但逻辑正确——从全局 ID 匹配改为严格邻接检测，scan-forward 算法处理了各种边界情况（orphaned call、split media、reasoning content、merge 后的 consecutive assistant），测试覆盖非常充分。

…ig (QwenLM#4622)

wenshao requested changes May 29, 2026

View reviewed changes

he-yufeng force-pushed the fix/tool-result-adjacency-cleanup branch from ae1ef9d to 8f5d58a Compare May 29, 2026 18:28

github-actions Bot mentioned this pull request May 30, 2026

📊 AI CLI 工具社区动态日报 2026-05-30 zx0828/big_model_radar#77

Open

wenshao reviewed May 30, 2026

View reviewed changes

fix(core): enforce adjacent tool results

eddd7a2

he-yufeng force-pushed the fix/tool-result-adjacency-cleanup branch from 8f5d58a to eddd7a2 Compare May 30, 2026 07:06

wenshao reviewed May 30, 2026

View reviewed changes

he-yufeng mentioned this pull request May 30, 2026

fix(core): enforce tool result adjacency #4638

Closed

fix(core): handle orphan cleanup edge cases

3a96af9

wenshao approved these changes May 30, 2026

View reviewed changes

This was referenced May 30, 2026

📊 AI CLI 工具社区动态日报 2026-05-30 jasonalang/big_model_radar#1

Open

📊 AI CLI 工具社区动态日报 2026-05-30 jasonalang/big_model_radar#6

Open

BZ-D approved these changes May 31, 2026

View reviewed changes

wenshao merged commit 9dafd60 into QwenLM:main May 31, 2026
10 checks passed

xaelistic pushed a commit to xaelistic/qwen-code that referenced this pull request Jun 7, 2026

Refactor the logic for deciding whether to launch a browser into conf…

235aae4

…ig (QwenLM#4622)

github-actions Bot mentioned this pull request Jun 8, 2026

Upstream changes detected: ACP spec, Gemini CLI, Codex CLI (App Server Protocol), GitHub Copilot CLI, Qwen Code greenpolo/cc-multi-cli-plugin#7

Open

		@@ -1394,44 +1394,61 @@ function cleanOrphanedToolCalls(
		messages: OpenAI.Chat.ChatCompletionMessageParam[],

-  const validToolResponseIndexes = new Set<number>();
+    } else if (message.role === 'tool' && 'tool_call_id' in message) {
+      debugLogger.debug(
+        `cleanOrphanedToolCalls: dropping orphaned tool response ${message.tool_call_id || '<empty>'}`,
+      );

-        if (nextMessage.role === 'assistant' && !hasToolCalls(nextMessage)) {
+/**
+ * Clean up orphaned tool calls from message history to prevent OpenAI API errors.
+ *
+ * @precondition Consecutive assistant messages must be merged first
+ *   (via mergeConsecutiveAssistantMessages). Without this, the adjacency
+ *   scan may attribute tool responses to the wrong declaring assistant.
+ */

	*/
	/** Type guard: is this an assistant message with at least one tool call? */
	function hasToolCalls(

-      } else if (
+      } else if (
+        (typeof message.content === 'string' &&
+          message.content.trim()) ||
+        (message as ExtendedChatCompletionAssistantMessageParam)
+          .reasoning_content
+      ) {

Conversation

he-yufeng commented May 29, 2026

Summary

Tests

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

he-yufeng commented May 30, 2026

Uh oh!

wenshao commented May 30, 2026

Verification Report — PR #4622

Test Results

Test File Breakdown

Verdict

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

he-yufeng commented May 30, 2026

Uh oh!

he-yufeng commented May 30, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

he-yufeng commented May 30, 2026

Uh oh!

BZ-D left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants