feat(llm): support multi-provider disableThinking strategies #228

Open
jackson-jia-914 wants to merge 1 commit into TencentCloud:main from jackson-jia-914:feat/disable-thinking

Conversation

@jackson-jia-914

Description

Add disableThinking option to suppress reasoning in extraction models, covering 7 mainstream inference engines and model providers.

Supported Strategies

| Strategy | Injected field | Provider |
| --- | --- | --- |
| "vllm" | chat_template_kwargs: { enable_thinking: false } | vLLM / SGLang (self-hosted) |
| "deepseek" | enable_thinking: false (top-level) | DeepSeek official API |
| "dashscope" | enable_thinking: false (top-level) | Alibaba DashScope / Qwen |
| "openai" | reasoning_effort: "low" | OpenAI o-series (cannot fully disable) |
| "anthropic" | thinking: { type: "disabled" } | Anthropic Claude |
| "kimi" | thinking: { type: "disabled" } | Kimi / Moonshot |
| "gemini" | thinking_config: { thinking_budget: 0 } | Google Gemini |

Key Design Decisions

  • Backward compatible: true is equivalent to "vllm"; false or unset means no injection
  • Zero overhead: when strategy === false, returns globalThis.fetch directly with no wrapper
  • Clean dispatch: injection logic extracted to src/utils/no-think-fetch.ts with a STRATEGY_TRANSFORMERS map; TypeScript ensures compile-time coverage of all strategies
  • Safe filtering: only modifies requests containing a messages array (chat completions); embedding and other non-chat requests pass through unchanged
  • Performance: both StandaloneLLMRunner and LocalLlmClient create and cache the fetch wrapper in the constructor, avoiding per-call recreation
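
For reviewers, the design points above can be pictured with a minimal sketch. This is not the code in src/utils/no-think-fetch.ts: only STRATEGY_TRANSFORMERS and the injected fields come from this description, while createNoThinkFetch, FetchLike, ChatBody, and ExampleClient are hypothetical names, and the pass-through of non-string bodies is an assumption.

```typescript
// Hypothetical sketch; names follow the PR description, bodies are assumptions.
type Strategy =
  | "vllm" | "deepseek" | "dashscope" | "openai" | "anthropic" | "kimi" | "gemini";
type ChatBody = Record<string, unknown>;
type Transformer = (body: ChatBody) => ChatBody;
type FetchLike = (input: string | URL, init?: RequestInit) => Promise<Response>;

// Record<Strategy, ...> makes the compiler reject a map that omits a strategy.
const STRATEGY_TRANSFORMERS: Record<Strategy, Transformer> = {
  vllm: (body) => ({ ...body, chat_template_kwargs: { enable_thinking: false } }),
  deepseek: (body) => ({ ...body, enable_thinking: false }),
  dashscope: (body) => ({ ...body, enable_thinking: false }),
  openai: (body) => ({ ...body, reasoning_effort: "low" }),
  anthropic: (body) => ({ ...body, thinking: { type: "disabled" } }),
  kimi: (body) => ({ ...body, thinking: { type: "disabled" } }),
  gemini: (body) => ({ ...body, thinking_config: { thinking_budget: 0 } }),
};

function createNoThinkFetch(
  strategy: Strategy | false,
  baseFetch: FetchLike = globalThis.fetch,
): FetchLike {
  // Zero overhead: with no strategy, hand back the original fetch untouched.
  if (strategy === false) return baseFetch;
  const transform = STRATEGY_TRANSFORMERS[strategy];

  return async (input, init) => {
    // Only string bodies can be inspected; anything else passes through.
    const raw = init?.body;
    if (typeof raw !== "string") return baseFetch(input, init);

    let parsed: unknown;
    try {
      parsed = JSON.parse(raw);
    } catch {
      return baseFetch(input, init); // non-JSON body: forward unchanged
    }

    // Only chat completions carry a messages array; embeddings are skipped.
    const isChat =
      typeof parsed === "object" &&
      parsed !== null &&
      Array.isArray((parsed as ChatBody).messages);
    if (!isChat) return baseFetch(input, init);

    const body = JSON.stringify(transform(parsed as ChatBody));
    return baseFetch(input, { ...init, body });
  };
}

// Build the wrapper once per client instead of once per call.
class ExampleClient {
  private readonly fetchImpl: FetchLike;

  constructor(strategy: Strategy | false, baseFetch: FetchLike = globalThis.fetch) {
    this.fetchImpl = createNoThinkFetch(strategy, baseFetch);
  }

  post(url: string, payload: unknown): Promise<Response> {
    return this.fetchImpl(url, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(payload),
    });
  }
}
```

In this sketch, removing any key from the map is a type error, which is the compile-time coverage mentioned above, and the constructor is the only place a wrapper gets allocated.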

Covered Paths

  • OpenClaw plugin + Gateway: llm.disableThinking, wired through parseConfig → TdaiCore → StandaloneLLMRunnerFactory; Gateway also supports the TDAI_LLM_DISABLE_THINKING env var (accepts strategy names or booleans)
  • Offload local mode (L1/L1.5/L2): separate offload.disableThinking config

Error Handling

  • Unknown strategy name → console.warn + fallback to false (no injection)
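
A rough sketch of how normalization and the fallback could fit together, for discussion only. The function name normalizeDisableThinking, the case-insensitive matching, and the handling of the strings "true" and "false" are assumptions made here to cover the env-var case; the PR's actual validation may differ.

```typescript
// Hypothetical sketch; the PR's real normalization logic may differ.
const STRATEGIES = [
  "vllm", "deepseek", "dashscope", "openai", "anthropic", "kimi", "gemini",
] as const;
type Strategy = (typeof STRATEGIES)[number];

function normalizeDisableThinking(value: unknown): Strategy | false {
  // Unset or explicit false: no injection.
  if (value === undefined || value === null || value === false) return false;
  // Boolean true keeps its earlier meaning and maps to "vllm".
  if (value === true) return "vllm";

  if (typeof value === "string") {
    // Env vars arrive as strings, so boolean spellings are checked first.
    // Trimming and lowercasing are assumptions of this sketch.
    const v = value.trim().toLowerCase();
    if (v === "" || v === "false") return false;
    if (v === "true") return "vllm";
    if ((STRATEGIES as readonly string[]).includes(v)) return v as Strategy;
  }

  // Unknown strategy: warn and fall back to no injection.
  console.warn(`Unknown disableThinking strategy "${String(value)}"; falling back to false`);
  return false;
}
```

Under these assumptions, both config values and process.env.TDAI_LLM_DISABLE_THINKING could go through the same function, so a typo in a strategy name degrades to the pre-existing behavior instead of failing the request.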

Change Type

  • Bug fix
  • New feature
  • Documentation update
  • Code optimization

Self-test Checklist

  • Verified locally
  • No existing features affected

Additional Notes

Test Coverage

  • 20 vitest unit tests covering all 7 strategy injections, false passthrough, embedding skip, non-JSON tolerance, strategy validation, and normalization
  • Container E2E verification: triggered real L1 extraction with kimi-k2.6, qwen3.7-max (dashscope), and deepseek-v4-pro models, confirming correct parameter injection and valid extraction output

…tion models

Support multiple inference engines and model providers via a strategy
enum so each receives its own thinking-disabling field in
chat-completion request bodies.

Strategies:
- "vllm"      → chat_template_kwargs: { enable_thinking: false }  (vLLM / SGLang)
- "deepseek"  → enable_thinking: false  (DeepSeek official API)
- "dashscope" → enable_thinking: false  (Alibaba DashScope / Qwen)
- "openai"    → reasoning_effort: "low"  (OpenAI o-series, cannot fully disable)
- "anthropic" → thinking: { type: "disabled" }  (Anthropic Claude)
- "kimi"      → thinking: { type: "disabled" }  (Kimi / Moonshot)
- "gemini"    → thinking_config: { thinking_budget: 0 }  (Google Gemini)

Defaults to false (no wrapper). Boolean true is also accepted as
shorthand for "vllm".

Covered paths:
- StandaloneLLMRunner (OpenClaw plugin + gateway): llm.disableThinking,
  wired through parseConfig, tdai-core, seed-runtime, and gateway config
  (TDAI_LLM_DISABLE_THINKING env also accepts strategy names).
- Offload local mode (L1/L1.5/L2): separate offload.disableThinking.

The fetch wrapper lives in src/utils/no-think-fetch.ts with a
STRATEGY_TRANSFORMERS map for clean dispatch. StandaloneLLMRunner builds
it once in the constructor; LocalLlmClient caches it at construction
time and passes it through to callLlm().

Add vitest unit tests for all 7 strategies, normalization, validation,
embedding skip, and non-JSON tolerance (20 tests total).

Signed-off-by: jackson.jia <jiazhenghua0@gmail.com>
@Maxwell-Code07

Collaborator

Thanks for the contribution! Supporting disableThinking across 7 inference engines is a useful enhancement. We'll evaluate it internally and get back to you.
