fix(cli,core): harden OOM prevention — idempotent compaction tests, explicit GC, debug log defaults#4914
Cover the scenario fixed in commit 5957010, where already-compacted tool groups (`resultDisplay === UI_COMPACT_CLEARED_MESSAGE`) were incorrectly counted as having real output, causing over-compaction. Three new test cases:

- Already-compacted groups are not re-compacted; second call is a no-op
- All tool groups already compacted → no-op
- Mixed tool group (some tools real, some cleared) → only groups with real output are compacted
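The idempotency property those three tests pin down can be illustrated with a minimal sketch. It assumes hypothetical `ToolCall` and `ToolGroup` shapes, a placeholder marker value, and a `keepRecent` parameter; it is not the repository's `compactOldItems` implementation.

```typescript
// Hypothetical shapes and marker value, used only for illustration.
const UI_COMPACT_CLEARED_MESSAGE = "[cleared]";

interface ToolCall {
  name: string;
  resultDisplay?: string;
}

interface ToolGroup {
  tools: ToolCall[];
}

// A tool has real output only if its display is set and is not the cleared marker.
function hasRealOutput(tool: ToolCall): boolean {
  return (
    tool.resultDisplay !== undefined &&
    tool.resultDisplay !== UI_COMPACT_CLEARED_MESSAGE
  );
}

// Compacts every group except the most recent `keepRecent` ones.
// Groups without real output are skipped, which makes a second call a no-op.
function compactGroups(
  groups: ToolGroup[],
  keepRecent: number,
): { groups: ToolGroup[]; compacted: number } {
  let compacted = 0;
  const cutoff = Math.max(0, groups.length - keepRecent);
  const next = groups.map((group, index) => {
    if (index >= cutoff || !group.tools.some(hasRealOutput)) {
      return group;
    }
    compacted += 1;
    return {
      tools: group.tools.map((tool) =>
        hasRealOutput(tool)
          ? { ...tool, resultDisplay: UI_COMPACT_CLEARED_MESSAGE }
          : tool,
      ),
    };
  });
  return { groups: next, compacted };
}
```

Counting only groups that still have real output is what keeps repeated calls from over-compacting.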
- `enableExplicitGC` defaults to `true`, `--expose-gc` added to start/dev scripts
- `isDebugLogFileEnabled()` defaults to `false` (opt-in via `QWEN_DEBUG_LOG_FILE=1`)
- Add safety tests: `trigger_gc` only in critical tier, `global.gc()` only in `memoryPressureMonitor.ts` `trigger_gc` case
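The tier guard described in this commit can be sketched as a pure mapping from pressure tier to cleanup actions. The 0.8 critical threshold comes from the PR description; the soft and hard thresholds and the assignment of `compact_history` and `clear_file_cache` to the lower tiers are assumptions made for illustration only.

```typescript
type Tier = "normal" | "soft" | "hard" | "critical";
type Action = "compact_history" | "clear_file_cache" | "trigger_gc";

interface Thresholds {
  soft: number;
  hard: number;
  critical: number;
}

// Only `critical: 0.8` is taken from the PR text; soft and hard are assumed.
const ASSUMED_THRESHOLDS: Thresholds = { soft: 0.6, hard: 0.7, critical: 0.8 };

function classify(usedRatio: number, t: Thresholds = ASSUMED_THRESHOLDS): Tier {
  if (usedRatio >= t.critical) return "critical";
  if (usedRatio >= t.hard) return "hard";
  if (usedRatio >= t.soft) return "soft";
  return "normal";
}

// trigger_gc is appended last and only in the critical tier,
// after the steps that break references have run.
function actionsFor(tier: Tier, enableExplicitGC: boolean): Action[] {
  switch (tier) {
    case "normal":
      return [];
    case "soft":
      return ["compact_history"];
    case "hard":
      return ["compact_history", "clear_file_cache"];
    case "critical": {
      const actions: Action[] = ["compact_history", "clear_file_cache"];
      if (enableExplicitGC) actions.push("trigger_gc");
      return actions;
    }
  }
}
```

A test in the spirit of the tier guard would then assert that `trigger_gc` never appears in the `soft` or `hard` action lists.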
wenshao
left a comment
[Suggestion] packages/core/src/config/config.ts:935-937 — QWEN_MEMORY_ENABLE_GC=1 env var check is now redundant since DEFAULT_PRESSURE_CONFIG.enableExplicitGC defaults to true. No =0 opt-out path exists. Either add QWEN_MEMORY_ENABLE_GC=0 handling or remove the redundant code.
[Suggestion] packages/core/src/services/memoryPressureMonitor.ts:494-506 — Silent warning chain: when global.gc is unavailable in production, the trigger_gc warn log is swallowed because isDebugLogFileEnabled() now defaults to false. The GC unavailability is invisible to users. Consider using console.warn or stderr for this diagnostic so it is not gated on debug log file settings.
— DeepSeek/deepseek-v4-pro via Qwen Code /review
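One way to act on the second suggestion is to route the "GC unavailable" diagnostic through a caller-supplied warning sink rather than the debug log file. The sketch below is an illustration under that assumption; `runExplicitGc` and its parameters are hypothetical names, not functions from the repository.

```typescript
type GcFn = () => void;
type GcOutcome = "ran" | "unavailable";

// `gc` is whatever the runtime exposes (undefined without --expose-gc).
// `warn` is a sink that is not gated on debug log file settings.
function runExplicitGc(
  gc: GcFn | undefined,
  warn: (message: string) => void,
): GcOutcome {
  if (typeof gc !== "function") {
    warn(
      "global.gc is not available; start Node with --expose-gc to enable explicit GC",
    );
    return "unavailable";
  }
  gc();
  return "ran";
}
```

In a Node process the caller could pass `globalThis.gc` and `console.warn`, so the diagnostic reaches stderr even when `QWEN_DEBUG_LOG_FILE` is unset.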
- Replace brittle source-parsing test with behavioral tests for `global.gc()`
- Export `UI_COMPACT_CLEARED_MESSAGE` constant and use in tests
- Remove redundant `NODE_OPTIONS` override from start script
- Add production bin wrapper with `--expose-gc` for OOM protection
- Remove unused `path` import from `memoryPressureMonitor.test.ts`

Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>
Thanks for the review! @wenshao All R1 comments addressed:
wenshao
left a comment
[Suggestion] `--expose-gc` does not reach all deployment modes. Two execution paths were not updated:

- `scripts/create-standalone-package.js` (lines 494, 503): standalone package shims (Unix + Windows) launch `node cli.js` without `--expose-gc`
- `AcpBridge.ts:61` and `httpAcpBridge.ts:4165`: daemon-spawned sessions use `[cliEntry, '--acp']` without forwarding `process.execArgv` (contrast with `relaunch.ts:44`, which correctly spreads `...process.execArgv`)
With enableExplicitGC: true, these paths hit the "global.gc is not available" fallback at critical pressure — GC won't actually run, undermining OOM prevention for standalone and daemon deployments.
— qwen3.7-plus via Qwen Code /review
DragonnZhang
left a comment
Reviewed the PR-specific changes (OOM prevention: idempotency tests, explicit GC, debug log defaults). The existing review comments from @wenshao cover the significant issues — signal forwarding in cli-entry.js and the --expose-gc coverage gap. The R1 fixes (behavioral GC tests, UI_COMPACT_CLEARED_MESSAGE import) addressed prior feedback well. No additional findings. — qwen-code-reviewer via Qwen Code /review
Standalone package shims and daemon-spawned sessions (AcpBridge, httpAcpBridge) were missing --expose-gc, causing explicit GC to silently fail under critical memory pressure. Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>
Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>
Thanks for the review! @wenshao R2 comments addressed:

Pushed a new commit.
DragonnZhang
left a comment
Review Summary
Reviewed PR #4914 which hardens OOM prevention through idempotent compaction tests, explicit GC enablement, and debug log defaults.
Changes Overview
- Added regression tests for `compactOldItems` idempotency (3 test cases)
- Enabled explicit GC by default (`enableExplicitGC: true`) with `--expose-gc` in startup scripts
- Disabled debug logging by default (opt-in via `QWEN_DEBUG_LOG_FILE=1`)
- Added production bin wrapper (`scripts/cli-entry.js`) with `--expose-gc`
- Forwarded `--expose-gc` to standalone packages and daemon-spawned sessions
Findings
1 Critical issue found:

- `scripts/cli-entry.js` is not marked as executable (mode 100644) despite being a bin entry point with a shebang
Assessment
The PR addresses the OOM prevention gaps identified in prior reviews. The implementation is sound:
- Test coverage is comprehensive and behavioral (not brittle source-parsing)
- Signal forwarding in `cli-entry.js` correctly handles child process termination
- The `--expose-gc` flag forwarding is now consistent across all deployment modes
- Removing the redundant `QWEN_MEMORY_ENABLE_GC` env var check simplifies the config
The executable bit issue on cli-entry.js should be fixed before merge to ensure proper bin behavior across all installation methods.
Verdict: Request Changes (1 critical finding)
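For readers unfamiliar with the wrapper pattern under review, the decision logic of a bin wrapper that re-spawns Node with `--expose-gc` can be sketched as two pure helpers. The names `wrapperArgs` and `outcomeFor` are hypothetical and the fallback exit code of 1 is an assumption; the `status` and `signal` fields mirror what Node's `child_process.spawnSync` returns.

```typescript
// Mirrors the relevant fields of the object returned by child_process.spawnSync.
interface ChildResult {
  status: number | null;
  signal: string | null;
}

type WrapperOutcome =
  | { kind: "signal"; signal: string }
  | { kind: "exit"; code: number };

// Node arguments for the child: the GC flag first, then the real CLI and user args.
function wrapperArgs(cliPath: string, userArgs: string[]): string[] {
  return ["--expose-gc", cliPath, ...userArgs];
}

// If the child died from a signal, the wrapper should report that signal;
// otherwise it should exit with the child's status (assumed fallback: 1).
function outcomeFor(result: ChildResult): WrapperOutcome {
  if (result.signal !== null) {
    return { kind: "signal", signal: result.signal };
  }
  return { kind: "exit", code: result.status ?? 1 };
}
```

A wrapper built on these helpers would call `spawnSync(process.execPath, wrapperArgs(cliPath, process.argv.slice(2)), { stdio: "inherit" })`, then either re-raise the signal with `process.kill(process.pid, signal)` or call `process.exit(code)`.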
Local runtime verification report (head 74b690e)

Verdict: FAIL. One deployment-channel gap; everything else verified working at runtime. Every behavior this PR claims was exercised in real tmux-driven sessions and works:

Environment / method

Steps (running app, captured output)

The gap, precisely

Other observations (non-blocking)

Repro (forced critical tier)

QWEN_DEBUG_LOG_FILE=1 QWEN_CODE_NO_RELAUNCH=true \
NODE_OPTIONS=--max-old-space-size=256 \
QWEN_MEMORY_PRESSURE_SOFT=0.3 QWEN_MEMORY_PRESSURE_HARD=0.31 QWEN_MEMORY_PRESSURE_CRITICAL=0.32 \
npm start -- -y -p "Use the read_file tool to read package.json, reply with only the version."
# then: grep -E 'trigger_gc|global.gc' ~/.qwen/debug/$(readlink ~/.qwen/debug/latest)

Chinese summary

Verdict: FAIL. Only one release-channel gap; everything else passed verification on a real machine.
…MORY_ENABLE_GC=0 opt-out

Thanks for the review! @wenshao All R3 comments addressed:

Thanks for the review! @DragonnZhang R4 comments addressed:
…m-V2.5 # Conflicts: # packages/cli/src/serve/httpAcpBridge.ts
…nnel

- Add `--expose-gc` to `getAcpMemoryArgs()` so daemon-spawned ACP children have `global.gc()` available for critical memory pressure cleanup
- Filter `--inspect`/`-brk` flags from `process.execArgv` to prevent port conflicts in multi-session daemon mode
- Update `spawnChannel.test.ts` for the new `getAcpMemoryArgs()` return shape

This change was previously in `httpAcpBridge.ts` but was lost during the daemon refactor merge (QwenLM#4490) that moved spawn logic to acp-bridge.
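The forwarding rule in this commit can be sketched as a pure function over the parent's `execArgv`. The helper name `childNodeArgs` is hypothetical, and the deduplication step is an optional extra that this commit does not claim to perform; it is not the actual `getAcpMemoryArgs()` or `createSpawnChannelFactory` code.

```typescript
// Builds the Node flags for a daemon-spawned child:
//  1. drop every --inspect* flag so children do not fight over the debugger port,
//  2. append the memory args and an unconditional --expose-gc,
//  3. remove exact duplicates (optional; a repeated flag is harmless).
function childNodeArgs(parentExecArgv: string[], memoryArgs: string[]): string[] {
  const forwarded = parentExecArgv.filter((arg) => !arg.startsWith("--inspect"));
  const merged = [...forwarded, ...memoryArgs, "--expose-gc"];
  return merged.filter((arg, index) => merged.indexOf(arg) === index);
}
```

Called with the parent's `process.execArgv`, this yields a child command line that always carries `--expose-gc` exactly once, whether or not the parent was started with it.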
70aec2c to 7a1a5bf
DragonnZhang
left a comment
Reviewed the full diff (33 files, +1216/-435). The three substantive changes are clean and well-tested:

- Idempotent compaction tests: three focused regression tests that cover already-compacted groups, all-compacted, and mixed groups. Exporting `UI_COMPACT_CLEARED_MESSAGE` is the minimum surface change needed.
- Explicit GC by default: `enableExplicitGC: true` is correct; the env-var semantics flip from opt-in to opt-out cleanly. `--expose-gc` propagation through `start.js`, `dev.js`, `cli-entry.js`, `spawnChannel.ts`, and `AcpBridge.ts` is consistent. The inspect-flag filter on `process.execArgv` correctly prevents debug-port conflicts in child processes. The new GC safety tests (tier guard + call-site guard) provide good regression coverage.
- Debug log default off: straightforward inversion with a sensible addition of the empty string to the deny list.

No bugs, security issues, or logic errors found. The bulk of the line count is Prettier reformatting and markdown table alignment, which is noise but harmless.
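The two environment-variable defaults discussed in this review can be sketched as pure functions over an environment map. The function names are hypothetical and the deny list beyond the empty string is an assumption; only the defaults, the `=1` opt-in, the `=0` opt-out, and the `trim().toLowerCase()` normalisation come from this thread.

```typescript
type Env = Record<string, string | undefined>;

// Assumed deny list; the thread only confirms that the empty string is on it.
const ASSUMED_DENY_LIST = ["", "0", "false", "off", "no"];

// Debug log file: off when unset, off for deny-listed or whitespace-only values.
function debugLogEnabled(env: Env): boolean {
  const raw = env["QWEN_DEBUG_LOG_FILE"];
  if (raw === undefined) return false;
  return ASSUMED_DENY_LIST.indexOf(raw.trim().toLowerCase()) === -1;
}

// Explicit GC: on by default, with "0" as the opt-out value.
function explicitGcEnabled(env: Env): boolean {
  const raw = env["QWEN_MEMORY_ENABLE_GC"];
  if (raw === undefined) return true;
  return raw.trim() !== "0";
}
```

Passing the environment in as a parameter, rather than reading `process.env` directly, keeps both defaults easy to test without leaking state between test cases.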
@DragonnZhang Thank you for the thorough review and approval! 🙏

I've just merged the latest upstream changes from

1c2a79e to ddf7147

Thanks for the detailed reviews and approval! @wenshao @DragonnZhang I've resolved the merge conflicts with

@qwen-code /triage
Thanks for the PR! Solid follow-up to the OOM work in #4824. These are practical hardening changes.

Template is mostly complete ✓: it has "What this PR does", "Why it's needed", and a Reviewer Test Plan with verification steps. Missing the "Risk & Scope" and "Tested on" sections from the template, which would be useful here given that

On direction: Aligned. The three changes directly address the OOM bug from #4815. Regression tests for the compaction idempotency fix are important; that kind of guard is exactly what prevents silent reintroduction. Enabling explicit GC at critical pressure is a reasonable safety net (zero cost when not triggered, and the tier guards are well-tested). Disabling debug log by default is a sensible I/O reduction for normal usage.

On approach: The core logic is tight, but the PR is noisier than it needs to be. About 10 of the 32 changed files are pure Prettier reformatting (markdown tables in

One thing worth thinking about:

Not blockers. Moving on to code review. 🔍

Chinese version

Thanks for the contribution! This is solid follow-up work to the #4824 OOM fix.

The template is mostly complete ✓, with "What this PR does", "Why it's needed", and a Reviewer Test Plan. The "Risk & Scope" and "Tested on" sections are missing; considering

Direction: Aligned. The three changes directly target the OOM problem in #4815. The compaction idempotency regression tests matter because they prevent accidental reintroduction later. Enabling explicit GC under critical pressure is a reasonable safety net (zero overhead when not triggered, and the pressure-tier guards are well covered by tests). Turning the debug log off by default reduces I/O in normal use.

Approach: The core logic is tight, but the PR includes about 10 pure Prettier formatting files (

Non-blocking; moving on to code review 🔍

Qwen Code · qwen3.7-max
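Code Review

The core logic changes are clean: no correctness bugs or security issues found.

Substantive changes (all sound):

- `memoryPressureMonitor.ts`: one-line flip of `enableExplicitGC`.
- `config.ts`: `QWEN_MEMORY_ENABLE_GC` changes from opt-in to opt-out.
- `debugLogger.ts`: returns `false` by default when the environment variable is unset.
- `cli-entry.js`: uses `spawnSync` to inject `--expose-gc`; signal and exit-code handling is correct.
- `spawnChannel.ts` / `AcpBridge.ts`: forward `process.execArgv` (filtering `--inspect*`).

Formatting-only changes (~10 files): As flagged in Stage 1, splitting these out is recommended.

One maintenance concern: the `cli-entry.js` content is kept in both `scripts/cli-entry.js` and `scripts/prepare-package.js`, so the two copies can drift.

Real-Scenario Testing

Before (installed qwen 0.18.0): ran normally; debug log enabled by default, session file created (16,908 bytes).

After (PR branch via `npm run dev`): ran normally; no debug log file created by default. Behavior change verified.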
| Test file | Result |
|---|---|
| `useHistoryManager.test.ts` | 22/22 passed ✅ |
| `memoryPressureMonitor.test.ts` | 72/72 passed ✅ |
| `config.test.ts` | 211/211 passed ✅ |
| `spawnChannel.test.ts` | 20/20 passed ✅ |
| `loggers.test.ts` | 50/50 passed ✅ |
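Chinese version

Code review

The core logic changes are clean, with no correctness bugs or security issues.

Substantive changes (all sound):

- `memoryPressureMonitor.ts`: one-line flip of `enableExplicitGC`.
- `config.ts`: `QWEN_MEMORY_ENABLE_GC` changes from opt-in to opt-out.
- `debugLogger.ts`: returns `false` by default when the environment variable is unset.
- `cli-entry.js`: uses `spawnSync` to inject `--expose-gc`; signal and exit-code handling is correct.
- `spawnChannel.ts` / `AcpBridge.ts`: pass through `process.execArgv` (filtering `--inspect*`).

Maintenance concern: the `cli-entry.js` content exists in both `scripts/cli-entry.js` and `scripts/prepare-package.js`, which creates a risk of drift.

Formatting changes (~10 files): as noted in Stage 1, splitting them out is recommended.

Real-world testing

- Before (installed qwen 0.18.0): ran normally; debug log created by default (16,908 bytes).
- After (PR branch, `npm run dev`): ran normally; debug log not created by default. Behavior change verified.
- Unit tests: all 375 tests across 5 test suites passed.

Qwen Code · qwen3.7-max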
This PR does what it says on the tin: three focused hardening changes that directly address the OOM pain from #4815, backed by solid regression tests. The before/after testing confirms the debug log default change works, and all 375 unit tests pass across the five affected suites.

My independent proposal for this problem would have been nearly identical to the PR's core changes. The one thing the PR does that I wouldn't have thought of is the

The two concerns from earlier stages are real but not blockers: the ~10 formatting-only files dilute the diff and add git blame noise (worth splitting next time, not worth blocking this PR over), and the

Approving. ✅

Chinese version

This PR delivers what it promises: three focused hardening changes for the #4815 OOM problem, backed by solid regression tests. Before/after testing verified the behavior change of the debug log being off by default, and all 375 tests across 5 test suites passed.

The approach I would have proposed independently is nearly identical to the PR's core changes. The one step where the PR went further than I would have is

Two non-blocking concerns: the ~10 pure formatting files add diff noise (splitting them next time is recommended);

Approved. ✅

Qwen Code · qwen3.7-max
qwen-code-ci-bot
left a comment
LGTM, looks ready to ship. ✅
@wenshao Thanks for the reviews!
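🔁 Re-verification, head `ddf714784`: new daemon GC forwarding path verified; LGTM stands

Third re-verification, updating my FAIL report (head `74b690ede`, the npm-publish artifact gap) and my LGTM (head `94cb9fc`, gap fixed and everything verified). Since that LGTM the branch gained `7a1a5bf45` (acp-bridge `spawnChannel`: forward `--expose-gc`, filter `--inspect`) plus two merges from main. Re-verified the current head on Linux 6.12 / Node v22.22.2 (fresh `ddf714784` worktree, `npm ci` + tsc/bundle, tmux-driven real `qwen serve`).

Verdict: LGTM stands. The new daemon ACP child GC forwarding path works (`--expose-gc` delivered, `--inspect` filtered), my original npm-publish blocker stays fixed, and the regression coverage is load-bearing. One earlier non-blocking test-hygiene nit (#1) is still open.

1. New: daemon-spawned ACP child GC forwarding (`7a1a5bf45`)

`getAcpMemoryArgs()` now always appends `--expose-gc`, and `createSpawnChannelFactory` filters `--inspect`/`-brk` from `process.execArgv`. This restores the forwarding that was lost during the #4490 daemon refactor merge. What I verified earlier was the old `httpAcpBridge` path; this time it is the acp-bridge `spawnChannel` path that `qwen serve` now uses. Measured (serve + `POST /session`, reading the child's `/proc/<pid>/cmdline`):

| Parent daemon launched as | Spawned ACP child cmdline | Result |
|---|---|---|
| `node … serve` (no `--expose-gc`) | `node --max-old-space-size=16384 --expose-gc … --acp` | ✅ child gets `--expose-gc` anyway (unconditional `getAcpMemoryArgs`); no `--inspect` |
| `node --inspect=127.0.0.1:9559 --expose-gc … serve` | `node --expose-gc --max-old-space-size=16384 --expose-gc … --acp` | ✅ `--inspect` filtered (no debugger-port collision); `--expose-gc` kept |

- `spawnChannel.test.ts`: 20/20. Mutation check: deleting `'--expose-gc'` from `getAcpMemoryArgs()` → `always includes --expose-gc …` fails, so the test is load-bearing.
- Minor (cosmetic, non-blocking): when the parent does have `--expose-gc`, the child carries it twice (once via forwarded `execArgv`, once via the unconditional memory args). Harmless duplicate; could dedupe.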
2. Original FAIL (06-10) stays fixed — npm-publish artifact
Ran npm run bundle && npm run prepare:package on this head and inspected the published dist/ layout:
- `dist/cli-entry.js` present; `dist/package.json` → `bin: { qwen: "cli-entry.js" }`; `files` includes `cli-entry.js`.
- The generated dist wrapper resolves `join(__dirname, 'cli.js')`, the correct dist-root path (exactly the tweak I flagged; the repo-root wrapper still uses `../dist/cli.js`).
- It spawns `node --expose-gc cli.js …`; `node dist/cli-entry.js --version` → `0.17.1`, exit 0. (My LGTM already did the full `npm pack → install -g → global.gc() under critical pressure` run; this confirms the merges didn't regress it.)
3. Regression tests — green & load-bearing
| Suite | Result |
|---|---|
| `useHistoryManager.test.ts` (compaction idempotency) | 22/22 |
| `memoryPressureMonitor.test.ts` | 72/72 |
| `config.test.ts` "explicit GC is enabled by default" | ✅ |
| `spawnChannel.test.ts` | 20/20 |
Mutation: dropping the !== UI_COMPACT_CLEARED_MESSAGE guard in compactOldItems → should not re-compact already-compacted tool groups (idempotent) fails. (Note: the PR's net useHistoryManager.ts change is now just export const UI_COMPACT_CLEARED_MESSAGE — the guard itself already lives in main; the PR's value here is the new regression test, which is what the mutation confirms.)
4. Status of my earlier findings
- Original FAIL (npm-publish wrapper) → ✅ fixed (above), exactly my suggested `writeDistPackageJson`-emits-wrapper fix.
- #6 whitespace `QWEN_DEBUG_LOG_FILE` → ✅ fixed (`value.trim().toLowerCase()`).
- #1 `config.test.ts` doesn't scrub `QWEN_MEMORY_ENABLE_GC` → ⚠️ still open. Reproduced: `QWEN_MEMORY_ENABLE_GC=0 vitest … -t "GC"` → `explicit GC is enabled by default` fails (the knob is still live at `config.ts:949`; it's just not in `MEMORY_PRESSURE_ENV_KEYS`). Non-blocking test-hygiene nit: re-adding the key to the scrub list (plus a dedicated opt-out test) would close it.
- #2 wrapper signal relay / #3 stale `~/.qwen/debug` UX hints / #4 ~46 MiB wrapper RSS / #5 Windows specifics → unchanged non-blocking observations.
5. CI
ddf714784: Lint ✅, CodeQL ✅, Test ubuntu/macOS/Windows ✅, Classify/review-config ✅.
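🇨🇳 Chinese version

🔁 Re-verification, head `ddf714784`: the new daemon GC forwarding path is verified; LGTM stands

Third re-verification, updating my FAIL report (head `74b690ede`, the npm-publish artifact gap) and my LGTM (head `94cb9fc`, gap fixed and everything verified). Since that LGTM, the branch has gained `7a1a5bf45` (acp-bridge `spawnChannel`: forward `--expose-gc`, filter `--inspect`) plus two merges from main. I re-verified the current head on Linux 6.12 / Node v22.22.2 (fresh `ddf714784` worktree, `npm ci` + tsc/bundle, tmux-driven real `qwen serve`).

Verdict: LGTM stands. The new daemon ACP child GC forwarding path works (`--expose-gc` delivered, `--inspect` filtered), my original npm-publish blocker remains fixed, and the regression coverage is load-bearing. One earlier non-blocking test-hygiene nit (#1) is still open.

1. New: daemon-spawned ACP child GC forwarding (`7a1a5bf45`)

`getAcpMemoryArgs()` now always appends `--expose-gc`, and `createSpawnChannelFactory` filters `--inspect`/`-brk` from `process.execArgv`. This restores the forwarding that was "lost during the #4490 daemon refactor merge". What I verified earlier was the old `httpAcpBridge` path; this time it is the acp-bridge `spawnChannel` path that `qwen serve` now uses. Measured (serve + `POST /session`, reading the child's `/proc/<pid>/cmdline`):

| Parent daemon launched as | Spawned ACP child cmdline | Result |
|---|---|---|
| `node … serve` (no `--expose-gc`) | `node --max-old-space-size=16384 --expose-gc … --acp` | ✅ child still gets `--expose-gc` (unconditional `getAcpMemoryArgs`); no `--inspect` |
| `node --inspect=127.0.0.1:9559 --expose-gc … serve` | `node --expose-gc --max-old-space-size=16384 --expose-gc … --acp` | ✅ `--inspect` filtered (no debug-port conflict); `--expose-gc` kept |

- `spawnChannel.test.ts`: 20/20. Mutation check: deleting `'--expose-gc'` from `getAcpMemoryArgs()` → `always includes --expose-gc …` fails, so the test is load-bearing.
- Minor (cosmetic, non-blocking): when the parent itself has `--expose-gc`, the child carries it twice (once from the forwarded `execArgv`, once from the unconditional memory args). Harmless duplicate; could be deduplicated.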
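2. The original FAIL (06-10) remains fixed: npm-publish artifact

Ran `npm run bundle && npm run prepare:package` on this head and inspected the published `dist/` layout:

- `dist/cli-entry.js` exists; `dist/package.json` → `bin: { qwen: "cli-entry.js" }`; `files` includes `cli-entry.js`.
- The generated dist wrapper resolves `join(__dirname, 'cli.js')`, the correct path under the dist root (exactly the adjustment I pointed out; the repo-root wrapper still uses `../dist/cli.js`).
- It spawns `node --expose-gc cli.js …`; `node dist/cli-entry.js --version` → `0.17.1`, exit 0. (My LGTM already did the full `npm pack → install -g → global.gc() under critical pressure` run; this confirms the merges did not cause a regression.)

3. Regression tests: all green and load-bearing

| Suite | Result |
|---|---|
| `useHistoryManager.test.ts` (compaction idempotency) | 22/22 |
| `memoryPressureMonitor.test.ts` | 72/72 |
| `config.test.ts` "explicit GC is enabled by default" | ✅ |
| `spawnChannel.test.ts` | 20/20 |

Mutation: removing the `!== UI_COMPACT_CLEARED_MESSAGE` guard in `compactOldItems` → `should not re-compact already-compacted tool groups (idempotent)` fails. (Note: this PR's net change to `useHistoryManager.ts` is now just `export const UI_COMPACT_CLEARED_MESSAGE`; the guard itself is already in main. The PR's value here is the new regression test, which is exactly what the mutation confirms.)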
4. Status of my earlier findings

- Original FAIL (npm-publish wrapper) → ✅ fixed (see above), exactly the approach I suggested of having `writeDistPackageJson` generate the wrapper.
- #6 whitespace `QWEN_DEBUG_LOG_FILE` → ✅ fixed (`value.trim().toLowerCase()`).
- #1 `config.test.ts` does not scrub `QWEN_MEMORY_ENABLE_GC` → ⚠️ still open. Reproduced: `QWEN_MEMORY_ENABLE_GC=0 vitest … -t "GC"` → `explicit GC is enabled by default` fails (the knob is still live at `config.ts:949`; it is just not in `MEMORY_PRESSURE_ENV_KEYS`). Non-blocking test-hygiene nit: adding it back to the scrub list (plus a dedicated opt-out test) would close it.
- #2 wrapper signal relay / #3 stale `~/.qwen/debug` hint text / #4 ~46 MiB wrapper resident memory / #5 Windows specifics → unchanged, all non-blocking observations.

5. CI

`ddf714784`: Lint ✅, CodeQL ✅, Test ubuntu/macOS/Windows ✅, Classify/review-config ✅.
Two notes: the first one's a real blocker, the second is just a design thought (non-blocking).

Release is broken right now. The standalone packager enforces a strict dist allowlist and the new

One thing I'd love your take on (non-blocking): the OOM hardening here is clearly worth doing. I'm just not sure the bin wrapper is the cheapest way to get

Opened #5154 to track this so it doesn't get lost here. No rush, purely for discussion.
The OOM-prevention work in #4914 added a `dist/cli-entry.js` bin wrapper (re-spawns `node --expose-gc cli.js`) via `prepare-package.js`, but did not register it in the standalone packager's strict dist allowlist. The release job then fails with:

Error: Unexpected dist asset: .../dist/cli-entry.js

Add `cli-entry.js` to `DIST_ALLOWED_ENTRIES`, the same fix that #5049 applied for `fzfWorker.js`.
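The failure mode described here, a strict allowlist rejecting a newly generated file, can be illustrated with a small sketch. The helper names and the example allowlist contents are assumptions; only the error message format and the `cli-entry.js` entry come from the commit message.

```typescript
// Returns every dist entry that is not on the allowlist.
function findUnexpectedDistAssets(entries: string[], allowed: string[]): string[] {
  return entries.filter((entry) => allowed.indexOf(entry) === -1);
}

// Fails fast on the first unexpected asset, mirroring a strict packager check.
function assertDistAllowed(entries: string[], allowed: string[]): void {
  const unexpected = findUnexpectedDistAssets(entries, allowed);
  if (unexpected.length > 0) {
    throw new Error(`Unexpected dist asset: ${unexpected[0]}`);
  }
}
```

Because the check is strict by design, any build step that emits a new file into `dist/` has to update the allowlist in the same change.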
…xplicit GC, debug log defaults (#4914)

* test(cli): add compactOldItems idempotency regression tests

  Cover the scenario fixed in commit 5957010 where already-compacted tool groups (`resultDisplay === UI_COMPACT_CLEARED_MESSAGE`) were incorrectly counted as having real output, causing over-compaction. Three new test cases:
  - Already-compacted groups are not re-compacted; second call is a no-op
  - All tool groups already compacted → no-op
  - Mixed tool group (some tools real, some cleared) → only groups with real output are compacted

* fix(cli,core): enable explicit GC and disable debug log by default
  - `enableExplicitGC` defaults to `true`, `--expose-gc` added to start/dev scripts
  - `isDebugLogFileEnabled()` defaults to `false` (opt-in via `QWEN_DEBUG_LOG_FILE=1`)
  - Add safety tests: `trigger_gc` only in critical tier, `global.gc()` only in `memoryPressureMonitor.ts` `trigger_gc` case

* fix: address R1 review comments for memory pressure monitor
  - Replace brittle source-parsing test with behavioral tests for `global.gc()`
  - Export `UI_COMPACT_CLEARED_MESSAGE` constant and use in tests
  - Remove redundant `NODE_OPTIONS` override from start script
  - Add production bin wrapper with `--expose-gc` for OOM protection
  - Remove unused `path` import from `memoryPressureMonitor.test.ts`

  Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>

* fix: forward --expose-gc to all deployment modes

  Standalone package shims and daemon-spawned sessions (AcpBridge, httpAcpBridge) were missing `--expose-gc`, causing explicit GC to silently fail under critical memory pressure.

  Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>

* fix: forward child process signal in cli-entry wrapper

  Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>

* fix(cli,channels): filter --inspect flags when forwarding execArgv to daemon children
* fix: make cli-entry.js executable (mode 100755)
* fix(core): reject whitespace-only QWEN_DEBUG_LOG_FILE and add QWEN_MEMORY_ENABLE_GC=0 opt-out
* fix(scripts): include cli-entry.js wrapper in dist package for npm publish
* fix(acp-bridge): forward --expose-gc and filter --inspect in spawnChannel
  - Add `--expose-gc` to `getAcpMemoryArgs()` so daemon-spawned ACP children have `global.gc()` available for critical memory pressure cleanup
  - Filter `--inspect`/`-brk` flags from `process.execArgv` to prevent port conflicts in multi-session daemon mode
  - Update `spawnChannel.test.ts` for new `getAcpMemoryArgs()` return shape

  This change was previously in `httpAcpBridge.ts` but lost during the daemon refactor merge (#4490) that moved spawn logic to acp-bridge.

---------

Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>
What this PR does

Closes: #4815

Follow-up to #4824

1. Regression tests for compactOldItems idempotency

PR #4824 (commit 5957010) fixed a counting bug where already-compacted tool groups were treated as having real output. No test covered that fix. This PR adds three tests:

- Already-compacted groups (`resultDisplay === UI_COMPACT_CLEARED_MESSAGE`) are skipped; second call is a no-op
- All tool groups already compacted → no-op
- Mixed tool group (some tools real, some cleared) → only groups with real output are compacted

2. Enable explicit GC at critical pressure

`enableExplicitGC` defaults to `true` now. `--expose-gc` is added to `start.js`, `dev.js`, and the `npm start` script. Two safety tests added:

- `trigger_gc` only appears in the `critical` tier (≥80%), not `soft` or `hard`
- `global.gc()` is only called from the `trigger_gc` case in `memoryPressureMonitor.ts`, a source-level guard against future misuse

This is safe because `global.gc()` only reclaims unreachable objects. The cleanup steps (`compact_history`, `clear_file_cache`) run first and break references; `trigger_gc` runs last to immediately free old_space.

3. Disable debug log file by default

`isDebugLogFileEnabled()` now returns `false` when `QWEN_DEBUG_LOG_FILE` is unset. The memory monitoring added in PR #4824 writes a debug line every 30s, which is unnecessary I/O for normal usage. Users can opt in with `QWEN_DEBUG_LOG_FILE=1`.

Why it's needed

- … `!== UI_COMPACT_CLEARED_MESSAGE` guard and reintroduce over-compaction.
- … `enableExplicitGC: false`, V8 may not reclaim old_space fast enough after `compact_history` breaks references, which can lead to OOM in that window.

Reviewer Test Plan

Before / After

- `enableExplicitGC`: `false` → `true`
- `--expose-gc` in startup scripts: `start.js`, `dev.js`, `npm start`
- `trigger_gc` tier guard test: `critical`
- `global.gc()` call-site guard test
- `QWEN_DEBUG_LOG_FILE=1`

How to verify

Manual: trigger critical memory pressure → debug log should contain `global.gc() freed N bytes`.

Verify debug log off by default: after startup, no new file under `~/.qwen/debug/`. Set `QWEN_DEBUG_LOG_FILE=1`, restart, file appears.

Chinese version
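Closes: #4815

Changes

1. compactOldItems idempotency regression tests

PR #4824 (commit 5957010) fixed a bug where already-compacted tool groups were counted again, but added no tests. Three are added: already-compacted groups are skipped, all-compacted is skipped, and a mixed group compacts only what has real output.

2. Enable global.gc() by default under critical pressure

`enableExplicitGC` changes to `true`, and `--expose-gc` is added to `start.js` / `dev.js` / `npm start`. Two safety tests are added: `trigger_gc` appears only at the critical level; `global.gc()` is called only in the `trigger_gc` case of `memoryPressureMonitor.ts`.

Safety: `global.gc()` only reclaims unreferenced objects. `compact_history` / `clear_file_cache` break references first, and `trigger_gc` runs last to reclaim old_space immediately.

3. Debug log off by default

`isDebugLogFileEnabled()` returns `false` when `QWEN_DEBUG_LOG_FILE` is unset. The V2 memory monitor writes a line every 30 seconds, which is unnecessary. Users opt in manually with `QWEN_DEBUG_LOG_FILE=1`.

Verification

Manual: trigger critical memory pressure and check whether the log contains `global.gc() freed N bytes`. After startup there is no new file under `~/.qwen/debug/`; after setting `QWEN_DEBUG_LOG_FILE=1` and restarting, there is.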