fix(plugins): add default timeout for before_compaction/after_compaction hooks #84153
Codex review: passed. Workflow note: Future ClawSweeper reviews update this same comment in place.
Summary
Reproducibility: yes, from source inspection: current main has no defaults for these two void hooks, and awaited
Review details
Best possible solution: Land the central hook-runner default if maintainers accept 30 seconds as the lifecycle hook budget, while preserving per-hook
Do we have a high-confidence way to reproduce the issue? Yes, from source inspection: current main has no defaults for these two void hooks, and awaited
Is this the best way to solve the issue? Yes, the central default table is the narrowest maintainable place to bound all callers while preserving the existing per-registration override. The only open solution-fit decision is whether maintainers accept 30 seconds as the fail-open default budget.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 3bc728eaa993.
@clawsweeper automerge
jalehman
left a comment
Reviewed locally with the maintainer review workflow. No actionable findings from me; approving.
8297a29 to e85c9d6
76cf75d to 1a046cc
…efault timeout
DEFAULT_VOID_HOOK_TIMEOUT_MS_BY_HOOK only listed agent_end, so the before_compaction and after_compaction void hooks ran fully unbounded when a plugin supplied no hook.timeoutMs. In the codex agent harness these hooks fire on the serialized notification queue, so a slow or hung handler froze processing of every later codex notification, including turn/completed, hanging the whole agent turn.
Add defensive default timeout entries for both hooks, mirroring the existing agent_end pattern. The budget matches agent_end's 30s rather than the tighter modifying-hook defaults because compaction hooks can legitimately do real work (e.g. a memory flush). The runner is fail-open for void hooks, so a timed-out handler is logged and compaction proceeds.
1a046cc to 41fa5fe
Merged via squash.
Thanks @100yenadmin!
…026.5.20) (#615)
Downstream Renovate Bot update of ghcr.io/openclaw/openclaw from `2026.5.19` to `2026.5.20` (reviewed on https://git.erwanleboucher.dev/eleboucher/homelab/pulls/615). The v2026.5.20 release notes list this change under Fixes:
- Plugins/hooks: apply a default 30-second timeout to `before_compaction` and `after_compaction` hooks so a hung plugin handler no longer blocks compaction completion. (#84153)
Bug
DEFAULT_VOID_HOOK_TIMEOUT_MS_BY_HOOK in src/plugins/hooks.ts listed only agent_end. The before_compaction and after_compaction plugin hooks are void hooks (runBeforeCompaction / runAfterCompaction → runVoidHook(...)), and runVoidHook applies a timeout only when getVoidHookTimeoutMs returns a value. With no table entry and no plugin-supplied hook.timeoutMs, both hooks ran fully unbounded.
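The lookup described above can be pictured with a minimal sketch. Everything here is a simplified assumption for illustration: the type names, the `resolveVoidHookTimeoutMs` helper, and the `DEFAULTS_BEFORE_FIX` constant are hypothetical stand-ins, not the code in src/plugins/hooks.ts. Only the hook names, the per-registration `timeoutMs` override, and the 30s agent_end budget come from the PR text.

```typescript
// Hypothetical, simplified types; the real definitions live in src/plugins/hooks.ts.
type VoidHookName = "agent_end" | "before_compaction" | "after_compaction";

interface HookRegistration {
  hookName: VoidHookName;
  timeoutMs?: number; // optional per-registration override (hook.timeoutMs)
}

// Default table as the PR describes it on main before the fix: only agent_end is listed.
const DEFAULTS_BEFORE_FIX: Partial<Record<VoidHookName, number>> = {
  agent_end: 30_000,
};

// Assumed lookup order: per-registration override first, then the default table.
// A result of undefined means "no timeout", so the handler is awaited unbounded.
function resolveVoidHookTimeoutMs(
  hook: HookRegistration,
  defaults: Partial<Record<VoidHookName, number>>,
): number | undefined {
  return hook.timeoutMs ?? defaults[hook.hookName];
}

const unbounded = resolveVoidHookTimeoutMs({ hookName: "before_compaction" }, DEFAULTS_BEFORE_FIX);
const bounded = resolveVoidHookTimeoutMs({ hookName: "agent_end" }, DEFAULTS_BEFORE_FIX);
console.log({ unbounded, bounded });
```

In this model a compaction hook registered without `timeoutMs` resolves to `undefined`, which is the unbounded case the Bug section describes.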
In the codex agent harness these hooks fire on the serialized notification queue: extensions/codex/src/app-server/event-projector.ts handleItemStarted awaits runAgentHarnessBeforeCompactionHook (and the matching after_compaction call site) for a contextCompaction item. A slow or hung compaction hook therefore freezes processing of every later codex notification, including turn/completed, so the whole agent turn hangs.
Fix
Add before_compaction and after_compaction entries to DEFAULT_VOID_HOOK_TIMEOUT_MS_BY_HOOK, mirroring the defensive-default pattern already applied to agent_end. The budget is 30s, matching agent_end (the closest like-for-like precedent: a void lifecycle hook) rather than the tighter 15s modifying-hook defaults, because compaction hooks can legitimately do real work such as a memory flush. The runner is fail-open for void hooks, so a timed-out handler is logged and compaction proceeds without it.
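As a rough illustration of the fail-open behavior, the sketch below bounds a void handler with a timer and logs instead of rethrowing. It is not the runner in src/plugins/hooks.ts: `withTimeout`, `runVoidHookSketch`, and the `log` callback are hypothetical names, and the log wording is only modeled on the evidence quoted later in this PR. The table name and the 30s values are the only parts taken from the description.

```typescript
// Hypothetical names throughout; only the table name and the 30s budget come from the PR text.
type VoidHookName = "agent_end" | "before_compaction" | "after_compaction";

const DEFAULT_VOID_HOOK_TIMEOUT_MS_BY_HOOK: Record<VoidHookName, number> = {
  agent_end: 30_000,
  before_compaction: 30_000,
  after_compaction: 30_000,
};

type VoidHandler = () => Promise<void>;

// Settles with the handler's outcome, or rejects once timeoutMs elapses.
// The timer is cleared as soon as the handler settles.
function withTimeout(work: Promise<void>, timeoutMs: number): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
    work.then(
      () => {
        clearTimeout(timer);
        resolve();
      },
      (err: unknown) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Fail-open: handler errors and timeouts are logged and never rethrown to the caller.
async function runVoidHookSketch(
  hookName: VoidHookName,
  handler: VoidHandler,
  log: (message: string) => void,
  overrideMs?: number,
): Promise<void> {
  const timeoutMs = overrideMs ?? DEFAULT_VOID_HOOK_TIMEOUT_MS_BY_HOOK[hookName];
  try {
    await withTimeout(handler(), timeoutMs);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    log(`[hooks] ${hookName} handler failed: ${reason}`);
  }
}

// Demo with a short override so the sketch finishes quickly instead of waiting 30s.
const neverSettles: VoidHandler = () => new Promise<void>(() => {});
const logged: string[] = [];
void runVoidHookSketch("before_compaction", neverSettles, (m) => { logged.push(m); }, 50)
  .then(() => console.log(logged));
```

The point of the design is that the caller's `await` always resolves: a hung handler costs at most the budget, and the failure surfaces as a log line rather than an exception.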
A plugin can still override the default per-registration via hook.timeoutMs.
Related
Related to #84077, the same compaction-stall investigation. That issue covers the host's missing safety timeout on plugin-owned compact(); this is a separate, independent mechanism (unbounded void hooks freezing the codex notification queue), hence "Related" rather than "Fixes".
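To make the queue-freezing mechanism concrete, here is a small model of a serialized queue in which each task starts only after the previous one settles. It is an assumption-laden sketch, not the codex harness: `SerializedQueue`, `delay`, and `demo` are hypothetical, and the hook budget is shortened so the example finishes quickly.

```typescript
// Hypothetical model of a serialized notification queue; not the codex harness code.
class SerializedQueue {
  private tail: Promise<void> = Promise.resolve();

  // Each task starts only after the previous one has settled.
  enqueue(task: () => Promise<void>): Promise<void> {
    const next = this.tail.then(task);
    // Keep the chain usable even if a task rejects.
    this.tail = next.catch(() => undefined);
    return next;
  }
}

const delay = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function demo(hookBudgetMs: number): Promise<string[]> {
  const processed: string[] = [];
  const queue = new SerializedQueue();

  // A compaction hook that never settles, bounded only by hookBudgetMs.
  const hungHook = new Promise<void>(() => {});
  void queue.enqueue(async () => {
    await Promise.race([hungHook, delay(hookBudgetMs)]);
    processed.push("contextCompaction item handled");
  });

  // turn/completed is queued behind it and cannot run until the first task settles.
  await queue.enqueue(async () => {
    processed.push("turn/completed");
  });
  return processed;
}

void demo(50).then((processed) => console.log(processed));
```

Without the race against `delay`, the first task would never settle and the second would never start, which is the hang described in this PR. With a budget in place, the later notification is delayed by at most that budget.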
Test plan
- pnpm tsgo:core and pnpm tsgo:core:test: typecheck passes
- oxlint on changed files: 0 warnings / 0 errors
- src/plugins/hooks.compaction-timeout.test.ts:
  - before_compaction handler is bounded by the default timeout (no override) and logged, rather than hanging
  - after_compaction
  - before_compaction handler completes without a false-positive timeout
- vitest (plugins project): new tests + hooks.correlation.test.ts + wired-hooks-compaction.test.ts: 15/15 pass
Real behavior proof
Behavior addressed: hung plugin before_compaction and after_compaction handlers no longer stall compaction forever. The hook runner now applies the default fail-open timeout when no per-hook override is configured.
Real environment tested: local OpenClaw checkout /Volumes/LEXAR/repos/openclaw-fix-compaction-hook-timeout, branch fix/compaction-hook-default-timeout, Node via node --import tsx, actual src/plugins/hooks.ts plugin hook runner.
Exact steps or command run after the patch: ran a live script that registers a plugin whose before_compaction and after_compaction hooks never settle, then invokes runHookTimeout for both hooks without per-hook timeout overrides.
Evidence after fix:
{
  "branch": "fix/compaction-hook-default-timeout",
  "behavior": "hung before_compaction and after_compaction hooks return via default fail-open timeout",
  "elapsedMs": 30033,
  "eventCount": 2,
  "events": [
    "[hooks] before_compaction handler from live-proof-plugin failed: timed out after 30000ms",
    "[hooks] after_compaction handler from live-proof-plugin failed: timed out after 30000ms"
  ]
}
Observed result after fix: both hung hook handlers returned through the default timeout path in about 30 seconds each, emitted fail-open timeout events, and did not block the caller indefinitely.
What was not tested: a real external plugin process wedged in production; the proof uses the in-process hook runner with intentionally never-settling handlers.