FEA-1444: Codex hook ingestion + dedup + opt-in toggle by aeyeCEO · Pull Request #259 · closedloop-ai/closedloop-electron

aeyeCEO · 2026-05-29T20:29:33Z

Summary

End-to-end Codex hook ingestion for the Agent Monitor sidecar — three layers in one PR so the feature is user-visible on merge:

Layer 1 (commit 87a601c): Codex hook installer + payload handler + harness stamp. Mirrors the existing Claude hook installer, writes to ~/.codex/hooks.json, ships a small codex-hook-handler.js wrapper that injects __provider: "codex" so the sidecar can stamp harness='codex' on the session row.
Layer 2 (commit 8db7ef9): Event-level dedup against the existing codex-watcher.js rollout-tail importer. Hook events land first (real-time); rollout-tail's ~5s catchup no-ops on the duplicate. Without this, opting into hooks would create 2 rows per event.
Layer 3 (commit 8db7ef9): Renderer opt-in toggle directly below the existing Claude hooks toggle in the Agent Dashboard view + IPC handlers + preload bridge. User can flip it without editing JSON.

See FEA-1444 and PLN-767 v2.

What ships to users

New "Enable Codex CLI session tracking (experimental)" toggle in the Agent Dashboard consent banner, next to the Claude toggle.
When enabled, Codex sessions appear in the dashboard within 1s of SessionStart (vs ~5s typical from rollout-tail) and event timelines update in real time.
When disabled, behavior is identical to before this PR — rollout-tail watcher continues to import Codex sessions on its 5s tick.
Dedup prevents duplicate event rows when both hook stream and rollout-tail are active.

Test plan

pnpm -C apps/desktop typecheck — clean
pnpm -C apps/desktop lint — clean
pnpm -C apps/desktop build:agent-monitor — clean (Layer 1 hard-gates + SQLite gate pass)
Full suite: 1945/1945 pass, 0 fail
New test/agent-session-event-dedup.test.ts — 4/4 pass (sub-second drift, mixed events, error best-effort, empty no-op)
New test/agent-monitor-hooks-core.test.ts (Layer 1) — 10/10 pass
Two independent code-review passes (one per layer round) — all findings addressed in-PR
Manual smoke (recommended before merge): flip toggle on, run a Codex session, confirm: (1) ~/.codex/hooks.json populated, (2) session appears in dashboard with harness=codex instantly, (3) no duplicate event rows after rollout-tail catches up

Risks

Cross-second timestamp drift between hook and rollout-tail can defeat dedup → cosmetic duplicates. False-negative-preferred-over-false-positive tradeoff. Content-hash dedup key is a follow-up if drift duplicates show up in practice.
Codex hook feature flag is server-side. Hooks only fire if user sets [features].codex_hooks = true in ~/.codex/config.toml. The toggle's help text spells this out. Programmatic toggle of that Codex flag is intentionally out of scope (too version-sensitive).
agent-monitor-hooks.ts was split into electron-free agent-monitor-hooks-core.ts + electron-aware shell during Layer 1. This is a refactor of stable code; behavior verified by agent-monitor-wiring-static.test.ts (3 string-match assertions retargeted to the core file).
Version bump 0.15.99 → 0.15.100 (origin/main moved to 0.15.99 from PLN-750 + FEA-1437 while this branch was in development; rebased onto current main and bumped to 0.15.100 to leave room for FEA-1461 PR FEA-1461: Dead-letter agent-session sync on persistent rate_limited #258 at 0.15.99).

Out of scope (documented for follow-ups)

Daniel Levesque's workflow event taxonomy (agent_spawned, review_lane_failed, etc.). His Codex /goal orchestrator's state, not generic.
Loop orchestrator workflow events as first-class session events.
Removing the codex-watcher.js fallback. Stays for users on Codex versions without the experimental hook system.
Programmatic [features].codex_hooks = true toggle in Codex's config.toml.
Live-source badge on session cards distinguishing hook-sourced vs rollout-tail-sourced sessions.

🤖 Generated with Claude Code

- Add codex-hook-handler.js wrapper that mirrors upstream hook-handler.js semantics but injects __provider: "codex" so the sidecar can stamp the session row with harness='codex'. Zero-dep plain JS, fail-silent. - Split agent-monitor-hooks.ts: pure install/uninstall logic moved to new agent-monitor-hooks-core.ts (electron-free, testable under tsx --test). Shell retains electron-store + path resolution + gatewayLog. - Add installCodexHooks / uninstallCodexHooks targeting $CODEX_HOME/hooks.json with the narrower Codex hook event set (SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, Stop). SessionStart uses the "startup|resume" matcher per Codex's reference contract. - Filename-boundary predicate (isClaudeEntry vs isCodexEntry) prevents cross-matching between the two handler families. - New codexOptIn electron-store flag (default false) gates Codex hook installation. Behavior change: the master hooks toggle is unchanged for existing users; Codex hooks only install when both flags are true. Deferred to a follow-up: event-level dedup against the existing codex-watcher rollout-tail path, and the renderer UI to flip codexOptIn. - syncAgentMonitorHooksOnBoot reconciles Codex state with persisted intent (installs when opted in, uninstalls stale entries when opted out) so a failed file-layer mutation is eventually consistent on the next launch. - setAgentMonitorCodexHooksOptIn persists intent before applying side effects so a failure (e.g., locked hooks.json) cannot silently lose the user's opt-out. - build-agent-monitor.mjs gains a surgical patch to the generated server/routes/hooks.js that calls stmts.setSessionHarness when the inbound payload carries __provider: "codex". Patch is hash-anchored and hard-gated; future upstream drift fails the build loudly. - Bump desktop version to 0.15.98 per repo CI rule. Testing: - pnpm -C apps/desktop typecheck: clean - pnpm -C apps/desktop lint: clean - pnpm -C apps/desktop build:agent-monitor: clean (patch + hard-gates pass) - New test/agent-monitor-hooks-core.test.ts: 10/10 pass — covers predicate identity, matcher rules per harness, idempotent install, in-place self-heal of moved handler paths, narrow-scope uninstall, foreign-entry preservation, hooks-block cleanup, and Codex/Claude cross-installation independence. - test/agent-monitor-wiring-static.test.ts: 27/27 pass after retargeting three moved-string assertions to the new core file. - Full desktop test suite: 1825/1827 pass. The 2 remaining failures (python3 version-string parsing, telemetry healthcheck dedupe) are pre-existing flakes in unrelated files I did not touch. - Independent code-reviewer agent surfaced a Medium correctness finding (failed opt-out silently lost) and a Low note (function exported without IPC). Both addressed in this commit; the IPC wiring is an intentional follow-up. Risks: - Codex hooks are gated behind a new opt-in flag that has no UI yet, so default behavior is unchanged for all existing users. Risk surface is limited to anyone who flips the flag via dev console / tests until the UI ships. - The build-script patch depends on patchHooksTranscriptOutsideTx (FEA- 1363) running first; ordering enforced in materializeAgentMonitor and the patch is itself idempotent. A future upstream bump that drops the processEventCore anchor will fail the build hard rather than silently dropping the harness stamp. - The agent-monitor-hooks.ts → agent-monitor-hooks-core.ts split is a refactor of stable code; semantic equivalence verified by the existing wiring-static test plus the new core test.

Layer 2 — Event-level dedup against rollout-tail: - New filterEventsAlreadyCapturedByHooks in codex-import.js. Runs before importSession in importCodexSession; queries existing events for rows matching (session_id, event_type, COALESCE(tool_name, ''), created_at truncated to whole seconds) and drops duplicates. Hook-handler-inserted rows land first (real-time); the ~5s-later rollout-tail no-ops on match. - The dedup statement is cached per-dbModule via WeakMap so a batch of many changed sessions compiles the SQL once, not once per session. - Best-effort: any prepare/query failure is swallowed and the event is kept. False negatives (cosmetic duplicates) are preferred over false positives (silently dropped events). Sub-second timestamp drift handled by the substr(?, 1, 19) match. Layer 3 — Renderer opt-in toggle: - apps/desktop/src/main/app.ts: new IPC handlers desktop:{get,set}-agent-monitor-codex-hooks-opt-in that call the already-exported isAgentMonitorCodexHooksOptIn / setAgentMonitorCodexHooksOptIn. Gated on the master Agent Dashboard flag, matching the Claude pair shape exactly. - apps/desktop/src/main/preload.ts: mirror bridge methods exposing the two channels to the renderer. - apps/desktop/src/renderer/index.html: new #codexDashConsent toggle row directly below the existing #claudeDashConsent. Same visibility semantics (shown only on agent-settings route), same disabled-when- master-off behavior, same hint text reset on master flag re-enable (Codex hint reset now mirrors Claude's after a review-flagged gap). refreshCodexHooksToggle is called when the user toggles the Claude hooks toggle so the sibling's on/off badge stays in sync. Test: - apps/desktop/test/agent-session-event-dedup.test.ts (new): 4 unit tests on the filter against a sandbox SQLite DB. Covers (1) duplicate filter with sub-second drift, (2) non-duplicate survival across mixed events, (3) best-effort behavior on DB errors, (4) empty-events no-op. Uses the conditional-skip pattern (skipReason ? { skip: ... } : undefined) because Node test runner treats skip: null as "skip with no reason" rather than "don't skip" — gotcha learned during debugging. Independent code review (second pass on the cumulative diff) surfaced: - Medium: Codex hint stale after master flag re-enable (Claude hint was reset in the else branch, Codex hint wasn't). Fixed. - Low: Comment inaccuracy about which toggle "unblocks the master flag" in wireClaudeHooksToggle. Rewritten. - Low: Dedup statement prepared once per session inside the batch loop instead of once per process. Cached via WeakMap. All three addressed in this commit. Testing: - pnpm -C apps/desktop typecheck: clean - pnpm -C apps/desktop lint: clean - pnpm -C apps/desktop build:agent-monitor: clean (Layer 1 hard-gates + sidecar SQLite gate continue to pass) - pnpm -C apps/desktop test (full suite): 1945/1945 pass, 0 fail - New agent-session-event-dedup.test.ts: 4/4 pass - agent-monitor-hooks-core.test.ts (from Layer 1): 10/10 pass (no regression) Risks: - Cross-second timestamp drift between hook and rollout-tail can defeat dedup. Documented design tradeoff: false negatives are cosmetic duplicates; false positives would silently drop events. Will revisit with a content-hash dedup key in a follow-up if cosmetic duplication becomes visible in practice. - IPC handler test coverage deferred. Both new handlers are shallow plumbing that call already-tested functions (covered by Layer 1's agent-monitor-hooks-core.test.ts). Typecheck validates the signatures + preload bridge types. - Renderer JS duplicates the Claude/Codex toggle pattern. Per CLAUDE.md this is the kind of duplication worth extracting, but inline renderer JS makes a shared helper awkward; deferred to a separate cleanup ticket if the pattern repeats with a third harness.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8db7ef93b9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-29T20:34:24Z

+  stmt = dbModule.db.prepare(
+    "SELECT 1 FROM events " +
+      "WHERE session_id = ? AND event_type = ? " +
+      "AND COALESCE(tool_name, '') = COALESCE(?, '') " +
+      "AND substr(COALESCE(created_at, ''), 1, 19) = substr(COALESCE(?, ''), 1, 19) " +
+      "LIMIT 1",


Preserve rollout events when the existing row is not a hook duplicate

When Codex hooks are disabled, skipped, or miss one event, this query still treats any existing events row as a hook duplicate because it has no hook-source discriminator. In a live rollout file that is re-imported after a previous poll, a newly appended event with the same session id, event type, tool name, and second as an earlier rollout-imported event is filtered out before importSession() sees it, so rapid repeated tool calls can disappear from the timeline instead of being backfilled by the watcher.

Useful? React with 👍 / 👎.

Addressed in the latest commit on this branch. Gated the filter on whether ~/.codex/hooks.json actually contains our codex-hook-handler.js. When it doesn't, there cannot be any hook-sourced rows in the events table to dedup against — the filter is now a no-op in that case. This eliminates the false-positive vector for the ~99% of users who haven't opted into Codex hooks.

New test "dedup: gate-off when Codex hooks are not installed — filter is a no-op" seeds a row that the unguarded filter would have treated as a hook duplicate, points CODEX_HOME at a directory with no hooks.json, and asserts the incoming rollout-tail event survives.

Known v1 limitation (your cases (b) and (c) — hooks installed but a specific event missed or skipped): rapid same-second same-tool calls in that scenario can still collapse. The proper fix is a source column on the events table (so hook-inserted rows are explicitly discriminable from rollout-tail rows); that's invasive enough to merit its own FEA and will be filed as a follow-up.

chatgpt-codex-connector · 2026-05-29T20:34:24Z

+            hint.textContent = optedIn
+              ? "— on (Codex hooks installed in ~/.codex/hooks.json; also set [features].codex_hooks = true in ~/.codex/config.toml)"
+              : "— off (rollout-tail watcher continues)";


Tell users to trust the installed Codex hook

For current Codex, installing ~/.codex/hooks.json is not sufficient for this non-managed command hook to run: the official hook docs say non-managed command hooks are skipped until reviewed/trusted via /hooks (and hooks is the canonical feature key, with codex_hooks only a deprecated alias). With this hint, a user can turn the toggle on and follow the displayed config advice but still get no real-time hook ingestion, silently falling back to rollout-tail.

Useful? React with 👍 / 👎.

Fixed. Updated both the tooltip and the on/off hint text in index.html:

Feature key changed from [features].codex_hooks = true to the canonical [features].hooks = true.

The on-state hint now spells out the second requirement: "finish Codex setup: [features].hooks=true in ~/.codex/config.toml + run /hooks in Codex to trust the command".

The tooltip on the toggle itself mirrors the same two requirements.

Without these steps spelled out, a user enabling the toggle could legitimately think they were done and silently fall back to the rollout-tail watcher instead of getting real-time hook ingestion.

thadeusb

I have enough to write the review. Let me confirm my two findings are solid:

Codex P2 (dedup false-positive on rapid same-second events): - The codex-import.js filter had no hook-source discriminator: any existing events row matching (session_id, event_type, tool_name, created_at-truncated-to-second) was treated as a hook duplicate. For users who haven't opted into Codex hooks (~99% of installs), the row always came from a prior rollout-tail import, so the filter was silently dropping legitimate distinct events on re-imports of a changed rollout file. - Gate the filter on whether ~/.codex/hooks.json actually contains our codex-hook-handler.js. When it doesn't, there can't be any hook-sourced rows to dedup against, so the filter is a no-op. The detection result is cached at module level and exposed via resetCodexHooksInstalledCache for test setup. - Known v1 limitation: when hooks ARE installed but miss/skip a specific event (the reviewer's case (b)/(c)), rapid same-second same-tool calls in that session can still collapse. Tracked as a follow-up FEA for a real `source` column on the events table. Codex P2 (Codex hook config hint accuracy): - Codex's canonical feature flag is `[features].hooks`, not `[features].codex_hooks` (which is a deprecated alias). - For non-managed command hooks, Codex requires the user to trust the installed command via `/hooks` before they fire — without that, the install in ~/.codex/hooks.json is silently skipped and the user sees no real-time ingestion. - Hint text + tooltip updated to spell out both requirements so a user flipping the toggle doesn't end up in a silent-fallback state. New test: - agent-session-event-dedup.test.ts: "gate-off when Codex hooks are not installed — filter is a no-op". Seeds a row that would normally be treated as a hook duplicate, points CODEX_HOME at a directory with no hooks.json, and asserts the incoming rollout-tail event survives. Bump desktop version 0.15.100 → 0.15.102 (0.15.101 reserved for FEA-1461 PR #258). Testing: - pnpm -C apps/desktop typecheck: clean - pnpm -C apps/desktop lint: clean - pnpm -C apps/desktop build:agent-monitor: clean - agent-session-event-dedup.test.ts: 5/5 pass (was 4 before; new gate-off coverage) - Full suite: 1842 + 104 = 1946/1946 pass, 0 fail Risks: - Install detection cached at module load — a user opting into Codex hooks AFTER the sidecar started won't trigger dedup until the next app boot. Acceptable for v1 (the sidecar restarts on toggle change via agent-monitor-sidecar.ts anyway, so the cache is refreshed). - The install probe reads ~/.codex/hooks.json with a substring match for "codex-hook-handler.js". A user who manually edits hooks.json to a different filename containing that substring would trip the gate as a false positive. Not a realistic risk; substring match is intentionally lenient to handle both userData-relative and resources-relative paths.

origin/main moved to 0.15.101 while this branch was open; the version-bump-check on the previous push tripped because the branch was at the same 0.15.101. Bumping to 0.15.103 — 0.15.102 is held by FEA-1444 PR #259.

aeyeCEO · 2026-05-30T19:00:33Z

@shafty023 — passing this over to you for review and merge. Some extra context, since you're the original inspiration for the feature:

Background. PR #259 lands Codex hook ingestion into the Agent Monitor sidecar — the idea we borrowed from your closedloop-ai/workflow repo (telemetry/codex-hook-writer.js + telemetry/identifier-events.ts). Three layers in one PR so the feature is end-to-end user-visible on merge: hook installer + payload handler + harness stamp (Layer 1), event-level dedup against the existing rollout-tail importer (Layer 2), and a renderer opt-in toggle alongside the Claude hooks toggle (Layer 3).

Why you're the right reviewer. You wrote the upstream Codex hook code we adapted from, and you're the top contributor to closedloop-electron — so you have ground truth on both ends of this. Codex flagged two P2s on the latest commit; both are addressed in fabf164.

What's in the latest push.

P2 (Codex): dedup had no hook-source discriminator — gated the filter on actual install detection (~/.codex/hooks.json containing our handler). Documented v1 limitation for hooks-installed-but-missed events; tracked for a follow-up source column FEA.
P2 (Codex): hint text said [features].codex_hooks and missed the /hooks trust step — updated to canonical [features].hooks and spelled out the /hooks slash command requirement on both the tooltip and the on-state hint.

Handoff terms. Please review, fix anything you want directly on the branch, and merge yourself when you're satisfied. Don't pass back to me — I'm explicitly handing you merge authority on this one. If you see something that affects the broader telemetry direction (e.g., we should be doing the dedup the way your workflow repo does it), feel free to restructure; you have better context on that than I do. Slack me if you hit something that needs a product decision.

CI for the latest push is still propagating; should be green by the time you look. Thanks for taking it.

Andrew Eye added 2 commits May 29, 2026 15:07

aeyeCEO requested a review from a team May 29, 2026 20:29

chatgpt-codex-connector Bot reviewed May 29, 2026

View reviewed changes

thadeusb reviewed May 29, 2026

View reviewed changes

aeyeCEO requested a review from shafty023 May 30, 2026 19:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FEA-1444: Codex hook ingestion + dedup + opt-in toggle#259

FEA-1444: Codex hook ingestion + dedup + opt-in toggle#259
aeyeCEO wants to merge 3 commits into
mainfrom
FEA-1444-codex-hooks

aeyeCEO commented May 29, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Uh oh!

aeyeCEO May 30, 2026

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Uh oh!

aeyeCEO May 30, 2026

Uh oh!

thadeusb left a comment

Uh oh!

aeyeCEO commented May 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

aeyeCEO commented May 29, 2026

Summary

What ships to users

Test plan

Risks

Out of scope (documented for follow-ups)

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

aeyeCEO May 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

aeyeCEO May 30, 2026

Choose a reason for hiding this comment

Uh oh!

thadeusb left a comment

Choose a reason for hiding this comment

Uh oh!

aeyeCEO commented May 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants