feat: token cost tracking, data retention, domain verification & refactors #70
duyetbot wants to merge 12 commits into duyet:main
Conversation
Add per-message cost fields (model, input_tokens, output_tokens, cost_microdollars) and conversation-level denormalized aggregates (total_cost_microdollars, total_tokens). All fields are optional for full backward compatibility.

- Drizzle migration 0005 adds 6 new columns
- V1 and V2 message creation accepts cost fields
- Conversation aggregates auto-increment on message insert
- Analytics summary includes total_cost_microdollars
- Timeseries supports new "cost" metric
- Tests cover create, append, and backward compat scenarios

Co-Authored-By: Duyet Le <me@duyet.net>
Co-Authored-By: duyetbot <bot@duyet.net>
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
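The aggregate bookkeeping described above can be sketched in a few lines. Field names follow the PR description, but the function itself is illustrative: the real code would apply this inside the message-insert transaction as an atomic SQL UPDATE rather than in memory.

```typescript
// Illustrative only: shows how per-message cost fields roll up into the
// conversation-level denormalized aggregates. All message cost fields
// are optional, matching the backward-compatibility requirement.
type ConversationAgg = { total_cost_microdollars: number; total_tokens: number };
type MessageCost = { input_tokens?: number; output_tokens?: number; cost_microdollars?: number };

function applyMessageCost(agg: ConversationAgg, msg: MessageCost): ConversationAgg {
  return {
    total_cost_microdollars: agg.total_cost_microdollars + (msg.cost_microdollars ?? 0),
    total_tokens: agg.total_tokens + (msg.input_tokens ?? 0) + (msg.output_tokens ?? 0),
  };
}
```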
…analytics

Add cost columns to the conversation data tables (project + conversations pages), a cost summary card to the analytics dashboard, and a cost trend chart. A shared formatCostMicrodollars utility handles spec-compliant formatting (<$0.01, $X.XX, $XXX). Updates shared types to include cost fields across the API contract.

Co-Authored-By: Duyet Le <me@duyet.net>
Co-Authored-By: duyetbot <bot@duyet.net>
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
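A sketch of the formatting spec mentioned above (<$0.01, $X.XX, $XXX), assuming "microdollars" means millionths of a dollar. The exact thresholds, such as switching to whole dollars at $100, are assumptions here rather than the PR's actual code:

```typescript
// Hypothetical sketch of formatCostMicrodollars: 1,000,000 microdollars = $1.
// Values under one cent render as "<$0.01", mid-range values with two
// decimals, large values as whole dollars. The $100 cutoff is assumed.
function formatCostMicrodollars(micro: number): string {
  const dollars = micro / 1_000_000;
  if (dollars === 0) return "$0.00";
  if (dollars < 0.01) return "<$0.01";
  if (dollars < 100) return `$${dollars.toFixed(2)}`;
  return `$${Math.round(dollars)}`;
}
```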
Replace the auto-verify stub with real verification using three methods run in parallel: DNS TXT record via Cloudflare DoH, HTTP well-known file check, and HTML meta tag parsing. The first success verifies the domain; if all three fail, it is marked as failed.

Co-Authored-By: Duyet Le <me@duyet.net>
Co-Authored-By: duyetbot <bot@duyet.net>
Co-Authored-By: Paperclip <noreply@paperclip.ing>
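The first-success-wins pattern described here can be sketched as follows. The function and result shapes are stand-ins based on the description; the real checks would query DNS-over-HTTPS and the origin server:

```typescript
type CheckResult = { method: string; success: boolean; error?: string };

// Illustrative sketch: run all checks in parallel, never let one rejection
// sink the others, and report verified if any check succeeded.
async function checkDomainVerification(
  checks: Array<() => Promise<CheckResult>>,
): Promise<{ verified: boolean; results: CheckResult[] }> {
  const results = await Promise.all(
    checks.map((c) =>
      c().catch((e): CheckResult => ({ method: "unknown", success: false, error: String(e) })),
    ),
  );
  return { verified: results.some((r) => r.success), results };
}
```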
…anup

Projects can now set retention_days (1-3650) to auto-delete expired conversations. A Cloudflare Cron Trigger runs daily at 3 AM UTC, batch-deleting in the order tags → messages → conversations to respect FK constraints. A 25-second time budget prevents Worker timeout. Dashboard settings UI includes validation and a confirmation workflow.

Co-Authored-By: Duyet Le <me@duyet.net>
Co-Authored-By: duyetbot <bot@duyet.net>
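The 25-second budget can be enforced with a simple loop. This is an illustrative shape rather than the PR's implementation, with deleteBatch standing in for the tags → messages → conversations deletes:

```typescript
// Time-budgeted cleanup sketch: keep deleting batches until either the
// budget is exhausted or a batch deletes nothing (all expired rows gone).
// The `now` parameter is injectable so the loop is testable without clocks.
async function runWithBudget(
  deleteBatch: () => Promise<number>, // returns rows deleted this batch
  budgetMs = 25_000,
  now: () => number = Date.now,
): Promise<number> {
  const start = now();
  let total = 0;
  while (now() - start < budgetMs) {
    const deleted = await deleteBatch();
    total += deleted;
    if (deleted === 0) break; // nothing left to delete
  }
  return total;
}
```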
Move bulk delete and export logic from routes into a dedicated service file. Fix FK constraint order: conversation_tags → messages → conversations. Export uses batch queries to avoid N+1 and selects only relevant fields.

Co-Authored-By: Duyet Le <me@duyet.net>
Co-Authored-By: duyetbot <bot@duyet.net>
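The FK-safe ordering can be made explicit in a small helper. The delete functions here are hypothetical stand-ins for the Drizzle deletes on conversation_tags, messages, and conversations:

```typescript
// Illustrative sketch: child tables must be cleared before their parents,
// otherwise foreign-key constraints reject the conversation delete.
async function deleteConversationsCascade(
  ids: string[],
  del: {
    tags: (ids: string[]) => Promise<void>;
    messages: (ids: string[]) => Promise<void>;
    conversations: (ids: string[]) => Promise<void>;
  },
): Promise<void> {
  if (ids.length === 0) return;
  await del.tags(ids);          // conversation_tags references conversations
  await del.messages(ids);      // messages references conversations
  await del.conversations(ids); // parent rows last
}
```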
Extract conversations-header and conversations-content into separate files for better separation of concerns and reusability.

Co-Authored-By: Duyet Le <me@duyet.net>
Co-Authored-By: duyetbot <bot@duyet.net>
Consolidate repeated node patterns into rectNode/dbNode helpers and organize path constants. No visual changes.

Co-Authored-By: Duyet Le <me@duyet.net>
Co-Authored-By: duyetbot <bot@duyet.net>
Fix version mismatch between esbuild host (0.27.4) and binary (0.19.12) that was preventing vitest from starting.

Co-Authored-By: Duyet Le <me@duyet.net>
Co-Authored-By: duyetbot <bot@duyet.net>
📝 Walkthrough

Adds per-message token/cost columns and conversation aggregates, project retention with scheduled cleanup, domain verification with real network checks, bulk export/delete services, semantic search/embeddings, dashboard UI for cost/retention, shared type updates, and tests for retention and token-cost flows.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant DomainsService as Domains Service
    participant Verifier as Verification Module
    participant DNS as DNS-over-HTTPS
    participant HTTP as Origin Server
    participant Site as Site HTML
    Client->>DomainsService: verifyDomain(domain, token)
    DomainsService->>Verifier: checkDomainVerification(domain, token)
    par DNS TXT check
        Verifier->>DNS: fetch TXT _agentstate.{domain} (timeout 5s)
        DNS-->>Verifier: TXT value / error
    and HTTP file check
        Verifier->>HTTP: GET https://{domain}/.well-known/agentstate-{token} (no redirects, timeout 5s)
        HTTP-->>Verifier: body / error
    and Meta tag check
        Verifier->>Site: GET https://{domain}/ (no redirects, timeout 5s)
        Site-->>Verifier: HTML / error
    end
    Verifier-->>DomainsService: { verified: bool, results: [...] }
    DomainsService->>DB: conditional UPDATE (only if status in ["pending","failed"])
    DomainsService-->>Client: persisted verification status & verified_at
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Code Review
This pull request introduces token cost tracking and automated data retention features. It adds database columns for tracking model usage, input/output tokens, and costs at both message and conversation levels. A new scheduled worker task handles the cleanup of expired conversations based on project-specific retention settings. Additionally, the PR implements domain ownership verification via DNS TXT records, HTTP files, and HTML meta tags. Feedback was provided to simplify the meta tag regex logic by combining duplicated patterns.
```typescript
const pattern =
  /<meta\s+(?:[^>]*?\s)?name\s*=\s*["']agentstate-verification["']\s+(?:[^>]*?\s)?content\s*=\s*["']([^"']+)["'][^>]*>/i;
const match = html.match(pattern);

if (!match) {
  // Try reversed attribute order: content before name
  const reversedPattern =
    /<meta\s+(?:[^>]*?\s)?content\s*=\s*["']([^"']+)["']\s+(?:[^>]*?\s)?name\s*=\s*["']agentstate-verification["'][^>]*>/i;
  const reversedMatch = html.match(reversedPattern);

  if (!reversedMatch) {
    return { method: "meta_tag", success: false, error: "Meta tag not found" };
  }

  const success = reversedMatch[1] === expectedToken;
  return { method: "meta_tag", success, error: success ? undefined : "Token mismatch in meta tag" };
}

const success = match[1] === expectedToken;
return { method: "meta_tag", success, error: success ? undefined : "Token mismatch in meta tag" };
```
There was a problem hiding this comment.
The logic for matching the meta tag with attributes in different orders is duplicated. You can combine the two regular expressions into one with an alternation (`|`) and check for the captured content in either capture group. This simplifies the code and makes it more maintainable.
```typescript
const pattern =
  /<meta\s+(?:[^>]*?\s)?name\s*=\s*["']agentstate-verification["']\s+(?:[^>]*?\s)?content\s*=\s*["']([^"']+)["'][^>]*>|<meta\s+(?:[^>]*?\s)?content\s*=\s*["']([^"']+)["']\s+(?:[^>]*?\s)?name\s*=\s*["']agentstate-verification["'][^>]*>/i;
const match = html.match(pattern);
if (!match) {
  return { method: "meta_tag", success: false, error: "Meta tag not found" };
}
const content = match[1] || match[2];
const success = content === expectedToken;
return { method: "meta_tag", success, error: success ? undefined : "Token mismatch in meta tag" };
```
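The combined pattern can be exercised standalone against both attribute orders; extractToken below is a hypothetical test harness around the suggested regex, not code from the PR. In an alternation, only the matching branch's capture group is defined, so the result is read from whichever group matched:

```typescript
// The suggested combined pattern: name-then-content | content-then-name.
const combined =
  /<meta\s+(?:[^>]*?\s)?name\s*=\s*["']agentstate-verification["']\s+(?:[^>]*?\s)?content\s*=\s*["']([^"']+)["'][^>]*>|<meta\s+(?:[^>]*?\s)?content\s*=\s*["']([^"']+)["']\s+(?:[^>]*?\s)?name\s*=\s*["']agentstate-verification["'][^>]*>/i;

// Harness: return the captured token (from either group) or undefined.
function extractToken(html: string): string | undefined {
  const m = html.match(combined);
  return m ? (m[1] ?? m[2]) : undefined;
}
```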
Actionable comments posted: 16
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
packages/dashboard/src/app/dashboard/project/_data-tab.tsx (1)
40-45: ⚠️ Potential issue | 🔴 Critical: Fix the type mismatch causing the pipeline failure.

The `{ key: "expand", label: "" }` object literal has its `key` inferred as `string` rather than `ColumnKey`, causing the TypeScript error shown in the pipeline. Use `as const` to narrow the literal type.

🔧 Proposed fix

```diff
 const columns = [
-  { key: "expand", label: "" },
+  { key: "expand" as const, label: "" },
   ...allColumns
     .filter((c) => visibleCols.includes(c.key))
     .map((col) => ({ key: col.key, label: col.label })),
 ];
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/dashboard/src/app/dashboard/project/_data-tab.tsx` around lines 40-45: the pipeline fails because the literal { key: "expand", label: "" } is inferred as a plain string key instead of the ColumnKey union; update the literal so its key is narrowed to ColumnKey, e.g. replace the inline object with a typed narrow form like { key: "expand" as ColumnKey, label: "" } (or use { key: "expand", label: "" } as const and then cast to the expected Column type) before spreading with allColumns/visibleCols in the columns array.

packages/api/src/services/v2-projects.ts (1)
382-403: ⚠️ Potential issue | 🟠 Major: Preserve an explicit `null` in the update response.

If the caller clears retention with `retention_days: null`, Line 403 falls back to `existing.retentionDays` because `??` treats the persisted `null` as missing. The update succeeds, but the response reports the stale value.

Suggested fix

```diff
-      retention_days: updated?.retentionDays ?? existing.retentionDays ?? null,
+      retention_days:
+        updated?.retentionDays === null
+          ? null
+          : updated?.retentionDays ?? existing.retentionDays ?? null,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/api/src/services/v2-projects.ts` around lines 382 - 403, The response currently uses the nullish coalescing operator which treats a persisted null as missing; update the retention_days expression to detect undefined rather than null. Replace the existing expression using updated?.retentionDays ?? existing.retentionDays ?? null with a conditional that checks for undefined, e.g. retention_days: updated?.retentionDays !== undefined ? updated.retentionDays : existing.retentionDays ?? null, so an explicit null from updated is preserved in the response (referencing the variables updated, existing, and retentionDays).
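The distinction this finding relies on is easy to demonstrate in isolation: `??` falls back on both null and undefined, while an explicit undefined check preserves a deliberate null. These helpers are illustrative, not the PR's code:

```typescript
// `??` treats null and undefined identically, so a cleared (null) value
// is silently replaced by the stale existing value.
function retainedWithCoalesce(updated: number | null | undefined, existing: number | null) {
  return updated ?? existing ?? null;
}

// Checking for undefined specifically keeps an explicit null intact.
function retainedWithCheck(updated: number | null | undefined, existing: number | null) {
  return updated !== undefined ? updated : existing ?? null;
}
```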
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/api/drizzle/0006_retention_days.sql`:
- Line 1: The migration adds projects.retention_days without a constraint,
allowing values outside 1–3650; modify the migration to create/alter the
projects table to include a CHECK constraint enforcing retention_days BETWEEN 1
AND 3650 (e.g., add CHECK (retention_days >= 1 AND retention_days <= 3650) with
a clear constraint name like chk_projects_retention_days_range), and handle
existing NULLs/invalid rows beforehand (backfill or set a safe default) so the
constraint can be applied without failing.
In `@packages/api/drizzle/meta/_journal.json`:
- Around line 41-45: The migration journal (_journal.json) is missing an entry
for the new migration 0006_retention_days referenced by 0006_retention_days.sql;
update the journal so SQL and metadata stay in sync by regenerating migrations
metadata from the packages/api/drizzle/ directory or manually add an entry with
idx 6 and tag "0006_retention_days" (matching the existing shape that includes
version, when, tag, and breakpoints similar to the existing
"0005_token_cost_tracking" entry).
In `@packages/api/package.json`:
- Around line 22-24: Replace the platform-specific devDependency
"@rollup/rollup-linux-x64-gnu" with the generic "rollup" package in package.json
devDependencies: remove the "@rollup/rollup-linux-x64-gnu" entry and add
"rollup" (e.g. version "^4.60.0") so Rollup's platform-specific optional
binaries resolve correctly across macOS/Windows/Linux; ensure package.json keys
match "rollup" and run your package manager install to update lockfile.
In `@packages/api/src/db/schema.ts`:
- Line 34: The retention_days column currently defined as retentionDays:
integer("retention_days") needs a DB-level CHECK enforcing 1..3650 to prevent
invalid writes; update packages/api/src/db/schema.ts to attach a check
constraint to the column/table (e.g., a CHECK (retention_days BETWEEN 1 AND
3650) via the schema builder or column.check/constraint API) and add/emit a
corresponding migration so the database enforces the rule for all write paths;
ensure the constraint name is unique and reference retentionDays/retention_days
in the change.
In `@packages/api/src/lib/domain-verification.ts`:
- Around line 54-57: Add a hard guard in the verification entrypoints
(verifyDnsTxt, verifyTxtRecord, verifyHttpTxt, verifyHttpFile) to reject
single-label hosts and IP literals before performing any network ops: detect
hostnames without a dot or purely numeric IP strings and throw/return a
validation error; additionally perform a DNS resolution for A/AAAA records and
reject any resolved addresses that are private/reserved (RFC1918, RFC6598,
link-local, multicast, loopback, etc.) so the subsequent HTTP/DNS checks never
target non-public addresses. Ensure the checks run synchronously at the top of
each function (before HTTP fetches or TXT queries) and return the same
VerificationCheckResult/error shape used elsewhere.
In `@packages/api/src/routes/v2/projects/crud.ts`:
- Around line 35-38: The UpdateProjectSchema currently allows an empty object so
a PATCH with {} bypasses validation; update the schema (UpdateProjectSchema) to
require at least one present field by adding a refinement (e.g., .refine or
.superRefine) that checks Object.keys(data).length > 0 and returns a clear error
message like "At least one field is required", and also add a runtime guard
before calling updateProject that rejects empty payloads (check if
Object.keys(data).length === 0) to avoid passing a no-op update to
updateProject; reference the UpdateProjectSchema and the updateProject call to
locate where to apply these changes.
In `@packages/api/src/scheduled.ts`:
- Around line 5-21: Wrap the call to cleanupExpiredConversations inside
onScheduled in a try-catch: call drizzle(env.DB) and then try awaiting
cleanupExpiredConversations(db, env.DB); on success keep the existing structured
console.log; on error catch and emit a structured error log (include event.cron,
any inputs like env.DB identifier, and error.stack or message) using
console.error (or your app logger) for observability, then rethrow or return a
rejected promise so the runtime knows the job failed.
In `@packages/api/src/services/bulk.ts`:
- Around line 111-120: The fetch currently truncates combined messages with
.limit(5000) on the allMsgs query (convIds, messages, MESSAGE_EXPORT_FIELDS),
causing incomplete exports; remove the hard 5000 cap and implement proper paging
or explicit failure: either iterate over convIds and fetch messages per
conversation in paged queries (limit/offset or cursor) aggregating results and
accurate message_count, or run batched queries by convIds in slices and/or a
COUNT(*) per conversation to detect when total messages exceed an export-safe
threshold and return a clear error instead of partial data; apply the same fix
where the other message fetch block (lines referenced 138-149) uses a 5000
limit.
- Around line 46-63: CONVERSATION_EXPORT_FIELDS and MESSAGE_EXPORT_FIELDS are
currently omitting persisted columns (external_id, token_count, message.metadata
and the new accounting columns) and later replacements are emitting null/0,
which drops real data; update the field maps so each export key maps to the
actual DB column expressions (e.g., include conversations.externalId,
conversations.tokenCount, messages.metadata, messages.tokenCount and any
accounting columns by their column identifiers) instead of hardcoded null/0, and
remove the code paths that overwrite them with constants so the export
serializes the real stored values for CONVERSATION_EXPORT_FIELDS and
MESSAGE_EXPORT_FIELDS (also apply the same fixes where these constants are used
around the other blocks referenced).
- Line 3: The bulk delete sequence deletes messages and conversations
(references: messages, conversations) but never clears conversation_tags,
causing FK failures; update the deletion order to first remove conversation_tags
rows for the target conversation IDs (delete from conversation_tags where
conversation_id IN ...), then delete messages, then delete conversations,
ensuring the code that currently calls messages.delete and conversations.delete
is preceded by a conversation_tags deletion step (use the existing DB/table
symbol for conversation_tags).
In `@packages/api/src/services/domains.ts`:
- Around line 314-327: verifyDomain() currently updates customDomains by id
unconditionally which lets a slow failure overwrite a concurrent successful
verification; change the failure-path update to include the prior-state guard
(e.g., in the db.update(...).where(...) add an extra condition on
customDomains.verificationStatus to only update when still in the non-verified
expected state such as 'pending' or not 'verified'), and after performing the
update re-query the row (select from customDomains by domainId) and return the
freshly read row fields (verification_status / verified_at) instead of the
pre-read domain object to ensure the response reflects the final persisted
state.
In `@packages/api/src/services/retention.ts`:
- Around line 25-28: The function cleanupExpiredConversations currently declares
an unused parameter _d1Db of type D1Database; remove this parameter from the
function signature and all call sites (and any corresponding imports or type
references) to clean the interface, or if you intend to keep it for future use,
add a concise comment above the parameter or the function noting it's reserved
for future D1-specific operations (e.g., "// reserved for future D1Database
use") and keep the underscore to signal intentional unused status; update
tests/consumers accordingly to match the new signature.
- Around line 99-104: The results.push currently uses the global start variable
so durationMs measures cumulative time; before processing each project (the loop
handling project.id and computing totalDeleted and retentionDays) record a
per-project start timestamp (e.g., projectStart = Date.now()) and then set
durationMs to Date.now() - projectStart in the results.push call so each entry
reflects that project's elapsed processing time rather than the global start
time.
In `@packages/api/test/retention.test.ts`:
- Around line 88-92: The retention tests are relying on shared DB state seeded
in beforeAll (applyMigrations and seed) which makes cases order-dependent;
change the suite so each test is independent by resetting and reseeding the DB
in a beforeEach (call the same applyMigrations/seed helpers or use a
clearDB+seed helper) or by making each test create and use unique fixtures
within the test itself (so no test reads rows left by another). Update the
describe("Retention Service") suite to replace the shared beforeAll seed with
per-test setup/teardown (or explicit unique data creation) and remove
assumptions that rows from other tests exist (the tests that currently expect
leftover rows should instead create the exact rows they assert against).
In `@packages/dashboard/src/app/dashboard/project/_retention-settings.tsx`:
- Around line 25-32: handleSave currently converts retentionDays to a Number and
only checks range, which allows fractional days; update the validation in
handleSave to also require an integer by adding a Number.isInteger(value) check
(i.e., treat non-integer values as invalid) and show the same toast.error
message when the value is not an integer, null/empty remains allowed as
infinite.
In
`@packages/dashboard/src/app/dashboard/project/data-tab/conversations-header.tsx`:
- Line 2: Remove the unused import ColumnKey from conversations-header.tsx:
locate the import statement "import type { ColumnKey } from \"../_types\";" and
delete it so the file no longer imports this unused type; ensure no other
references to ColumnKey remain in the file before committing.
---
Outside diff comments:
In `@packages/api/src/services/v2-projects.ts`:
- Around line 382-403: The response currently uses the nullish coalescing
operator which treats a persisted null as missing; update the retention_days
expression to detect undefined rather than null. Replace the existing expression
using updated?.retentionDays ?? existing.retentionDays ?? null with a
conditional that checks for undefined, e.g. retention_days:
updated?.retentionDays !== undefined ? updated.retentionDays :
existing.retentionDays ?? null, so an explicit null from updated is preserved in
the response (referencing the variables updated, existing, and retentionDays).
In `@packages/dashboard/src/app/dashboard/project/_data-tab.tsx`:
- Around line 40-45: The pipeline fails because the literal { key: "expand",
label: "" } is inferred as a plain string key instead of the ColumnKey union;
update the literal so its key is narrowed to ColumnKey—e.g. replace the inline
object with a typed narrow form like { key: "expand" as ColumnKey, label: "" }
(or use { key: "expand", label: "" } as const and then cast to the expected
Column type) before spreading with allColumns/visibleCols in the columns array
to fix the type mismatch in columns.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1e27dc5f-a549-4dc9-893c-e0524a782a39
⛔ Files ignored due to path filters (1)

- `bun.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (49)

- packages/api/drizzle/0005_token_cost_tracking.sql
- packages/api/drizzle/0006_retention_days.sql
- packages/api/drizzle/meta/_journal.json
- packages/api/package.json
- packages/api/src/db/schema.ts
- packages/api/src/index.ts
- packages/api/src/lib/domain-verification.ts
- packages/api/src/lib/helpers.ts
- packages/api/src/lib/serialization.ts
- packages/api/src/lib/validation.ts
- packages/api/src/routes/analytics.ts
- packages/api/src/routes/conversations/bulk.ts
- packages/api/src/routes/conversations/crud.ts
- packages/api/src/routes/v2/projects/crud.ts
- packages/api/src/scheduled.ts
- packages/api/src/services/analytics.ts
- packages/api/src/services/bulk.ts
- packages/api/src/services/conversations.ts
- packages/api/src/services/domains.ts
- packages/api/src/services/messages.ts
- packages/api/src/services/projects.ts
- packages/api/src/services/retention.ts
- packages/api/src/services/v2-conversations.ts
- packages/api/src/services/v2-projects.ts
- packages/api/test/retention.test.ts
- packages/api/test/setup.ts
- packages/api/test/v2-conversations.test.ts
- packages/api/wrangler.jsonc
- packages/dashboard/src/app/dashboard/analytics/_analytics-content.tsx
- packages/dashboard/src/app/dashboard/conversations/_conversation-row.tsx
- packages/dashboard/src/app/dashboard/conversations/_table-columns.tsx
- packages/dashboard/src/app/dashboard/project/_components.tsx
- packages/dashboard/src/app/dashboard/project/_conversation-cell-renderers.tsx
- packages/dashboard/src/app/dashboard/project/_conversation-columns.ts
- packages/dashboard/src/app/dashboard/project/_data-tab-state.ts
- packages/dashboard/src/app/dashboard/project/_data-tab.tsx
- packages/dashboard/src/app/dashboard/project/_project-settings.tsx
- packages/dashboard/src/app/dashboard/project/_retention-settings.tsx
- packages/dashboard/src/app/dashboard/project/_types.ts
- packages/dashboard/src/app/dashboard/project/data-tab/conversations-content.tsx
- packages/dashboard/src/app/dashboard/project/data-tab/conversations-header.tsx
- packages/dashboard/src/app/dashboard/project/data-tab/index.ts
- packages/dashboard/src/app/dashboard/project/page.tsx
- packages/dashboard/src/components/analytics/area-chart.tsx
- packages/dashboard/src/components/analytics/summary-cards.tsx
- packages/dashboard/src/components/landing/_hero-svg-factories.tsx
- packages/dashboard/src/lib/constants.ts
- packages/dashboard/src/lib/format-cost.ts
- packages/shared/src/index.ts
`packages/api/drizzle/0006_retention_days.sql` (new file):

```sql
ALTER TABLE `projects` ADD `retention_days` integer;
```
Enforce retention bounds at the database layer.
retention_days is added without a CHECK, so out-of-range values can be stored despite the 1–3650 requirement.
Proposed fix
```diff
-ALTER TABLE `projects` ADD `retention_days` integer;
+ALTER TABLE `projects`
+ADD `retention_days` integer
+CHECK (`retention_days` IS NULL OR `retention_days` BETWEEN 1 AND 3650);
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```sql
ALTER TABLE `projects` ADD `retention_days` integer CHECK (`retention_days` IS NULL OR `retention_days` BETWEEN 1 AND 3650);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/api/drizzle/0006_retention_days.sql` at line 1, The migration adds
projects.retention_days without a constraint, allowing values outside 1–3650;
modify the migration to create/alter the projects table to include a CHECK
constraint enforcing retention_days BETWEEN 1 AND 3650 (e.g., add CHECK
(retention_days >= 1 AND retention_days <= 3650) with a clear constraint name
like chk_projects_retention_days_range), and handle existing NULLs/invalid rows
beforehand (backfill or set a safe default) so the constraint can be applied
without failing.
`packages/api/drizzle/meta/_journal.json`:

```json
{
  "idx": 5,
  "version": "6",
  "when": 1774764000000,
  "tag": "0005_token_cost_tracking",
  "breakpoints": true
}
```
Migration journal is missing the 0006_retention_days entry.
This PR includes 0006_retention_days.sql, but _journal.json only adds 0005. Please regenerate migrations metadata (or add idx 6) so the journal and SQL files stay in sync.
Expected journal shape (illustrative)
```diff
   {
     "idx": 5,
     "version": "6",
     "when": 1774764000000,
     "tag": "0005_token_cost_tracking",
     "breakpoints": true
+  },
+  {
+    "idx": 6,
+    "version": "6",
+    "when": <generated_timestamp>,
+    "tag": "0006_retention_days",
+    "breakpoints": true
   }
```

As per coding guidelines, "Commit generated SQL migrations from 'packages/api/drizzle/' directory to git".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/api/drizzle/meta/_journal.json` around lines 41 - 45, The migration
journal (_journal.json) is missing an entry for the new migration
0006_retention_days referenced by 0006_retention_days.sql; update the journal so
SQL and metadata stay in sync by regenerating migrations metadata from the
packages/api/drizzle/ directory or manually add an entry with idx 6 and tag
"0006_retention_days" (matching the existing shape that includes version, when,
tag, and breakpoints similar to the existing "0005_token_cost_tracking" entry).
`packages/api/package.json`:

```json
"@rollup/rollup-linux-x64-gnu": "^4.60.0",
"drizzle-kit": "^0.30.0",
"esbuild": "0.27.4",
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Verify platform constraints and whether current deps are OS-specific.
npm view @rollup/rollup-linux-x64-gnu os cpu optionalDependencies
npm view rollup optionalDependencies
```

Repository: duyet/agentstate
Length of output: 1258

🏁 Script executed:

```shell
cd packages/api && cat -n package.json | head -40
```

Repository: duyet/agentstate
Length of output: 1082
Remove platform-specific Rollup binary from devDependencies.
@rollup/rollup-linux-x64-gnu is restricted to Linux x64 systems and will cause dependency resolution failures on macOS and Windows developer machines. Replace with rollup, which resolves the appropriate platform binary via its optional dependencies.
Proposed fix
```diff
-    "@rollup/rollup-linux-x64-gnu": "^4.60.0",
+    "rollup": "^4.60.0",
```

📝 Committable suggestion
```diff
-"@rollup/rollup-linux-x64-gnu": "^4.60.0",
+"rollup": "^4.60.0",
 "drizzle-kit": "^0.30.0",
 "esbuild": "0.27.4",
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/api/package.json` around lines 22 - 24, Replace the
platform-specific devDependency "@rollup/rollup-linux-x64-gnu" with the generic
"rollup" package in package.json devDependencies: remove the
"@rollup/rollup-linux-x64-gnu" entry and add "rollup" (e.g. version "^4.60.0")
so Rollup's platform-specific optional binaries resolve correctly across
macOS/Windows/Linux; ensure package.json keys match "rollup" and run your
package manager install to update lockfile.
`packages/api/src/db/schema.ts`:

```typescript
    .references(() => organizations.id),
  name: text("name").notNull(),
  slug: text("slug").notNull(),
  retentionDays: integer("retention_days"), // NULL = infinite retention
```
Enforce the retention_days range in the schema.
The PR defines this field as 1–3650, but Line 34 currently accepts 0, negatives, and arbitrarily large values. A bad write here can make the retention job delete data immediately or never expire it. Please add a DB-level CHECK in the schema/migration so every write path shares the same guardrail.
As per coding guidelines, "Maintain database schema in 'packages/api/src/db/schema.ts' as single source of truth".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/api/src/db/schema.ts` at line 34, The retention_days column
currently defined as retentionDays: integer("retention_days") needs a DB-level
CHECK enforcing 1..3650 to prevent invalid writes; update
packages/api/src/db/schema.ts to attach a check constraint to the column/table
(e.g., a CHECK (retention_days BETWEEN 1 AND 3650) via the schema builder or
column.check/constraint API) and add/emit a corresponding migration so the
database enforces the rule for all write paths; ensure the constraint name is
unique and reference retentionDays/retention_days in the change.
`packages/api/src/lib/domain-verification.ts`:

```typescript
export async function verifyDnsTxt(
  domain: string,
  expectedToken: string,
): Promise<VerificationCheckResult> {
```
Reject non-public targets before starting verification.
All three checks trust a stored domain as a network target. Right now the service layer only normalizes input, and its validator still accepts values like localhost and IPv4 literals, so this path can be used against non-public hosts. Add a hard guard here for single-label hosts/IP literals, and ideally resolve/reject private or reserved addresses before the HTTP checks run.
Also applies to: 97-100, 128-131, 176-179
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/api/src/lib/domain-verification.ts` around lines 54 - 57, Add a hard
guard in the verification entrypoints (verifyDnsTxt, verifyTxtRecord,
verifyHttpTxt, verifyHttpFile) to reject single-label hosts and IP literals
before performing any network ops: detect hostnames without a dot or purely
numeric IP strings and throw/return a validation error; additionally perform a
DNS resolution for A/AAAA records and reject any resolved addresses that are
private/reserved (RFC1918, RFC6598, link-local, multicast, loopback, etc.) so
the subsequent HTTP/DNS checks never target non-public addresses. Ensure the
checks run synchronously at the top of each function (before HTTP fetches or TXT
queries) and return the same VerificationCheckResult/error shape used elsewhere.
Widen the columns type to accept ColumnKey | "expand" so the expand toggle column doesn't widen the whole array to string, which fails CI typecheck.

Co-Authored-By: Duyet Le <me@duyet.net>
Co-Authored-By: duyetbot <bot@duyet.net>
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CI Fix: Type error in `_data-tab.tsx` column construction

Fix (2 files):
1.
2.

This passes local typecheck. Ready for someone with push access to cherry-pick onto the branch.
- Add isSafeVerificationTarget guard to reject single-label hosts and IP literals before making outbound verification requests (SSRF) - Add optimistic concurrency check on verifyDomain — only updates rows still in pending/failed status, then re-queries for actual state - Fix ColumnKey type inference in _data-tab.tsx (as const on expand) Co-Authored-By: Duyet Le <me@duyet.net> Co-Authored-By: duyetbot <bot@duyet.net> Co-Authored-By: Paperclip <noreply@paperclip.ing>
CI is green — ready to merge

All checks pass:

This PR includes Phase 1 deliverables:

Requesting merge approval from maintainers.
Build the Memory Layer (Phase 2 Part 1+2): - Embedding service (embeddings.ts): generate 768d vectors via @cf/baai/bge-m3, upsert/query/delete via Vectorize - Semantic search endpoint: POST /v2/conversations/search with project-scoped vector queries, ranked results with relevance scores - Context retrieval endpoint: POST /v2/conversations/context with token-budget packing for LLM injection - Wire embedding on message creation (fire-and-forget via waitUntil) - Add VECTORIZE_INDEX binding to wrangler config + Bindings type - Shared types: SemanticSearchRequest/Response, ContextRetrieval types Co-Authored-By: Duyet Le <me@duyet.net> Co-Authored-By: duyetbot <bot@duyet.net> Co-Authored-By: Paperclip <noreply@paperclip.ing> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Actionable comments posted: 4
♻️ Duplicate comments (1)
packages/api/src/lib/domain-verification.ts (1)
44-52: ⚠️ Potential issue | 🔴 Critical

The network-target hardening is still incomplete.

`isSafeVerificationTarget()` only validates the original string, while both HTTP verifiers blindly follow redirects. A public hostname can still resolve, or redirect, to loopback/link-local/RFC1918 targets, so verification can still hit internal services. Resolve the hostname to public A/AAAA records before `fetch`, and re-check every redirect target instead of trusting `redirect: "follow"`.

Also applies to: 123-126, 157-160
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/api/src/lib/domain-verification.ts` around lines 44 - 52, The current isSafeVerificationTarget(domain) only inspects the input string but the HTTP verifiers follow redirects and never re-validate targets; update the verification flow used around the HTTP fetch logic (the functions calling fetch in this file — replace any fetch(..., { redirect: "follow" }) usage) to: 1) perform DNS resolution of the hostname to A/AAAA records and reject if any resolved address is loopback, link-local, or RFC1918/private; 2) perform the HTTP request with redirects disabled (no automatic follow) and, on receiving a 3xx with Location, parse the redirect target, resolve its hostname to A/AAAA and re-run the same IP checks before following; 3) repeat manual redirect handling up to the existing redirect limit; 4) keep isSafeVerificationTarget(domain) as a syntactic check but move network-safety decisions to the DNS-and-IP checks described above so redirects and final targets are always validated before any connection is made.
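The manual-redirect flow the comment asks for can be sketched as follows. This is a hedged illustration, not the project's code: `fetchFn` and `isSafeTarget` are injected parameters invented here so the sketch is self-contained (the real module would use its own guard and the global `fetch`, whose responses expose `Headers` rather than a `Map`).

```typescript
type ProbeResponse = { status: number; headers: Map<string, string> };

const MAX_REDIRECTS = 5;

// Disable automatic redirect following and validate every Location target
// before requesting it, so redirects cannot reach internal addresses.
async function fetchWithValidatedRedirects(
  url: string,
  isSafeTarget: (hostname: string) => boolean,
  fetchFn: (url: string, init: { redirect: "manual" }) => Promise<ProbeResponse>,
): Promise<{ status: number; finalUrl: string }> {
  let current = url;
  for (let hop = 0; hop <= MAX_REDIRECTS; hop++) {
    const hostname = new URL(current).hostname;
    if (!isSafeTarget(hostname)) throw new Error(`Unsafe redirect target: ${hostname}`);
    const res = await fetchFn(current, { redirect: "manual" });
    if (res.status < 300 || res.status >= 400) return { status: res.status, finalUrl: current };
    const location = res.headers.get("location");
    if (!location) return { status: res.status, finalUrl: current };
    current = new URL(location, current).toString(); // resolve relative redirects
  }
  throw new Error("Too many redirects");
}
```

A production version would additionally resolve each hostname to A/AAAA records and reject private/reserved addresses, as the prompt specifies.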
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/api/src/lib/domain-verification.ts`:
- Around line 203-210: The current use of Promise.all delays returning verified
true until every verifier finishes; change to short-circuit for the boolean by
creating per-probe promises that resolve only on success (and reject on failure)
and use Promise.any on [verifyDnsTxt, verifyHttpFile, verifyMetaTag] to compute
verified quickly, while concurrently running the full probes (e.g., via
Promise.allSettled or the existing Promise.all) to collect per-method results;
update the code that sets verified (currently using results.some(...)) to use
the Promise.any outcome (fall back to false if Promise.any rejects) and still
return the full results array for callers that need details.
- Around line 47-50: The existing IPv6 heuristic in domain-verification.ts
incorrectly treats hex-only hostnames like `bead.cafe` as IPv6 by testing
/^[0-9a-f:]+$/i against domain.replace(/\./g, ""), so change the logic to only
apply the IPv6-literal test when the original host actually looks like an IPv6
literal (e.g. it starts with "[" or contains a ":"), rather than unconditionally
after removing dots; update the condition around domain.startsWith("[") and the
/^[0-9a-f:]+$/i check to first check domain.includes(":") (or
domain.startsWith("[")) before running the hex/colon regex or, better, replace
the heuristic with a call to a proper IP check (e.g. net.isIP) to decide if the
input is an IP literal.
In `@packages/api/src/services/domains.ts`:
- Around line 329-331: The code currently falls back to the stale pre-read
domain when re-querying returns null; change the logic around
getDomain(domainId, projectId) so that if fresh is null you do not return the
old domain but instead surface a not-found error (or return null) to the caller;
specifically update the block that calls getDomain and buildVerificationResult
to only call buildVerificationResult(fresh) when fresh exists and otherwise
throw/return an appropriate not-found response for the domainId/projectId rather
than using the stale domain variable.
In `@packages/dashboard/src/app/dashboard/project/_data-tab.tsx`:
- Around line 40-45: Create a shared column descriptor type in _types.ts (e.g.,
export type ColumnDescriptor = { key: ColumnKey | "expand"; label: string }) and
replace the inline type annotation Array<{ key: ColumnKey | "expand"; label:
string }> used in the columns declaration and the similar declaration in
data-tab/conversations-content.tsx with this exported ColumnDescriptor (or
Array<ColumnDescriptor>), updating imports to import { ColumnDescriptor } from
"_types" so both call sites use the single source of truth.
---
Duplicate comments:
In `@packages/api/src/lib/domain-verification.ts`:
- Around line 44-52: The current isSafeVerificationTarget(domain) only inspects
the input string but the HTTP verifiers follow redirects and never re-validate
targets; update the verification flow used around the HTTP fetch logic (the
functions calling fetch in this file — replace any fetch(..., { redirect:
"follow" }) usage) to: 1) perform DNS resolution of the hostname to A/AAAA
records and reject if any resolved address is loopback, link-local, or
RFC1918/private; 2) perform the HTTP request with redirects disabled (no
automatic follow) and, on receiving a 3xx with Location, parse the redirect
target, resolve its hostname to A/AAAA and re-run the same IP checks before
following; 3) repeat manual redirect handling up to the existing redirect limit;
4) keep isSafeVerificationTarget(domain) as a syntactic check but move
network-safety decisions to the DNS-and-IP checks described above so redirects
and final targets are always validated before any connection is made.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 016a0baf-380e-4c79-adcd-743079715f48
📒 Files selected for processing (4)
packages/api/src/lib/domain-verification.ts
packages/api/src/services/domains.ts
packages/dashboard/src/app/dashboard/project/_data-tab.tsx
packages/dashboard/src/app/dashboard/project/data-tab/conversations-content.tsx
```ts
const results = await Promise.all([
  verifyDnsTxt(domain, expectedToken),
  verifyHttpFile(domain, expectedToken),
  verifyMetaTag(domain, expectedToken),
]);

const verified = results.some((r) => r.success);
return { verified, results };
```
🧹 Nitpick | 🔵 Trivial
Promise.all keeps successful verifications waiting on the slowest probe.
A fast DNS success still waits for the HTTP checks to finish or time out because Promise.all only resolves after every verifier settles. verifyDomain() only uses the boolean, so consider a short-circuit helper for that path and keep full per-method results only where they are needed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/api/src/lib/domain-verification.ts` around lines 203 - 210, The
current use of Promise.all delays returning verified true until every verifier
finishes; change to short-circuit for the boolean by creating per-probe promises
that resolve only on success (and reject on failure) and use Promise.any on
[verifyDnsTxt, verifyHttpFile, verifyMetaTag] to compute verified quickly, while
concurrently running the full probes (e.g., via Promise.allSettled or the
existing Promise.all) to collect per-method results; update the code that sets
verified (currently using results.some(...)) to use the Promise.any outcome
(fall back to false if Promise.any rejects) and still return the full results
array for callers that need details.
```tsx
const columns: Array<{ key: ColumnKey | "expand"; label: string }> = [
  { key: "expand" as const, label: "" },
  ...allColumns
    .filter((c) => visibleCols.includes(c.key))
    .map((col) => ({ key: col.key, label: col.label })),
];
```
🧹 Nitpick | 🔵 Trivial
Extract a shared column descriptor type to prevent drift.
Array<{ key: ColumnKey | "expand"; label: string }> is duplicated here and in data-tab/conversations-content.tsx; centralizing this in _types.ts will keep both call sites aligned.
♻️ Suggested refactor
```diff
diff --git a/packages/dashboard/src/app/dashboard/project/_types.ts b/packages/dashboard/src/app/dashboard/project/_types.ts
@@
 export type ColumnKey =
   | "title"
   | "external_id"
   | "message_count"
   | "token_count"
   | "total_cost"
   | "total_tokens"
   | "metadata"
   | "created_at"
   | "updated_at";
+
+export type DataTableColumnKey = ColumnKey | "expand";
+export type DataTableColumn = { key: DataTableColumnKey; label: string };
diff --git a/packages/dashboard/src/app/dashboard/project/_data-tab.tsx b/packages/dashboard/src/app/dashboard/project/_data-tab.tsx
@@
-import type { ColumnKey } from "./_types";
+import type { ColumnKey, DataTableColumn } from "./_types";
@@
-  const columns: Array<{ key: ColumnKey | "expand"; label: string }> = [
+  const columns: DataTableColumn[] = [
diff --git a/packages/dashboard/src/app/dashboard/project/data-tab/conversations-content.tsx b/packages/dashboard/src/app/dashboard/project/data-tab/conversations-content.tsx
@@
-import type { ColumnKey } from "../_types";
+import type { ColumnKey, DataTableColumn } from "../_types";
@@
-  columns: Array<{ key: ColumnKey | "expand"; label: string }>;
+  columns: DataTableColumn[];
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/dashboard/src/app/dashboard/project/_data-tab.tsx` around lines 40 -
45, Create a shared column descriptor type in _types.ts (e.g., export type
ColumnDescriptor = { key: ColumnKey | "expand"; label: string }) and replace the
inline type annotation Array<{ key: ColumnKey | "expand"; label: string }> used
in the columns declaration and the similar declaration in
data-tab/conversations-content.tsx with this exported ColumnDescriptor (or
Array<ColumnDescriptor>), updating imports to import { ColumnDescriptor } from
"_types" so both call sites use the single source of truth.
P1 Security: - Block redirect-based SSRF in HTTP/meta_tag verification (redirect: "error") - Fix IPv6 false positive rejecting domains like "bead.cafe" P1 Data Integrity: - Add conversation_tags cascade delete in bulk operations - Remove silent 5000-row truncation from bulk message export - Include external_id, token_count, metadata, accounting fields in exports P2 Correctness: - Fix retention_days null handling with explicit undefined check - Error on stale domain verify re-query instead of silent fallback - Reject empty PATCH body on project update (schema refine) - Add Number.isInteger() check for fractional retention days P3 Code Quality: - Remove unused ColumnKey import from conversations-header - Add beforeEach test isolation in retention tests - Track per-project timing in retention cleanup - Remove unused _d1Db parameter from cleanupExpiredConversations - Add try-catch structured error logging in scheduled handler Co-Authored-By: Duyet Le <me@duyet.net> Co-Authored-By: duyetbot <bot@duyet.net> Co-Authored-By: Paperclip <noreply@paperclip.ing>
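The IPv6 false-positive fix mentioned above (hex-only names like `bead.cafe` being misclassified as IPv6) can be sketched as: only run the hex/colon test when the host actually looks like an IPv6 literal. This is an illustrative version under that assumption, not the project's exact code.

```typescript
// Classify a host string as an IP literal without rejecting hex-only domains.
function isIpLiteral(host: string): boolean {
  const bare = host.replace(/^\[|\]$/g, ""); // strip IPv6 brackets
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(bare)) return true; // IPv4 literal
  if (host.startsWith("[") || bare.includes(":")) {
    // Only now apply the hex/colon heuristic (covers v4-mapped forms too).
    return /^[0-9a-f:.]+$/i.test(bare);
  }
  return false; // hex-only names like "bead.cafe" fall through as domains
}
```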
Actionable comments posted: 16
♻️ Duplicate comments (1)
packages/api/src/scheduled.ts (1)
24-33: ⚠️ Potential issue | 🟠 Major

Re-throw after logging the cron failure.
The catch block currently turns a failed cleanup into a successful scheduled run. That hides retention outages from Cron monitoring and makes retries/alerting much harder.
♻️ Minimal fix
```diff
 } catch (err) {
   console.error(
     JSON.stringify({
       level: "error",
       event: "retention_cleanup_failed",
       error: err instanceof Error ? err.message : String(err),
       trigger: event.cron,
     }),
   );
+  throw err;
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/api/src/scheduled.ts` around lines 24 - 33, The catch block in scheduled.ts is currently swallowing errors (logging a retention_cleanup_failed event) which makes the scheduled run appear successful; after logging the error details (the existing JSON.stringify block referencing event.cron and the retention_cleanup_failed event), re-throw the original error (throw err) so the failure surfaces to the scheduler/monitoring and enables retries/alerting; ensure you throw the same err object (not a new Error) to preserve stack/context.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/api/src/lib/domain-verification.ts`:
- Around line 168-189: The regex-based meta-tag detection in
domain-verification.ts (variables pattern and reversedPattern) is fragile;
update the implementation to robustly handle arbitrary attribute order and
whitespace by either switching to an HTML parser (e.g., use a DOM/HTML parsing
library to query <meta name="agentstate-verification"> and read its content
attribute) or replace the two fragile regexes with single, more permissive
regexes that allow newlines and arbitrary attributes (use flags and patterns
that permit [\s\S] or the DOTALL/s flag and \b anchors so the content attribute
is captured regardless of ordering). Ensure the code that constructs the result
({ method: "meta_tag", success, error }) continues to run unchanged, only
changing how match/reversedMatch are produced (i.e., adjust
pattern/reversedPattern or replace with parser logic).
In `@packages/api/src/routes/v2/conversations/semantic.ts`:
- Around line 172-178: The code uses r.metadata.message_id directly when
checking and adding to the seen Set and when looking up in messageLookup,
causing inconsistency with the String() normalization used earlier; update the
three usages to normalize the id by calling String(r.metadata.message_id) in
seen.has(...), seen.add(...), and messageLookup.get(...) so the dedup logic
matches the behavior on line 148 and avoids mismatches between primitive types.
- Around line 49-84: The code inconsistently converts r.metadata.message_id to
String() when building messageIds for the DB query (messageIds) but uses the raw
value when populating/looking up in messageLookup and when checking/inserting
into seen; normalize the type to a single canonical form across all usages
(e.g., always String) by mapping results -> messageIds = results.map(r =>
String(r.metadata.message_id)), constructing messageLookup with keys as
String(m.id), and using String(r.metadata.message_id) when checking seen and
looking up messageLookup so comparisons (in messageIds, messageLookup, seen) are
consistent; update references to messageLookup, seen, and anywhere
r.metadata.message_id is used to use the same String conversion.
- Around line 13-18: The request body declares filter fields
(SemanticSearchRequest) but the router.post handler currently ignores them and
only uses c.get("projectId"); update the router.post handler to read body.filter
and apply its criteria: include project_id (falling back to c.get("projectId")
if absent), apply date_from/date_to as a metadata timestamp range, and require
tags (all or any per spec) when querying the vector store or by post-filtering
the returned results; modify the vector search call (or post-query loop that
produces results) to accept these metadata filters so date and tag constraints
are enforced rather than left unused.
In `@packages/api/src/services/bulk.ts`:
- Around line 19-41: In bulkDeleteConversations, add an early return when the
input ids array is empty to avoid executing the DB query and batch operations;
i.e., at the top of the function check if ids.length === 0 (or ids?.length ===
0) and return 0 immediately so the subsequent select(...).where(inArray(...,
ids)) and db.batch(...) are skipped.
- Around line 76-83: Update the JSDoc for the "Export conversations with their
messages." function to reflect that when ids is empty the function returns a
maximum of 100 most recent conversations (not "all"); explicitly state the limit
in the description and adjust the `@returns` or `@param` ids text to note "exports
up to 100 most recent conversations when ids is omitted or empty." Reference the
JSDoc block immediately above the export conversations function to make this
change.
- Around line 148-172: The map lookup msgsByConv.get(conv.id) is performed twice
inside the rows.map callback; cache it in a local variable (e.g., const msgs =
msgsByConv.get(conv.id) ?? []) at the top of the mapping for each conv and then
use msgs for both message_count and messages to avoid redundant lookups and make
the code in the rows.map callback (where conv, msgsByConv, deserializeMetadata
are used) cleaner.
In `@packages/api/src/services/embeddings.ts`:
- Around line 99-110: Update the mapping over result.matches to detect when
critical metadata fields (message_id, conversation_id, project_id, role,
created_at) are missing or fall back to defaults, and emit a warning including
match.id and which fields are missing; inside the same function that uses
result.matches.map and constructs TypedVectorMetadata, compute an array of
missingKeys by checking match.metadata?.message_id, ?.conversation_id, etc.,
then call the module’s logger (use the existing logger/processLogger in this
file) with logger.warn and a concise message listing match.id and missingKeys
before returning the mapped object so missing metadata is visible in logs
without changing the fallback behavior.
- Around line 42-43: The file defines an unused constant VECTORIZE_INDEX which
should be removed to avoid dead code; keep the EMBEDDING_MODEL declaration
(including its as unknown as keyof AiModels cast) but delete the VECTORIZE_INDEX
line and any related unused import/usage in this file so only EMBEDDING_MODEL
remains.
In `@packages/api/src/services/retention.ts`:
- Around line 44-97: The current loop over projectsWithRetention with an
unbounded per-project while(true) can starve later projects; modify the
retention loop (involving projectsWithRetention, TIME_BUDGET_MS, BATCH_SIZE, and
the inner deletion logic that calls db.batch and queries
conversations/messages/conversationTags) to enforce a per-project cap—e.g.,
limit max batches or max deleted rows per project (add a maxBatchesPerProject or
maxDeletesPerProject counter and break to the next project when reached) and
still respect the global TIME_BUDGET_MS; alternatively (or additionally) persist
a resume cursor (lastProcessedConversationId or lastProjectCursor) between runs
so subsequent executions continue where the previous run left off.
- Around line 25-27: cleanupExpiredConversations currently only deletes D1 rows
leaving Vectorize embeddings behind; update its signature to accept the vector
binding (e.g., add a parameter named vectorBinding or vectorStore) and, inside
cleanupExpiredConversations, collect the message row ids to be removed (these
are the same ids stored in Vectorize as row.id in messages.ts), call the vector
binding's delete method for those ids before executing the D1 deletions, and
then proceed to delete the DB rows; ensure you reference
cleanupExpiredConversations and the same row.id identifiers so vector removal
runs prior to the database cleanup.
- Around line 68-90: The deletion currently has a TOCTOU gap: after selecting
expired ids (expired / ids) a concurrent update can resurrect a conversation
before db.batch runs; update the three deletes in db.batch (conversationTags,
messages, conversations) to re-apply the expiry predicate so only conversations
still satisfying lt(conversations.updatedAt, cutoff) are removed — e.g.,
restrict deletions to ids that are both in ids and still present in a
subquery/select from conversations where eq(projectId, project.id) and
lt(updatedAt, cutoff); modify the where clauses for the
conversationTags/messages deletes to use that same filtered conversation-id set
so tags/messages for resurrected conversations are not deleted.
In
`@packages/dashboard/src/app/dashboard/project/data-tab/conversations-header.tsx`:
- Around line 35-48: ColumnPickerTrigger should link the button to the popup via
aria-controls and drop the unnecessary wrapper: update the ColumnPickerTrigger
signature to accept a unique popup id prop (e.g., controlsId or menuId) in
addition to show and onToggle, remove the surrounding <div
className="relative">, and set aria-controls={controlsId} on the Button while
keeping aria-expanded={show}, aria-haspopup="menu" and the existing handlers so
the button is programmatically associated with the controlled popup.
In `@packages/shared/src/index.ts`:
- Around line 305-314: The SemanticSearchRequest type declares filter fields
(date_from, date_to, tags) that aren’t used by the semantic search
implementation; either remove those fields from the interface or explicitly mark
them as reserved/unused with JSDoc so consumers know they’re not implemented.
Update the SemanticSearchRequest declaration in shared/src/index.ts (and any
usages) to either drop date_from/date_to/tags or add JSDoc comments like
"@reserved for future use — not used by semantic.ts" and keep semantic.ts’s
search functions unchanged; ensure TypeScript and tests compile after the
change.
- Around line 334-338: The ContextRetrievalRequest interface declares project_id
as required but the runtime in
packages/api/src/routes/v2/conversations/semantic.ts ignores this field and uses
the authenticated project's context; update the interface
(ContextRetrievalRequest) to make project_id optional (project_id?: string) or
add a clear JSDoc comment on the ContextRetrievalRequest type stating that
project_id is ignored when requests are authenticated and the project is derived
from the auth context so callers aren’t misled; ensure any related type usages
respect the optionality.
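The retention-loop fairness fix described above (a global time budget plus a per-project cap so one large project cannot starve the rest) can be sketched as a bounded batch loop. `deleteBatch`, the cap values, and the injected clock are illustrative stand-ins for the real D1 delete logic, not the project's actual API.

```typescript
const TIME_BUDGET_MS = 25_000;
const MAX_BATCHES_PER_PROJECT = 10;

// Delete expired rows in capped batches; a later cron run resumes any
// project that hits its cap or the global budget.
async function cleanupWithCaps(
  projects: string[],
  deleteBatch: (projectId: string) => Promise<number>, // rows deleted, 0 = done
  now: () => number = Date.now,
): Promise<Record<string, number>> {
  const start = now();
  const deleted: Record<string, number> = {};
  for (const projectId of projects) {
    deleted[projectId] = 0;
    for (let batch = 0; batch < MAX_BATCHES_PER_PROJECT; batch++) {
      if (now() - start > TIME_BUDGET_MS) return deleted; // global budget exhausted
      const n = await deleteBatch(projectId);
      if (n === 0) break; // project fully cleaned
      deleted[projectId] += n;
    }
    // Per-project cap reached: move on so later projects still get a turn.
  }
  return deleted;
}
```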
---
Duplicate comments:
In `@packages/api/src/scheduled.ts`:
- Around line 24-33: The catch block in scheduled.ts is currently swallowing
errors (logging a retention_cleanup_failed event) which makes the scheduled run
appear successful; after logging the error details (the existing JSON.stringify
block referencing event.cron and the retention_cleanup_failed event), re-throw
the original error (throw err) so the failure surfaces to the
scheduler/monitoring and enables retries/alerting; ensure you throw the same err
object (not a new Error) to preserve stack/context.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 82d44f15-73cb-4fec-a69d-72e6ee321a8e
📒 Files selected for processing (16)
packages/api/src/lib/domain-verification.ts
packages/api/src/routes/v2/conversations/messages.ts
packages/api/src/routes/v2/conversations/semantic.ts
packages/api/src/routes/v2/projects/crud.ts
packages/api/src/scheduled.ts
packages/api/src/services/bulk.ts
packages/api/src/services/domains.ts
packages/api/src/services/embeddings.ts
packages/api/src/services/retention.ts
packages/api/src/services/v2-projects.ts
packages/api/src/types.ts
packages/api/test/retention.test.ts
packages/api/wrangler.jsonc
packages/dashboard/src/app/dashboard/project/_retention-settings.tsx
packages/dashboard/src/app/dashboard/project/data-tab/conversations-header.tsx
packages/shared/src/index.ts
```ts
const html = await response.text();
// Match <meta name="agentstate-verification" content="..."> in either attribute order
const pattern =
  /<meta\s+(?:[^>]*?\s)?name\s*=\s*["']agentstate-verification["']\s+(?:[^>]*?\s)?content\s*=\s*["']([^"']+)["'][^>]*>/i;
const match = html.match(pattern);

if (!match) {
  // Try reversed attribute order: content before name
  const reversedPattern =
    /<meta\s+(?:[^>]*?\s)?content\s*=\s*["']([^"']+)["']\s+(?:[^>]*?\s)?name\s*=\s*["']agentstate-verification["'][^>]*>/i;
  const reversedMatch = html.match(reversedPattern);

  if (!reversedMatch) {
    return { method: "meta_tag", success: false, error: "Meta tag not found" };
  }

  const success = reversedMatch[1] === expectedToken;
  return { method: "meta_tag", success, error: success ? undefined : "Token mismatch in meta tag" };
}

const success = match[1] === expectedToken;
return { method: "meta_tag", success, error: success ? undefined : "Token mismatch in meta tag" };
```
🧹 Nitpick | 🔵 Trivial
Regex-based meta tag parsing is pragmatic but fragile.
The dual-pattern approach handles the common attribute orderings. However, HTML regex parsing has known edge cases:
- Extra attributes between `name` and `content` (e.g., `<meta id="x" name="agentstate-verification" class="y" content="...">`)
- Non-standard whitespace or newlines within the tag

The current pattern `(?:[^>]*?\s)?` is designed to skip intervening attributes but requires a trailing space, which may not match multi-attribute tags correctly.
This is acceptable for MVP since domain owners control their HTML, but a more robust approach would use a proper HTML parser or a more permissive regex.
💡 Optional: More permissive regex pattern
```diff
-const pattern =
-  /<meta\s+(?:[^>]*?\s)?name\s*=\s*["']agentstate-verification["']\s+(?:[^>]*?\s)?content\s*=\s*["']([^"']+)["'][^>]*>/i;
+// More permissive: allow any attributes between name and content
+const pattern =
+  /<meta\s+[^>]*?name\s*=\s*["']agentstate-verification["'][^>]*?content\s*=\s*["']([^"']+)["'][^>]*>/i;
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
const pattern =
  /<meta\s+(?:[^>]*?\s)?name\s*=\s*["']agentstate-verification["']\s+(?:[^>]*?\s)?content\s*=\s*["']([^"']+)["'][^>]*>/i;
const match = html.match(pattern);

if (!match) {
  // Try reversed attribute order: content before name
  // More permissive: allow any attributes between name and content
  const reversedPattern =
    /<meta\s+[^>]*?content\s*=\s*["']([^"']+)["'][^>]*?name\s*=\s*["']agentstate-verification["'][^>]*>/i;
  const reversedMatch = html.match(reversedPattern);

  if (!reversedMatch) {
    return { method: "meta_tag", success: false, error: "Meta tag not found" };
  }

  const success = reversedMatch[1] === expectedToken;
  return { method: "meta_tag", success, error: success ? undefined : "Token mismatch in meta tag" };
}

const success = match[1] === expectedToken;
return { method: "meta_tag", success, error: success ? undefined : "Token mismatch in meta tag" };
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/api/src/lib/domain-verification.ts` around lines 168 - 189, The
regex-based meta-tag detection in domain-verification.ts (variables pattern and
reversedPattern) is fragile; update the implementation to robustly handle
arbitrary attribute order and whitespace by either switching to an HTML parser
(e.g., use a DOM/HTML parsing library to query <meta
name="agentstate-verification"> and read its content attribute) or replace the
two fragile regexes with single, more permissive regexes that allow newlines and
arbitrary attributes (use flags and patterns that permit [\s\S] or the DOTALL/s
flag and \b anchors so the content attribute is captured regardless of
ordering). Ensure the code that constructs the result ({ method: "meta_tag",
success, error }) continues to run unchanged, only changing how
match/reversedMatch are produced (i.e., adjust pattern/reversedPattern or
replace with parser logic).
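One way to realize the "single, more permissive regex per attribute order" option from this prompt is sketched below. The patterns and helper are illustrative assumptions (not the project's code); `[^>]*` already spans newlines, so multi-line tags match without extra flags.

```typescript
// Permissive patterns: tolerate intervening attributes and newlines inside the tag.
const NAME_FIRST =
  /<meta\b[^>]*\bname\s*=\s*["']agentstate-verification["'][^>]*\bcontent\s*=\s*["']([^"']+)["'][^>]*>/i;
const CONTENT_FIRST =
  /<meta\b[^>]*\bcontent\s*=\s*["']([^"']+)["'][^>]*\bname\s*=\s*["']agentstate-verification["'][^>]*>/i;

function extractVerificationToken(html: string): string | null {
  const m = html.match(NAME_FIRST) ?? html.match(CONTENT_FIRST);
  return m ? m[1] : null;
}
```

A full HTML parser remains the more robust choice; this keeps the regex approach but drops the trailing-space requirement that broke multi-attribute tags.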
```ts
router.post("/", async (c) => {
  const body = await c.req.json<{
    query?: string;
    limit?: number;
    filter?: { project_id?: string; date_from?: number; date_to?: number; tags?: string[] };
  }>();
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Check if filter fields are documented or used elsewhere
rg -n "date_from|date_to" packages/api/src/routes/v2/conversations/
```

Repository: duyet/agentstate
Length of output: 208

🏁 Script executed:

```shell
# First, let's see the actual implementation to check if filter is used anywhere
cat -n packages/api/src/routes/v2/conversations/semantic.ts | head -50
```

Repository: duyet/agentstate
Length of output: 2192

🏁 Script executed:

```shell
# Check if the filter object is destructured or used in the semantic.ts file
rg -n "filter" packages/api/src/routes/v2/conversations/semantic.ts -A 2 -B 2
```

Repository: duyet/agentstate
Length of output: 651

🏁 Script executed:

```shell
# Verify the shared types definition
rg -n "SemanticSearchRequest\|ContextRetrievalRequest" packages/shared/src/index.ts -A 10
```

Repository: duyet/agentstate
Length of output: 42

🏁 Script executed:

```shell
# Find and check shared types file
cat -n packages/shared/src/index.ts | grep -A 20 -B 2 "Semantic\|Conversation"
```

Repository: duyet/agentstate
Length of output: 5588

🏁 Script executed:

```shell
# Also check the entire request body type structure
rg -n "project_id" packages/api/src/routes/v2/conversations/semantic.ts
```

Repository: duyet/agentstate
Length of output: 306

🏁 Script executed:

```shell
# Verify if body.filter is used anywhere in the implementation
rg -n "body\.filter" packages/api/src/routes/v2/conversations/semantic.ts
```

Repository: duyet/agentstate
Length of output: 42

🏁 Script executed:

```shell
# Check the full semantic.ts implementation to see complete flow
cat packages/api/src/routes/v2/conversations/semantic.ts
```

Repository: duyet/agentstate
Length of output: 6718
Filter fields declared in the request body are not used.
The request body type includes `filter?: { project_id?: string; date_from?: number; date_to?: number; tags?: string[] }`, matching `SemanticSearchRequest` in `packages/shared/src/index.ts` (lines 305-314). However, the implementation ignores these filters entirely; only `c.get("projectId")` from the auth context is used for filtering.
Either remove the unused filter fields from the shared types or implement the filtering logic (date range, tags) in the Vectorize query or via post-query filtering.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/api/src/routes/v2/conversations/semantic.ts` around lines 13 - 18,
The request body declares filter fields (SemanticSearchRequest) but the
router.post handler currently ignores them and only uses c.get("projectId");
update the router.post handler to read body.filter and apply its criteria:
include project_id (falling back to c.get("projectId") if absent), apply
date_from/date_to as a metadata timestamp range, and require tags (all or any
per spec) when querying the vector store or by post-filtering the returned
results; modify the vector search call (or post-query loop that produces
results) to accept these metadata filters so date and tag constraints are
enforced rather than left unused.
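As a concrete illustration of the post-query filtering option, the pass could look like the sketch below. The `SearchFilter` and `VectorHit` shapes and the `applyFilter` helper are hypothetical stand-ins for the repo's actual types, and the all-tags-must-match rule is an assumption (the spec may intend any-match):

```typescript
// Hypothetical shapes mirroring the request body and vector results.
interface SearchFilter {
  project_id?: string;
  date_from?: number; // unix ms, inclusive
  date_to?: number;   // unix ms, inclusive
  tags?: string[];    // assumed: every listed tag must be present
}

interface VectorHit {
  metadata: { message_id: string; created_at: number; tags: string[] };
}

// Post-filter vector hits by the optional date range and tag set.
function applyFilter(hits: VectorHit[], filter?: SearchFilter): VectorHit[] {
  if (!filter) return hits;
  return hits.filter((h) => {
    const { created_at, tags } = h.metadata;
    if (filter.date_from !== undefined && created_at < filter.date_from) return false;
    if (filter.date_to !== undefined && created_at > filter.date_to) return false;
    if (filter.tags && !filter.tags.every((t) => tags.includes(t))) return false;
    return true;
  });
}
```

Pushing the same predicates into the Vectorize metadata filter would be preferable when supported, since post-filtering can shrink the result set below the requested limit.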
```ts
const messageIds = results.map((r) => String(r.metadata.message_id));
const matchingMessages = await c
  .get("db")
  .select({
    id: messages.id,
    conversationId: messages.conversationId,
    role: messages.role,
    content: messages.content,
    createdAt: messages.createdAt,
  })
  .from(messages)
  .where(inArray(messages.id, messageIds));

// Build a lookup for fast access
const messageLookup = new Map(matchingMessages.map((m) => [m.id, m]));

// Fetch conversation titles for context
const convIds = [...new Set(results.map((r) => String(r.metadata.conversation_id)))];
const convRows = await c
  .get("db")
  .select({ id: conversations.id, title: conversations.title })
  .from(conversations)
  .where(inArray(conversations.id, convIds));
const titleLookup = new Map(convRows.map((c) => [c.id, c.title]));

// Merge vector results with DB data, deduplicate by message
const seen = new Set<string>();
const searchResults = results
  .filter((r) => {
    if (seen.has(r.metadata.message_id)) return false;
    seen.add(r.metadata.message_id);
    return messageLookup.has(r.metadata.message_id);
  })
  .slice(0, limit)
  .map((r) => {
    const msg = messageLookup.get(r.metadata.message_id)!;
```
Inconsistent `String()` conversion may cause lookup mismatches.
Line 49 wraps `r.metadata.message_id` with `String()` for the DB query, but lines 78-84 use `r.metadata.message_id` directly for the `seen` Set and `messageLookup` Map. If metadata values are stored as numbers in Vectorize, the Set/Map lookups will fail because `"123" !== 123`.
🔧 Proposed fix for consistent type handling

```diff
 const seen = new Set<string>();
 const searchResults = results
   .filter((r) => {
-    if (seen.has(r.metadata.message_id)) return false;
-    seen.add(r.metadata.message_id);
-    return messageLookup.has(r.metadata.message_id);
+    const msgId = String(r.metadata.message_id);
+    if (seen.has(msgId)) return false;
+    seen.add(msgId);
+    return messageLookup.has(msgId);
   })
   .slice(0, limit)
   .map((r) => {
-    const msg = messageLookup.get(r.metadata.message_id)!;
+    const msg = messageLookup.get(String(r.metadata.message_id))!;
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
const messageIds = results.map((r) => String(r.metadata.message_id));
const matchingMessages = await c
  .get("db")
  .select({
    id: messages.id,
    conversationId: messages.conversationId,
    role: messages.role,
    content: messages.content,
    createdAt: messages.createdAt,
  })
  .from(messages)
  .where(inArray(messages.id, messageIds));

// Build a lookup for fast access
const messageLookup = new Map(matchingMessages.map((m) => [m.id, m]));

// Fetch conversation titles for context
const convIds = [...new Set(results.map((r) => String(r.metadata.conversation_id)))];
const convRows = await c
  .get("db")
  .select({ id: conversations.id, title: conversations.title })
  .from(conversations)
  .where(inArray(conversations.id, convIds));
const titleLookup = new Map(convRows.map((c) => [c.id, c.title]));

// Merge vector results with DB data, deduplicate by message
const seen = new Set<string>();
const searchResults = results
  .filter((r) => {
    const msgId = String(r.metadata.message_id);
    if (seen.has(msgId)) return false;
    seen.add(msgId);
    return messageLookup.has(msgId);
  })
  .slice(0, limit)
  .map((r) => {
    const msg = messageLookup.get(String(r.metadata.message_id))!;
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/api/src/routes/v2/conversations/semantic.ts` around lines 49 - 84,
The code inconsistently converts r.metadata.message_id to String() when building
messageIds for the DB query (messageIds) but uses the raw value when
populating/looking up in messageLookup and when checking/inserting into seen;
normalize the type to a single canonical form across all usages (e.g., always
String) by mapping results -> messageIds = results.map(r =>
String(r.metadata.message_id)), constructing messageLookup with keys as
String(m.id), and using String(r.metadata.message_id) when checking seen and
looking up messageLookup so comparisons (in messageIds, messageLookup, seen) are
consistent; update references to messageLookup, seen, and anywhere
r.metadata.message_id is used to use the same String conversion.
```ts
router.post("/context", async (c) => {
  const body = await c.req.json<{
    query?: string;
    max_tokens?: number;
    project_id?: string;
  }>();
```
Request body `project_id` is ignored; shared type declares it as required.
`ContextRetrievalRequest` in `packages/shared/src/index.ts` (line 337) declares `project_id: string` as required, but the implementation uses `c.get("projectId")` from the authentication context instead. This creates a contract mismatch: clients may send `project_id` expecting it to be respected, but it is silently ignored.
Either:
- Remove `project_id` from `ContextRetrievalRequest` and document that project scoping is automatic via auth, or
- Use the body's `project_id` for cross-project queries (if that's the intent) with appropriate authorization checks.
```ts
const seen = new Set<string>();
for (const r of results) {
  if (seen.has(r.metadata.message_id)) continue;
  seen.add(r.metadata.message_id);

  const msg = messageLookup.get(r.metadata.message_id);
  if (!msg) continue;
```
Same String() inconsistency in /context endpoint.
Apply consistent String() conversion here as well to match line 148.
🔧 Proposed fix

```diff
 const seen = new Set<string>();
 for (const r of results) {
-  if (seen.has(r.metadata.message_id)) continue;
-  seen.add(r.metadata.message_id);
+  const msgId = String(r.metadata.message_id);
+  if (seen.has(msgId)) continue;
+  seen.add(msgId);

-  const msg = messageLookup.get(r.metadata.message_id);
+  const msg = messageLookup.get(msgId);
```

📝 Committable suggestion
```ts
const seen = new Set<string>();
for (const r of results) {
  const msgId = String(r.metadata.message_id);
  if (seen.has(msgId)) continue;
  seen.add(msgId);

  const msg = messageLookup.get(msgId);
  if (!msg) continue;
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/api/src/routes/v2/conversations/semantic.ts` around lines 172 - 178,
The code uses r.metadata.message_id directly when checking and adding to the
seen Set and when looking up in messageLookup, causing inconsistency with the
String() normalization used earlier; update the three usages to normalize the id
by calling String(r.metadata.message_id) in seen.has(...), seen.add(...), and
messageLookup.get(...) so the dedup logic matches the behavior on line 148 and
avoids mismatches between primitive types.
```ts
for (const project of projectsWithRetention) {
  const projectStart = Date.now();
  if (projectStart - start > TIME_BUDGET_MS) {
    console.warn(
      JSON.stringify({
        level: "warn",
        event: "retention_time_budget_exceeded",
        message: "Stopping retention cleanup — time budget exceeded",
        elapsedMs: Date.now() - start,
      }),
    );
    break;
  }

  const retentionDays = project.retentionDays as number;
  const cutoff = Date.now() - retentionDays * 86_400_000;
  let totalDeleted = 0;
  let batchCount = 0;

  // eslint-disable-next-line no-constant-condition
  while (true) {
    if (Date.now() - start > TIME_BUDGET_MS) break;

    // Fetch a batch of expired conversation IDs
    const expired = await db
      .select({ id: conversations.id })
      .from(conversations)
      .where(
        and(
          eq(conversations.projectId, project.id),
          lt(conversations.updatedAt, cutoff),
        ),
      )
      .limit(BATCH_SIZE);

    if (expired.length === 0) break;

    const ids = expired.map((r) => r.id);

    // Batch delete: tags → messages → conversations (order respects FK constraints)
    await db.batch([
      db
        .delete(conversationTags)
        .where(inArray(conversationTags.conversationId, ids)),
      db.delete(messages).where(inArray(messages.conversationId, ids)),
      db.delete(conversations).where(inArray(conversations.id, ids)),
    ]);

    totalDeleted += ids.length;
    batchCount++;

    // If we got fewer than BATCH_SIZE, this was the last batch
    if (expired.length < BATCH_SIZE) break;
  }
```
This loop can starve later projects indefinitely.
Each run starts from the first retention-enabled project and keeps deleting until that project is drained or the 25s budget is exhausted. If early projects keep producing enough expired rows, later projects may never be processed, so their retention policy is effectively unenforced. Cap work per project or persist a resume cursor between runs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/api/src/services/retention.ts` around lines 44 - 97, The current
loop over projectsWithRetention with an unbounded per-project while(true) can
starve later projects; modify the retention loop (involving
projectsWithRetention, TIME_BUDGET_MS, BATCH_SIZE, and the inner deletion logic
that calls db.batch and queries conversations/messages/conversationTags) to
enforce a per-project cap—e.g., limit max batches or max deleted rows per
project (add a maxBatchesPerProject or maxDeletesPerProject counter and break to
the next project when reached) and still respect the global TIME_BUDGET_MS;
alternatively (or additionally) persist a resume cursor
(lastProcessedConversationId or lastProjectCursor) between runs so subsequent
executions continue where the previous run left off.
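The per-project cap can be sketched in isolation; `sweep`, its callback parameters, and the constants below are illustrative stand-ins for the real retention service, not its actual API:

```typescript
// Fairness cap for the retention sweep: each project gets at most
// MAX_BATCHES_PER_PROJECT batches per run, so one backlogged project
// cannot consume the whole time budget. fetchExpired/deleteBatch stand
// in for the real select and tags -> messages -> conversations deletes.
const MAX_BATCHES_PER_PROJECT = 10;
const BATCH_SIZE = 100;

async function sweep(
  projects: string[],
  fetchExpired: (projectId: string, limit: number) => Promise<string[]>,
  deleteBatch: (ids: string[]) => Promise<void>,
  budgetMs: number,
  now: () => number = Date.now,
): Promise<Map<string, number>> {
  const deleted = new Map<string, number>();
  const start = now();
  for (const projectId of projects) {
    let batches = 0;
    while (batches < MAX_BATCHES_PER_PROJECT && now() - start <= budgetMs) {
      const ids = await fetchExpired(projectId, BATCH_SIZE);
      if (ids.length === 0) break;
      await deleteBatch(ids);
      deleted.set(projectId, (deleted.get(projectId) ?? 0) + ids.length);
      batches++;
      // Fewer than a full batch means this project is drained.
      if (ids.length < BATCH_SIZE) break;
    }
  }
  return deleted;
}
```

A capped project simply resumes in the next daily run, which is usually acceptable for retention; a persisted cursor is only needed if backlogs routinely exceed the cap.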
```ts
const expired = await db
  .select({ id: conversations.id })
  .from(conversations)
  .where(
    and(
      eq(conversations.projectId, project.id),
      lt(conversations.updatedAt, cutoff),
    ),
  )
  .limit(BATCH_SIZE);

if (expired.length === 0) break;

const ids = expired.map((r) => r.id);

// Batch delete: tags → messages → conversations (order respects FK constraints)
await db.batch([
  db
    .delete(conversationTags)
    .where(inArray(conversationTags.conversationId, ids)),
  db.delete(messages).where(inArray(messages.conversationId, ids)),
  db.delete(conversations).where(inArray(conversations.id, ids)),
]);
```
Re-check expiry when you delete.
Lines 68-77 select candidate ids, then Lines 84-89 delete by bare id. A concurrent append can bump conversations.updated_at after selection but before the batch runs, and this code will still delete that now-active conversation plus its messages/tags. Re-apply updated_at < cutoff in the delete subqueries, or make the selection+delete atomic, to close the TOCTOU gap.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/api/src/services/retention.ts` around lines 68 - 90, The deletion
currently has a TOCTOU gap: after selecting expired ids (expired / ids) a
concurrent update can resurrect a conversation before db.batch runs; update the
three deletes in db.batch (conversationTags, messages, conversations) to
re-apply the expiry predicate so only conversations still satisfying
lt(conversations.updatedAt, cutoff) are removed — e.g., restrict deletions to
ids that are both in ids and still present in a subquery/select from
conversations where eq(projectId, project.id) and lt(updatedAt, cutoff); modify
the where clauses for the conversationTags/messages deletes to use that same
filtered conversation-id set so tags/messages for resurrected conversations are
not deleted.
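The re-checked predicate can be illustrated with a toy in-memory model; the `deleteStillExpired` helper is hypothetical, and in the real code the equivalent is an `AND updated_at < cutoff` clause on each delete (or a subquery restricting the id set):

```typescript
// Toy model of the TOCTOU-safe delete: candidate ids are re-checked
// against the expiry predicate at delete time, so a conversation whose
// updatedAt was bumped after selection survives. The Map stands in for
// the conversations table.
type Conversation = { updatedAt: number };

function deleteStillExpired(
  store: Map<string, Conversation>,
  candidateIds: string[],
  cutoff: number,
): string[] {
  // Equivalent of: DELETE ... WHERE id IN (candidateIds) AND updated_at < cutoff
  const deleted: string[] = [];
  for (const id of candidateIds) {
    const row = store.get(id);
    if (row && row.updatedAt < cutoff) {
      store.delete(id);
      deleted.push(id);
    }
  }
  return deleted;
}
```

Tags and messages would need the same treatment: scope their deletes to the id set that still satisfies the predicate, not the bare candidate list.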
```tsx
function ColumnPickerTrigger({ show, onToggle }: ColumnPickerTriggerProps) {
  return (
    <div className="relative">
      <Button
        size="sm"
        variant="outline"
        type="button"
        onClick={onToggle}
        aria-expanded={show}
        aria-haspopup="menu"
      >
        Columns
      </Button>
    </div>
```
Associate the trigger with the popup for better a11y (and drop redundant wrapper).
`aria-expanded` is present, but the button isn't linked to the controlled popup via `aria-controls`. Also, the `relative` wrapper has no effect in this component.
Suggested patch

```diff
 interface ColumnPickerTriggerProps {
   show: boolean;
   onToggle: () => void;
+  controlsId?: string;
 }
@@
-function ColumnPickerTrigger({ show, onToggle }: ColumnPickerTriggerProps) {
+function ColumnPickerTrigger({ show, onToggle, controlsId }: ColumnPickerTriggerProps) {
   return (
-    <div className="relative">
-      <Button
-        size="sm"
-        variant="outline"
-        type="button"
-        onClick={onToggle}
-        aria-expanded={show}
-        aria-haspopup="menu"
-      >
-        Columns
-      </Button>
-    </div>
+    <Button
+      size="sm"
+      variant="outline"
+      type="button"
+      onClick={onToggle}
+      aria-expanded={show}
+      aria-haspopup="menu"
+      aria-controls={controlsId}
+    >
+      Columns
+    </Button>
   );
 }
```

📝 Committable suggestion
```tsx
interface ColumnPickerTriggerProps {
  show: boolean;
  onToggle: () => void;
  controlsId?: string;
}

function ColumnPickerTrigger({ show, onToggle, controlsId }: ColumnPickerTriggerProps) {
  return (
    <Button
      size="sm"
      variant="outline"
      type="button"
      onClick={onToggle}
      aria-expanded={show}
      aria-haspopup="menu"
      aria-controls={controlsId}
    >
      Columns
    </Button>
  );
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/dashboard/src/app/dashboard/project/data-tab/conversations-header.tsx`
around lines 35 - 48, ColumnPickerTrigger should link the button to the popup
via aria-controls and drop the unnecessary wrapper: update the
ColumnPickerTrigger signature to accept a unique popup id prop (e.g., controlsId
or menuId) in addition to show and onToggle, remove the surrounding <div
className="relative">, and set aria-controls={controlsId} on the Button while
keeping aria-expanded={show}, aria-haspopup="menu" and the existing handlers so
the button is programmatically associated with the controlled popup.
```ts
export interface SemanticSearchRequest {
  query: string;
  limit?: number;
  filter?: {
    project_id?: string;
    date_from?: number;
    date_to?: number;
    tags?: string[];
  };
}
```
Filter fields declared but not implemented.
`date_from`, `date_to`, and `tags` filters are declared in the type but the implementation in `semantic.ts` doesn't use them. Consider removing unimplemented fields or marking them with JSDoc as "reserved for future use" to set proper expectations for API consumers.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/shared/src/index.ts` around lines 305 - 314, The
SemanticSearchRequest type declares filter fields (date_from, date_to, tags)
that aren’t used by the semantic search implementation; either remove those
fields from the interface or explicitly mark them as reserved/unused with JSDoc
so consumers know they’re not implemented. Update the SemanticSearchRequest
declaration in shared/src/index.ts (and any usages) to either drop
date_from/date_to/tags or add JSDoc comments like "@reserved for future use —
not used by semantic.ts" and keep semantic.ts’s search functions unchanged;
ensure TypeScript and tests compile after the change.
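If the fields are kept as reserved, one possible pattern pairs JSDoc annotations with a small runtime check so handlers can warn instead of silently dropping input. Everything here is an illustrative sketch (the `SemanticSearchFilter` name and `unsupportedFilterKeys` helper are not from the repo):

```typescript
/** Hypothetical annotated version of the shared filter type. */
interface SemanticSearchFilter {
  project_id?: string;
  /** Reserved for future use; not yet applied by the semantic search route. */
  date_from?: number;
  /** Reserved for future use; not yet applied by the semantic search route. */
  date_to?: number;
  /** Reserved for future use; not yet applied by the semantic search route. */
  tags?: string[];
}

// Runtime companion: list any reserved filter keys a caller supplied,
// so the handler can log a warning instead of silently ignoring them.
function unsupportedFilterKeys(filter?: SemanticSearchFilter): string[] {
  if (!filter) return [];
  const reserved: (keyof SemanticSearchFilter)[] = ["date_from", "date_to", "tags"];
  return reserved.filter((k) => filter[k] !== undefined);
}
```

The JSDoc keeps the API contract honest for consumers reading the types, while the runtime check makes accidental reliance on unimplemented filters observable in logs.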
```ts
export interface ContextRetrievalRequest {
  query: string;
  max_tokens?: number;
  project_id: string;
}
```
`project_id` declared as required but ignored by implementation.
As noted in `packages/api/src/routes/v2/conversations/semantic.ts`, the implementation uses the authenticated project context rather than this request field. Either make `project_id` optional or document that it's ignored when authentication already scopes to a project.
🔧 Suggested fix to align with implementation

```diff
 export interface ContextRetrievalRequest {
   query: string;
   max_tokens?: number;
-  project_id: string;
+  project_id?: string; // Optional: defaults to authenticated project context
 }
```

📝 Committable suggestion
```ts
export interface ContextRetrievalRequest {
  query: string;
  max_tokens?: number;
  project_id?: string; // Optional: defaults to authenticated project context
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/shared/src/index.ts` around lines 334 - 338, The
ContextRetrievalRequest interface declares project_id as required but the
runtime in packages/api/src/routes/v2/conversations/semantic.ts ignores this
field and uses the authenticated project's context; update the interface
(ContextRetrievalRequest) to make project_id optional (project_id?: string) or
add a clear JSDoc comment on the ContextRetrievalRequest type stating that
project_id is ignored when requests are authenticated and the project is derived
from the auth context so callers aren’t misled; ensure any related type usages
respect the optionality.
Summary

- Token cost tracking: per-message fields (`model`, `input_tokens`, `output_tokens`, `cost_microdollars`) on messages, aggregated totals on conversations, V2 API support, analytics breakdowns
- Data retention: project `retention_days` setting with scheduled cleanup via a Cloudflare Cron Trigger

Changes (50 files, +1438/-427)

Schema & Migrations

- `0005_token_cost_tracking.sql` — message token/cost fields, conversation aggregates
- `0006_retention_days.sql` — project `retention_days` column

API

- `services/bulk.ts`

Dashboard

Infra

Test plan

- `bunx biome check packages/api/src/` — lint clean
- `bunx tsc --noEmit -p packages/api/tsconfig.json` — types clean
- `cd packages/api && bunx vitest run` — all tests pass
- `cd packages/dashboard && bun run build` — dashboard builds

🤖 Generated with Claude Code
Co-Authored-By: Duyet Le me@duyet.net
Co-Authored-By: duyetbot bot@duyet.net
Co-Authored-By: Paperclip noreply@paperclip.ing
Summary by CodeRabbit
New Features
Tests