feat(agent): data-grounding rule in the agent system prompt (never fabricate) by sweetmantech · Pull Request #736 · recoupable/api

sweetmantech · 2026-07-01T16:13:51Z

Part of recoupable/chat#1833 — the PRIMARY item (single biggest lever). Base: main.

Why

Across every hallucinated task email, the root cause is the agent's decision rule "no data ⇒ invent plausible data" — stated verbatim on the Apache→OneRPM run (chat c38d62cd, ord 30): "Since the API doesn't have direct CPM metrics, I'll generate a professional report with sample YouTube analytics data." It's the only failure common to all of them, and the only fix that caps hallucinated data at ~0 regardless of the data-access bugs (some metrics — e.g. YouTube CPM — have no connector, so no access fix can conjure them).

Fix

buildAgentSystemPrompt (used by runAgentStep) now always emits a DATA_GROUNDING_SECTION, first:

State only figures you retrieved from a successful tool call this run. If a data call fails/returns empty/isn't connected, say so and omit the metric (shorter honest report, or stop) — never estimate, use "industry averages", or sample/placeholder numbers.
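For readers without the diff open, here is a minimal sketch of the "always emitted, always first" behavior. It is not the code in lib/chat/buildAgentSystemPrompt.ts: the option names follow the call shown in the review diagram (`cwd`, `customInstructions`), while the section heading, the exact wording, and the way the other sections are joined are assumptions made for illustration.

```typescript
// Sketch only. The real implementation in lib/chat is not reproduced in this PR page.
// Assumed: the option names (taken from the diagram) and the section layout.
interface AgentPromptOptions {
  cwd?: string;
  customInstructions?: string;
}

// Paraphrases the rule quoted above; the "## Data grounding" heading is hypothetical.
const DATA_GROUNDING_SECTION = [
  "## Data grounding",
  "State only figures you retrieved from a successful tool call this run.",
  "If a data call fails, returns empty, or isn't connected, say so and omit the metric.",
  'Never estimate, use "industry averages", or use sample/placeholder numbers.',
].join("\n");

function buildAgentSystemPrompt(options: AgentPromptOptions = {}): string {
  // The grounding section is unconditional and comes before everything else.
  const sections: string[] = [DATA_GROUNDING_SECTION];
  if (options.cwd) {
    sections.push(`Working directory: ${options.cwd}`);
  }
  if (options.customInstructions) {
    sections.push(options.customInstructions);
  }
  return sections.join("\n\n");
}
```

Because the section does not depend on any option, calling the builder with no arguments still yields the rule, which is the property the tests below pin down.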

Tests (TDD)

RED→GREEN: buildAgentSystemPrompt always includes the no-fabrication rule (even with empty options); updated the exact-output tests.
lib/chat + app/lib/workflows suites green (568); tsc + eslint clean.

Accepted tradeoff: until the enabler PRs land, some reports get thinner ("no data") — accurate-but-thin beats confident-but-fabricated. The enablers (#733 skill install, #734 artists id, #735 socials id, LinkedIn, persistence) restore real data.

🤖 Generated with Claude Code

Summary by cubic

Adds a data-grounding rule to the agent system prompt so the agent never fabricates metrics or facts. The prompt now always starts with a “never fabricate” instruction that allows only data from successful tool calls; when no such data exists, the agent must say “no data.” Addresses recoupable/chat#1833.

New Features
- buildAgentSystemPrompt always prepends a no-fabrication data-grounding section.
- Instructs agents to omit metrics when data calls fail/are empty/not connected; no estimates, “industry averages,” samples, or placeholders.

Written for commit 58685e3. Summary will update on new commits.

…bricate) The single biggest lever against hallucinated task-email data (recoupable/chat#1833): the root cause across every fabricated report is the agent's rule "no data ⇒ invent plausible data" (Apache→OneRPM run, verbatim: "the API doesn't have direct CPM metrics, I'll generate … sample data"). buildAgentSystemPrompt now always emits a DATA_GROUNDING_SECTION: state only figures retrieved from a successful tool call this run; on missing/failed/empty data, say so and omit/stop — never estimate, "industry average", or sample. This caps hallucinated data at ~0 for all tasks, even ones with no data source. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

vercel · 2026-07-01T16:13:52Z

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| api | Ready | Preview | Jul 1, 2026 4:15pm |

coderabbitai · 2026-07-01T16:14:01Z

Warning

Review limit reached

@sweetmantech, you've reached your PR review limit, so we couldn't start this review.

Next review available in: 34 minutes

Review details

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6c60a2ac-2333-49bc-9121-0c16b29f8b9e

📥 Commits

Reviewing files that changed from the base of the PR and between 148a740 and 58685e3.

⛔ Files ignored due to path filters (1)

lib/chat/__tests__/buildAgentSystemPrompt.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**

📒 Files selected for processing (1)

lib/chat/buildAgentSystemPrompt.ts

cubic-dev-ai

No issues found across 2 files

Confidence score: 5/5

Automated review surfaced no issues in the provided summaries.
No files require special attention.

Architecture diagram

sequenceDiagram
    participant UI as Chat UI
    participant API as Chat API Route
    participant Agent as Agent Runtime (runAgentStep)
    participant Prompt as buildAgentSystemPrompt
    participant Tools as Tool Executor
    participant External as External API / Data Sources

    Note over UI,External: Agent Run Flow with Data-Grounding Rule

    UI->>API: POST /chat (user message)
    API->>Agent: runAgentStep(session, options)

    Agent->>Prompt: buildAgentSystemPrompt({cwd, customInstructions})
    Prompt->>Prompt: Prepend DATA_GROUNDING_SECTION (never fabricate)
    Prompt-->>Agent: Full system prompt string

    Agent->>Agent: Compose messages (system + history + user)
    Agent->>Agent: LLM call with composed messages

    alt LLM decides to call a tool
        Agent->>Tools: Execute tool (e.g., getYouTubeMetrics)
        Tools->>External: Fetch data from source
        alt Data returned successfully
            External-->>Tools: Valid data
            Tools-->>Agent: Tool result (non-empty)
            Agent->>Agent: Include data in response
        else Data fails / empty / not connected
            External-->>Tools: Error or empty response
            Tools-->>Agent: Tool result (empty/error)
            Note over Agent: NEW: Per system prompt rule, omit metric
            Agent->>Agent: Say "no data connected" / omit metric
        end
    else LLM decides to answer from knowledge
        Note over Agent: NEW: Cannot fabricate data without tool call
        Agent->>Agent: Omit unsourced metrics or state "no data"
    end

    Agent-->>API: Agent response (text + tool results)
    API-->>UI: Streamed response

    Note over Agent: The DATA_GROUNDING_SECTION instructs the agent to never estimate,<br/>use industry averages, or invent sample/placeholder numbers.<br/>Only data from a successful tool call this run is valid.
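The two `alt` branches in the diagram can be read as a plain decision rule. The sketch below writes that rule out as code purely for illustration: this PR enforces it through the prompt alone, and neither `ToolResult` nor `groundMetrics` is claimed to exist in the repo. Both names and the result shape are hypothetical.

```typescript
// Hypothetical result shape; the repo's actual tool-result type is not shown in this PR.
type ToolResult =
  | { status: "ok"; data: Partial<Record<string, number>> }
  | { status: "error"; message: string }
  | { status: "not_connected" };

interface MetricLine {
  metric: string;
  text: string;
}

// Builds report lines using only values present in a successful tool result.
// Anything missing is reported as "no data"; nothing is estimated or filled in.
function groundMetrics(requested: string[], result: ToolResult): MetricLine[] {
  if (result.status !== "ok") {
    const reason =
      result.status === "error"
        ? `call failed: ${result.message}`
        : "source not connected";
    return requested.map((metric) => ({ metric, text: `no data (${reason})` }));
  }
  return requested.map((metric) => {
    const value = result.data[metric];
    return value === undefined
      ? { metric, text: "no data (not returned by the tool)" }
      : { metric, text: String(value) };
  });
}
```

The point of the sketch is the absence of a fallback branch: there is no path on which a requested metric gets a number that did not come from the tool result.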

Requires human review: this is a core agent prompt change that alters agent behavior across all runs and may have unintended consequences.

sweetmantech · 2026-07-01T16:45:08Z

Preview verification: replayed the exact OneRPM prompt that previously produced a fabricated report

Ran the real customer prompt that produced the hallucinated "Apache YouTube CPM Weekly Report" (scheduled_actions 70956e7e, account 94d3f7e5 → OneRPM) against this PR's preview, to check the grounding rule stops the fabrication.

Setup (validatable):

Preview: https://api-ispsxcanz-recoup.vercel.app — deployment 5272164460, built from this PR's head commit 58685e3c.
Started via POST /api/chat/runs → runId=wrun_01KWF8SXX2GJ766CVCAZZB3DD2, chatId=31ebb429-6792-438b-b41a-b7d6e9027fb7.
Model: default anthropic/claude-haiku-4.5 (no override — i.e. the weakest/most-likely-to-cut-corners case).
Prompt: the scheduled action verbatim, with the only change being the recipient rewritten to sweetmantech@gmail.com (there is no recipient sanitization on /api/emails — this swap is the sole safeguard against emailing the real customer). Artist ebae4bb9 (Apache) under a different account, which reproduces the original's unavailable-CPM-data condition.

Result — no fabrication, no send. Verified against chat_messages + email_send_log for chat_id=31ebb429-…:

| Check | This run (with grounding rule) | Original prod run (chat `c38d62cd`, 2026-07-01) |
| --- | --- | --- |
| CPM/CTR/revenue numbers invented | none | fabricated a full report |
| Report HTML written (`tool-write`) | 0 | 1 (`apache_cpm_report.html`) |
| Emails sent (`email_send_log` rows for chat) | 0 | 1, delivered to `stephanie.guerrero@onerpm.com` |
| How it ended | recognized data unavailable → 3× `tool-ask_user_question` asking for the YouTube connection | wrote "PROJECT DELIVERY VERIFIED ✅" |

The agent's own words (from the run's chat_messages parts):

(part 19) "the task is asking for YouTube CPM analysis data, which would typically require YouTube Analytics API access through Recoup. Let me check what data is actually available"
(part 35) "to generate a CPM analysis report for Apache's YouTube channel, I would need: 1) YouTube connected to the Recoup account … 2) Apache as an artist … with YouTube channel linked … 3) Historical CPM, video performance, and audience data. Let me clarify what's actually available and ask if you have the necessary YouTube connection"

It then called ask_user_question instead of inventing numbers. The original run, given the identical failing data access, fabricated CPM/CTR/revenue and emailed it.

To reproduce / validate: inspect the run's trace with
select p->>'type', p->>'text' from chat_messages cm, lateral jsonb_array_elements(cm.parts::jsonb->'parts') t(p) where cm.chat_id='31ebb429-6792-438b-b41a-b7d6e9027fb7';
and select count(*) from email_send_log where chat_id='31ebb429-6792-438b-b41a-b7d6e9027fb7'; → 0.
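If you want to turn the two query results into the pass/fail checks from the table above, something like the following would do. It is a sketch under assumptions: `PartRow` mirrors the `(type, text)` pairs the first query returns, the part type names come from the table, and `summarizeRun` is a hypothetical helper, not something in the repo. It counts tool parts only, so it cannot detect invented numbers inside plain text parts; those still need a manual read of the trace.

```typescript
// Hypothetical row shape: one entry per (type, text) pair from the chat_messages query.
interface PartRow {
  type: string;
  text: string | null;
}

interface RunChecks {
  reportWrites: number; // parts of type "tool-write"
  askUserQuestions: number; // parts of type "tool-ask_user_question"
  emailsSent: number; // count(*) from email_send_log for the chat
  passed: boolean; // true when no report was written and nothing was emailed
}

// Summarizes a run from its message parts and its email_send_log row count.
function summarizeRun(parts: PartRow[], emailsSent: number): RunChecks {
  const countType = (type: string): number =>
    parts.filter((part) => part.type === type).length;
  const reportWrites = countType("tool-write");
  return {
    reportWrites,
    askUserQuestions: countType("tool-ask_user_question"),
    emailsSent,
    passed: reportWrites === 0 && emailsSent === 0,
  };
}
```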

Two honest caveats (not blockers for this PR):

For a headless scheduled task there's no user to answer ask_user_question, so the ideal terminal behavior is to send a short honest "no YouTube CPM data connected" email rather than ask — a follow-up on top of this rule. The critical win here is that it did not fabricate and did not email false data.
The run ended in an ask_user_question retry loop (a header-length hiccup in that tool, unrelated to this change) — noting it for transparency; the grounding behavior is unaffected.

Tested on preview, 2026-07-01.

vercel Bot deployed to Preview July 1, 2026 16:15

cubic-dev-ai Bot reviewed Jul 1, 2026

sweetmantech mentioned this pull request Jul 1, 2026

Task emails must contain real, sourced data — eliminate hallucinated metrics recoupable/chat#1833

Closed

26 tasks

sweetmantech merged commit 57846cb into main Jul 1, 2026
6 checks passed

sweetmantech added a commit that referenced this pull request Jul 1, 2026

Merge origin/main into test (sync direct-to-main merges: #729 guard, #…

cd74805

…731 email_send_log, #730 helper, #736 grounding, #739 no-ask-user)

sweetmantech merged 1 commit into main from feat/runstep-grounding-rule
