fix(orchestrator): prefer live integrations over memory_tree for inbox/doc queries by senamakel · Pull Request #1731 · tinyhumansai/openhuman

senamakel · 2026-05-14T12:01:57Z

Summary

The orchestrator was answering "any new emails?" / "summarise the notion doc I edited" from the retrospective memory_tree index instead of calling the live integration. Users had to explicitly say "search my gmail" to force the right behaviour.
Reworked the orchestrator prompt to disambiguate delegate_to_integrations_agent (live SaaS) from memory_tree (historical ingest), with an explicit decision-tree step that catches inbox/doc/calendar/chat phrasings before memory tools are considered.
No tool-surface changes — prompt-only fix, plus a test update for the new decision-tree wording.

Problem

Users complained they had to spell out "search my gmail" or "check my inbox" to get the orchestrator to actually hit the Gmail integration. Otherwise it would happily answer from memory_tree — which is a stale, retrospective summary of already-ingested content, not a live view of the inbox.

Root cause in the prompt:

The "Delegation Decision Tree" listed memory_* under "direct tools" in Step 2, before the "specialised execution" step that mentions delegate_to_integrations_agent. Combined with the bias toward direct tools, the model preferred memory.
The "Memory tree retrieval" section described memory_tree as a tool to "query the user's ingested email/chat/document history" with a mode literally documented as "Use for 'in my email last week…' intents" — exactly the phrasing users were tripping on.
The Connected Integrations and Response Style example sections never gave a concrete live-integration delegation pattern to imitate.

Solution

src/openhuman/agent/agents/orchestrator/prompt.md:

Inserted a new Step 2 in the decision tree that explicitly routes any request naming or implying a connected external service (email/inbox/gmail/calendar/notion/drive/slack/whatsapp/telegram/linear/etc.) to delegate_to_integrations_agent before memory or other direct tools are considered. Explicit guidance: "Do this even if memory_tree could plausibly answer."
If the relevant toolkit is not in Connected Integrations, the orchestrator must point the user at Settings → Connections → [Service] rather than silently falling back to memory_tree.
Reframed the Memory tree retrieval section as historical-only. It is now described as a retrospective index over already-ingested content, not a live API. Listed the phrasings that genuinely warrant it ("what did we discuss last month", "summarise my recent activity"). Removed misleading language about "in my email last week…".
Added a concrete Gmail example to the Response Style section so the model has a delegation pattern to imitate, and annotated the existing Notion example as a live delegation.
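
As a rough illustration of the ordering the new Step 2 asks the model to follow, the routing can be sketched in Rust. This is not code from the PR (the change is prompt-only); the `Route` enum, the `SERVICE_HINTS` table, and the `route` function are hypothetical names, and plain keyword matching is only a stand-in for the model's judgement.

```rust
/// Where a request should go (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum Route {
    /// Service implied and connected: call delegate_to_integrations_agent.
    DelegateToIntegrations { toolkit: String },
    /// Service implied but not connected: point the user at
    /// Settings → Connections → [Service] instead of using memory_tree.
    PromptToConnect { service: String },
    /// No external service implied: the later decision-tree steps apply.
    ContinueDecisionTree,
}

/// Assumed phrase-to-toolkit hints; the real prompt lists services in prose.
const SERVICE_HINTS: &[(&str, &str)] = &[
    ("email", "gmail"),
    ("inbox", "gmail"),
    ("gmail", "gmail"),
    ("notion", "notion"),
    ("slack", "slack"),
];

/// Checks for a live-service hint before any memory tool is considered.
fn route(request: &str, connected: &[&str]) -> Route {
    let lower = request.to_lowercase();
    for (hint, toolkit) in SERVICE_HINTS {
        if lower.contains(*hint) {
            return if connected.contains(toolkit) {
                Route::DelegateToIntegrations { toolkit: (*toolkit).to_string() }
            } else {
                Route::PromptToConnect { service: (*toolkit).to_string() }
            };
        }
    }
    Route::ContinueDecisionTree
}
```

With `connected = ["gmail"]`, "any new emails?" yields `DelegateToIntegrations`, while "what did we discuss last month" falls through to the remaining steps, where memory_tree is still available.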

src/openhuman/agent/agents/orchestrator/prompt.rs:

Updated build_includes_direct_first_decision_tree to assert the new decision-tree wording (live-integration step + the "do this even if memory_tree could plausibly answer" guidance) so a regression that removed Step 2 would fail the build.
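
A wording guard of this kind can be sketched as follows. The body of the real test is not shown in this PR page, so `render_prompt_body` is a hypothetical stand-in for the rendered prompt and `first_missing_phrase` is an illustrative helper, not part of prompt.rs.

```rust
/// Hypothetical stand-in for the rendered orchestrator prompt. The real
/// text lives in prompt.md and is not reproduced here.
fn render_prompt_body() -> String {
    [
        "## Delegation Decision Tree",
        "2. Does the request name or imply a connected external service? \
         Call delegate_to_integrations_agent. \
         Do this even if memory_tree could plausibly answer.",
        "If the toolkit is not connected, point the user at \
         Settings → Connections → [Service].",
    ]
    .join("\n")
}

/// Returns the first required phrase that is missing from `body`, if any.
fn first_missing_phrase<'a>(body: &str, required: &[&'a str]) -> Option<&'a str> {
    required
        .iter()
        .copied()
        .find(|phrase| !body.contains(*phrase))
}
```

A test would then assert that `first_missing_phrase` returns `None` for the phrases it wants to lock in, so removing Step 2 from the prompt surfaces as a named missing phrase rather than a bare failure.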

Submission Checklist

Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
N/A: prompt-only change with no executable diff coverage signal; the existing orchestrator::prompt::tests suite was extended to assert the new decision-tree wording.
N/A: behaviour-only change — no new/removed/renamed feature rows in docs/TEST-COVERAGE-MATRIX.md.
N/A: no matrix feature IDs involved.
No new external network dependencies introduced (mock backend used per Testing Strategy)
N/A: no release-cut surfaces touched (orchestrator prompt only).
N/A: no linked issue — user-reported behaviour, fixed directly.

Impact

Runtime: desktop only (orchestrator agent prompt). No platform-specific code paths touched.
Behaviour: users asking about inbox / docs / calendar / chat state in natural language ("any new emails from alice today?", "summarise the latest notion doc") will now get a live integration call instead of a stale memory summary.
Compatibility: no schema or tool-surface change. memory_tree still available for retrospective queries.
Risk: prompt regressions are caught by the updated unit tests in prompt.rs. Worst-case behaviour mismatch is recoverable via a follow-up prompt tweak — no migrations or persisted state involved.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Key: N/A
URL: N/A

Commit & Branch

Branch: fix/orchestrator-prefer-live-integrations
Commit SHA: 74c2b0d

Validation Run

pnpm --filter openhuman-app format:check — N/A: no app/ files touched.
pnpm typecheck — N/A: no TS files touched.
Focused tests: cargo test --manifest-path Cargo.toml --lib orchestrator::prompt — 7 passed, 0 failed.
Rust fmt/check (if changed): orchestrator prompt module compiles via the focused test run above.
Tauri fmt/check (if changed): N/A — no app/src-tauri changes.

Validation Blocked

command: N/A
error: N/A
impact: N/A

Behavior Changes

Intended behavior change: orchestrator routes live inbox/doc/calendar/chat requests to delegate_to_integrations_agent instead of memory_tree.
User-visible effect: "any new emails?" hits Gmail; "what did we discuss last month" still hits memory.

Parity Contract

Legacy behavior preserved: memory_tree still callable for retrospective queries; all existing modes (search_entities, query_topic, query_source, query_global, drill_down, fetch_leaves) unchanged.
Guard/fallback/dispatch parity checks: orchestrator tool registry (agent.toml) untouched; agents::loader invariant "orchestrator must have memory_tree" still holds.
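
For readers unfamiliar with the loader invariant mentioned above, a check of that shape might look like the sketch below. The function name and signature are assumptions; only the rule itself ("orchestrator must have memory_tree") comes from the PR text.

```rust
/// Hypothetical rendering of the loader invariant: the orchestrator's
/// tool list must still include memory_tree.
fn check_orchestrator_tools(agent: &str, tools: &[&str]) -> Result<(), String> {
    if agent == "orchestrator" && !tools.contains(&"memory_tree") {
        return Err("orchestrator must have memory_tree".to_string());
    }
    Ok(())
}
```

Because this PR leaves agent.toml untouched, a check like this keeps passing: the prompt changes when memory_tree is preferred, not whether it is registered.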

Duplicate / Superseded PR Handling

Duplicate PR(s): N/A
Canonical PR: N/A
Resolution: N/A

Summary by CodeRabbit

New Features
- Improved orchestrator agent routing to prioritize live external services and explicit integration delegation; examples and guidance updated.
Chores
- Bumped core dependency versions and adjusted a dependency feature for compatibility.
- Mock API delay handling now enforces a configurable maximum cap to prevent excessive simulated delays.

…dates

Bumps the npm_and_yarn group with 1 update in the / directory: [basic-ftp](https://github.com/patrickjuchli/basic-ftp).

Updates `basic-ftp` from 5.3.0 to 5.3.1
- [Release notes](https://github.com/patrickjuchli/basic-ftp/releases)
- [Changelog](https://github.com/patrickjuchli/basic-ftp/blob/master/CHANGELOG.md)
- [Commits](patrickjuchli/basic-ftp@v5.3.0...v5.3.1)

Updates `fast-xml-builder` from 1.1.5 to 1.2.0
- [Changelog](https://github.com/NaturalIntelligence/fast-xml-builder/blob/main/CHANGELOG.md)
- [Commits](NaturalIntelligence/fast-xml-builder@v1.1.5...v1.2.0)

Updates `ip-address` from 10.1.0 to 10.2.0
- [Commits](https://github.com/beaugunderson/ip-address/commits)

---
updated-dependencies:
- dependency-name: basic-ftp
  dependency-version: 5.3.1
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: fast-xml-builder
  dependency-version: 1.2.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: ip-address
  dependency-version: 10.2.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>

Bumps the cargo group with 2 updates in the / directory: [openssl](https://github.com/rust-openssl/rust-openssl) and [rustls-webpki](https://github.com/rustls/webpki).

Updates `openssl` from 0.10.77 to 0.10.79
- [Release notes](https://github.com/rust-openssl/rust-openssl/releases)
- [Commits](rust-openssl/rust-openssl@openssl-v0.10.77...openssl-v0.10.79)

Updates `rustls-webpki` from 0.103.12 to 0.103.13
- [Release notes](https://github.com/rustls/webpki/releases)
- [Commits](rustls/webpki@v/0.103.12...v/0.103.13)

---
updated-dependencies:
- dependency-name: openssl
  dependency-version: 0.10.79
  dependency-type: indirect
  dependency-group: cargo
- dependency-name: rustls-webpki
  dependency-version: 0.103.13
  dependency-type: indirect
  dependency-group: cargo
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

…tofix-5 Potential fix for code scanning alert no. 5: Resource exhaustion

…arn/npm_and_yarn-9401a92e25 build(deps): bump the npm_and_yarn group across 1 directory with 3 updates

…go-83c3bdb6f7 build(deps): bump the cargo group across 1 directory with 2 updates

…ck delay cap

Addresses CI failures and reviewer feedback on PR tinyhumansai#1462:

- fix(rand): update rand API calls to 0.10 compat
  - src/core/auth.rs: RngCore::fill_bytes → RngExt::fill
  - src/openhuman/security/pairing.rs: same
  - src/openhuman/memory/tree/tree_source/registry.rs: thread_rng().gen() → rand::random()
  - src/openhuman/tools/impl/computer/human_path.rs: Rng → RngExt bound, fix float type ambiguity
- fix(sentry): remove "test" feature from production sentry dep (Cargo.toml:113)
- fix(mock-api): add MAX_MOCK_DELAY_MS=30_000 cap in getDelayMs() (scripts/mock-api/state.mjs)
- fix(mock-api): add re-export shim at scripts/mock-api-core.mjs for backward compat
- chore: resolve merge conflicts in Cargo.lock, pnpm-lock.yaml, app/src-tauri/Cargo.lock

The upstream repo has CodeQL default setup enabled, which rejects SARIF uploads from advanced configurations with: "CodeQL analyses from advanced configurations cannot be processed when the default setup is enabled". Removing the workflow restores green CI; default setup already provides the equivalent scanning at the repo level.

Per CodeRabbit suggestion + CLAUDE.md "Debug logging" policy: emit entry/exit `log::trace!` markers around the token generation flow with a stable `[auth]` prefix. No secret material is logged — only counts.

…x/doc queries

Users were having to explicitly tell the model to "search my gmail" because the orchestrator was answering inbox/doc/calendar queries from the retrospective memory_tree index instead of delegating to the live integration. The prompt had two overlapping sections (Connected Integrations + Memory tree retrieval) and the decision tree didn't disambiguate them.

- Insert a new Step 2 in the decision tree that routes any request naming or implying a connected service (inbox/gmail/calendar/notion/drive/slack/etc.) to delegate_to_integrations_agent BEFORE memory or other direct tools are considered. Explicit "do this even if memory_tree could plausibly answer."
- Reframe the memory_tree section as historical-only: it's a retrospective index over ingested content, not a live API. Listed the phrasings that actually warrant it ("what did we discuss last month", "summarise recent activity").
- Added a gmail example to Response Style so the model has a concrete delegate_to_integrations_agent pattern to imitate, and noted on the existing notion example that it's a live delegation.
- Updated the decision-tree test in prompt.rs to assert the new wording.

coderabbitai · 2026-05-14T12:02:10Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 840ab314-ab83-44f0-b3c6-5890cd09e8a7

📥 Commits

Reviewing files that changed from the base of the PR and between 74c2b0d and 6a27575.

📒 Files selected for processing (1)

src/openhuman/agent/agents/orchestrator/prompt.md

📝 Walkthrough


Upgrade rand to 0.10 and modernize RNG APIs; add a 30s cap to mock API delays; update orchestrator prompt and tests to delegate live external-service requests to the integrations agent instead of memory_tree.

Changes

Rand 0.10 Upgrade & RNG API Modernization

| Layer / File(s) | Summary |
| --- | --- |
| Dependency version and feature updates `Cargo.toml` | `rand` bumped from 0.9 to 0.10; removed `test` feature from main `sentry` dependency. |
| Security token generation modernization `src/core/auth.rs`, `src/openhuman/security/pairing.rs` | Token generators switch from `RngCore::fill_bytes()` to `RngExt::fill()`; `auth.rs` adds trace-level logging around token generation. |
| RNG trait updates in utility and sampling modules `src/openhuman/memory/tree/tree_source/registry.rs`, `src/openhuman/tools/impl/computer/human_path.rs` | `new_summary_id` uses `rand::random()`; `human_path()`, `dwell_ms()`, and `sample_normal()` now require `RngExt` instead of `Rng`. |

Mock API Delay Clamping

| Layer / File(s) | Summary |
| --- | --- |
| Delay maximum and validation logic `scripts/mock-api/state.mjs` | Added `MAX_MOCK_DELAY_MS = 30_000`; `getDelayMs()` validates numeric input and clamps positive finite delays to the maximum. |
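
The clamping rule described in this row can be illustrated with a short sketch. The actual change is JavaScript in `scripts/mock-api/state.mjs` and is not shown on this page; the version below is a Rust transliteration, `clamp_delay_ms` is a hypothetical name, and treating invalid input as a zero delay is an assumption.

```rust
/// Cap taken from the PR summary: MAX_MOCK_DELAY_MS = 30_000.
const MAX_MOCK_DELAY_MS: u64 = 30_000;

/// Hypothetical equivalent of the clamp in getDelayMs().
/// Non-finite or non-positive input is treated as "no delay" (assumed
/// fallback); positive finite input is capped at MAX_MOCK_DELAY_MS.
fn clamp_delay_ms(raw: f64) -> u64 {
    if !raw.is_finite() || raw <= 0.0 {
        return 0;
    }
    // Float-to-integer `as` casts saturate, so very large values are safe here.
    (raw as u64).min(MAX_MOCK_DELAY_MS)
}
```

The cap addresses the resource-exhaustion concern noted in the commit history: a caller can no longer request an arbitrarily long simulated delay.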

Orchestrator Agent Delegation Logic

| Layer / File(s) | Summary |
| --- | --- |
| Delegation decision tree and memory tree clarification `src/openhuman/agent/agents/orchestrator/prompt.md` | Decision tree now detects service-implied requests, looks up connected toolkits, and delegates to `delegate_to_integrations_agent`; `memory_tree` limited to retrospective/ingested data and not used as live source-of-truth. |
| Prompt test assertion updates `src/openhuman/agent/agents/orchestrator/prompt.rs` | Tests updated to match revised decision-tree wording and to assert routing of connected external-service requests to `delegate_to_integrations_agent`. |

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant Orchestrator
  participant IntegrationsAgent
  participant MemoryTree

  User->>Orchestrator: request implying live external service (e.g., "check my Gmail")
  Orchestrator->>Orchestrator: detect service-implied request
  Orchestrator->>IntegrationsAgent: lookup toolkit & delegate_to_integrations_agent
  IntegrationsAgent-->>Orchestrator: toolkit response / action result
  Orchestrator-->>User: return live-service result
  Note over Orchestrator,MemoryTree: MemoryTree is used only for retrospective/ingested context. If the live toolkit is missing, notify the user explicitly instead of silently falling back to it

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

tinyhumansai/openhuman#1488: Overlapping orchestrator delegation logic changes that earlier moved routing toward delegate_to_integrations_agent.
tinyhumansai/openhuman#1309: Related work touching human_path sampling logic that this PR updates from Rng to RngExt.

Poem

🐰 I nibble bytes and seeds anew,

rand refreshed, the traits accrue.
Delays trimmed short — thirty grand,
Agents now send calls by hand,
A rabbit cheers: delegation true!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and accurately summarizes the primary change: updating the orchestrator prompt to prefer delegating inbox/doc/calendar queries to live integrations instead of relying on the retrospective memory_tree index. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

src/openhuman/agent/agents/orchestrator/prompt.rs (1)
163-167: ⚡ Quick win

Add one assertion for the unconnected-integration branch.

These tests cover the live-routing branch well, but they don’t lock in the “Settings → Connections → [Service]” / “do not silently fall back” instruction. Adding a contains(...) assertion would guard the exact regression this PR is fixing.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/agents/orchestrator/prompt.rs` around lines 163 - 167,
The test is missing an assertion that locks in the unconnected-integration
branch text, so add an assertion alongside the existing ones that checks the
rendered prompt body for the “Settings → Connections → [Service]” / “do not
silently fall back” instruction; specifically, in the same test where `body` is
asserted to contain the live-routing lines (references: `body`,
`delegate_to_integrations_agent`, `memory_tree`), add something like
assert!(body.contains("Settings → Connections → [Service]")) or
assert!(body.contains("do not silently fall back")) to ensure the
unconnected-integration guidance is present.
src/openhuman/tools/impl/computer/human_path.rs (1)
122-122: ⚡ Quick win

Remove duplicated trait bound RngExt + RngExt at line 122.

The generic bound lists RngExt twice, which is redundant. Simplify to a single bound for clearer intent and to avoid lint noise.
Proposed fix
-fn sample_normal<R: RngExt + RngExt>(mean: f64, stddev: f64, rng: &mut R) -> f64 {
+fn sample_normal<R: RngExt>(mean: f64, stddev: f64, rng: &mut R) -> f64 {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/tools/impl/computer/human_path.rs` at line 122, The generic
bound on function sample_normal incorrectly repeats RngExt twice (R: RngExt +
RngExt); update the signature to use a single trait bound (e.g., R: RngExt) so
the function declaration reads with one RngExt bound while keeping the rng
parameter type (&mut R) unchanged; locate sample_normal and remove the duplicate
RngExt in the generic bounds.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/agent/agents/orchestrator/prompt.md`:
- Around line 117-124: The example in the orchestrator prompt uses em dashes
(e.g., in the inline response lines containing "— wants to grab food" and
similar) which contradicts the prompt rule banning em dashes; update the example
text used with the delegate_to_integrations_agent/toolkit: "gmail" snippet to
replace em dashes with allowed punctuation (e.g., a hyphen '-' or a colon ':')
and ensure the revised lines preserve meaning and tone while conforming to the
no-em-dash style rule.
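
A style rule like this one can also be checked mechanically. The helper below is a minimal sketch, not something this PR adds; `lines_with_em_dash` is a hypothetical name, and it would need to be pointed at the example snippets only if the prompt quotes the forbidden character elsewhere when stating the rule.

```rust
/// Returns (1-based line number, line text) for every line of `text`
/// that contains an em dash (U+2014).
fn lines_with_em_dash(text: &str) -> Vec<(usize, &str)> {
    text.lines()
        .enumerate()
        .filter(|(_, line)| line.contains('\u{2014}'))
        .map(|(index, line)| (index + 1, line))
        .collect()
}
```

Asserting that the result is empty for the example block would catch this class of review comment before it reaches a reviewer.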



ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 599d7ca9-39f7-4f17-9618-b4427595a684

📥 Commits

Reviewing files that changed from the base of the PR and between 23fbaec and 74c2b0d.

⛔ Files ignored due to path filters (2)

Cargo.lock is excluded by !**/*.lock
app/src-tauri/Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (8)

Cargo.toml
scripts/mock-api/state.mjs
src/core/auth.rs
src/openhuman/agent/agents/orchestrator/prompt.md
src/openhuman/agent/agents/orchestrator/prompt.rs
src/openhuman/memory/tree/tree_source/registry.rs
src/openhuman/security/pairing.rs
src/openhuman/tools/impl/computer/human_path.rs

Resolves a conflict in `orchestrator/prompt.md` where upstream's tinyhumansai#1731 added a new step-2 "live external service" branch and renumbered the decision tree (3 → 4). Keeps both upstream's integrations-first ordering and the crypto branch from this PR under the renumbered step 4 ("Does this need other specialised execution?").

rafaelfiguereod-stack and others added 11 commits May 10, 2026 14:39

Create codeql.yml

525cba2

Potential fix for code scanning alert no. 5: Resource exhaustion

4ae2e1b

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

Merge pull request tinyhumansai#3 from rafaelfiguereod-stack/alert-au…

5c5624f

…tofix-5 Potential fix for code scanning alert no. 5: Resource exhaustion

Merge pull request #1 from rafaelfiguereod-stack/dependabot/npm_and_y…

b3d61e4

…arn/npm_and_yarn-9401a92e25 build(deps): bump the npm_and_yarn group across 1 directory with 3 updates

Merge pull request #2 from rafaelfiguereod-stack/dependabot/cargo/car…

e8d1388

…go-83c3bdb6f7 build(deps): bump the cargo group across 1 directory with 2 updates

fix(auth): add trace diagnostics to generate_token

07c25fc

Per CodeRabbit suggestion + CLAUDE.md "Debug logging" policy: emit entry/exit `log::trace!` markers around the token generation flow with a stable `[auth]` prefix. No secret material is logged — only counts.

senamakel requested a review from a team May 14, 2026 12:01

coderabbitai Bot requested changes May 14, 2026


Comment thread src/openhuman/agent/agents/orchestrator/prompt.md Outdated

fix(orchestrator): drop em-dashes from prompt.md examples per coderabbit

6a27575

coderabbitai Bot approved these changes May 14, 2026


senamakel merged commit 4d5e0d5 into tinyhumansai:main May 14, 2026
26 of 27 checks passed

senamakel merged 12 commits into tinyhumansai:main from senamakel:fix/orchestrator-prefer-live-integrations
