Skip to content

feat(orchestrator): expose update_check + update_apply tools (#1435)#1473

Merged
senamakel merged 6 commits into
tinyhumansai:mainfrom
obchain:fix/1435-orchestrator-update-tools
May 12, 2026
Merged

feat(orchestrator): expose update_check + update_apply tools (#1435)#1473
senamakel merged 6 commits into
tinyhumansai:mainfrom
obchain:fix/1435-orchestrator-update-tools

Conversation

@obchain
Copy link
Copy Markdown
Contributor

@obchain obchain commented May 11, 2026

Summary

  • New orchestrator tools update_check and update_apply so the user can ask "am I up to date?" / "update OpenHuman" in chat instead of opening Settings → Developer Options.
  • Implementation is a thin wrapper over the existing openhuman.update_* RPC surface — no duplicate release URL, asset validation, or staging logic; staging stays in src/openhuman/update/.
  • update_apply is gated three ways: tool-level user_confirmed: true arg (LLM must explicitly call ask_user_clarification first), tool-level autonomy check (SecurityPolicy::can_act), and the existing config.update.rpc_mutations_enabled policy gate inside update_run.

Problem

src/openhuman/update/ already implements version, check, apply, and run as JSON-RPC controllers, and the frontend uses them through the Settings → Developer Options UI. The orchestrator's tool list (src/openhuman/agent/agents/orchestrator/agent.toml) carried no update tools, so the agent could not initiate a check or staged upgrade in chat even though every primitive existed at the core layer.

Solution

  • Added src/openhuman/tools/impl/system/update_check.rs — read-only, closed object schema, calls update::rpc::update_check. PermissionLevel::ReadOnly.
  • Added src/openhuman/tools/impl/system/update_apply.rsPermissionLevel::Dangerous, requires user_confirmed: true, runs the autonomy gate (SecurityPolicy::can_act + record_action), then delegates to update::rpc::update_run. The downstream RPC re-checks config.update.rpc_mutations_enabled and applies the configured restart_strategy.
  • Registered both in tools::ops::all_tools_with_runtime so they ship with the default registry.
  • Listed both in the orchestrator's [tools] named block (the only agent that needs them today).
  • Added a ("update", &["update_check", "update_apply"]) row to tools::user_filter so the onboarding tool toggles can opt them in/out.
  • Extended the update.check / update.apply capability entries in about_app::catalog to mention the in-chat path and the consent gate, so the capability surface stays the source of truth.

The update_apply tool description tells the LLM: confirm via ask_user_clarification, then call again with user_confirmed: true. That keeps the consent decision visible in the conversation transcript instead of buried in a tool argument.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/coverage.yml. 11 new tests cover both tools' name/permission/schema, the consent gate (missing arg, false arg), the autonomy gate (read-only mode), and the gate ordering (consent first).
  • Coverage matrix updated — N/A: behaviour-only change wiring an existing RPC as an additional tool surface.
  • All affected feature IDs from the matrix are listed in the PR description under ## Related
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md) — N/A: orchestrator-only change, Settings flow unchanged.
  • Linked issue closed via Closes #NNN in the ## Related section

Impact

  • Runtime: desktop only (the update domain is desktop-scoped). No new dependencies, no schema migration. Network surface unchanged — update_check does the same single GitHub Releases request that the existing RPC has always done.
  • Security: applying an update is high impact, so the tool stacks three independent gates (LLM-visible consent arg + autonomy policy + RPC mutation policy). Read-only sessions can never trigger a staged binary even if the LLM tries.
  • UX: the orchestrator can answer "any updates?" without delegating to Settings.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/1435-orchestrator-update-tools
  • Commit SHA: f1566ec

Summary by CodeRabbit

  • New Features

    • Agent can check for updates and apply them directly from the chat/orchestrator (apply requires explicit user confirmation and is marked high-impact).
    • Users can ask “am I up to date” and trigger updates without opening Settings.
    • Update controls respect user-enabled tool preferences.
  • Tests

    • Added tests validating tool metadata, permission behavior, consent gating, and execution outcomes.

Review Change Stack

Wires the existing `openhuman.update_*` RPC surface as orchestrator
tools so the user can ask "am I up to date" / "update OpenHuman" in
chat. Both tools delegate to `update::rpc::*` — no duplicate release
URL, asset validation, or staging logic. `update_apply` is gated on
explicit user consent (`user_confirmed: true`) and a tool-level
autonomy check before reaching the existing `rpc_mutations_enabled`
policy. Closes tinyhumansai#1435.
@obchain obchain requested a review from a team May 11, 2026 06:33
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 11, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d157bdeb-f73f-4252-a752-6725f4d53103

📥 Commits

Reviewing files that changed from the base of the PR and between abc6236 and 8d8c0c8.

📒 Files selected for processing (1)
  • src/openhuman/update/ops.rs

📝 Walkthrough

Walkthrough

Adds orchestrator-callable update tools: read-only update_check and dangerous update_apply requiring explicit user_confirmed consent and SecurityPolicy gating. Tools call update::rpc, are exported and registered with the orchestrator, documented in the app catalog, made filterable by the UI toggle, and covered by unit tests.

Changes

Orchestrator Update Tools

Layer / File(s) Summary
Capability Documentation
src/openhuman/about_app/catalog.rs
Adds GITHUB_RELEASES_METADATA and expands update.check/update.apply descriptions/how-to to document orchestrator tool exposure (update_check, update_apply), GitHub Releases destination, consent gating, and restart handling.
Tool Module Exports
src/openhuman/tools/impl/system/mod.rs
Declares update_apply and update_check submodules and re-exports UpdateApplyTool and UpdateCheckTool.
Read-Only Check Tool
src/openhuman/tools/impl/system/update_check.rs
UpdateCheckTool implements Tool (name, description, empty params schema, ReadOnly) and execute() that calls update::rpc::update_check, logs RPC logs, pretty-serializes the result, and returns success/error based on an "error" key.
Check Tool Tests
src/openhuman/tools/impl/system/update_check_tests.rs
Tests verify tool name, ReadOnly permission, closed empty parameters schema, description text (references update_apply, states "Does NOT download"), and Default impl.
Dangerous Apply Tool
src/openhuman/tools/impl/system/update_apply.rs
UpdateApplyTool stores Arc<SecurityPolicy>, enforces user_confirmed: true consent and autonomy/write gates (ToolOperation::Act), defines a Dangerous/HIGH IMPACT description and JSON schema, calls update::rpc::update_run, logs RPC logs, and interprets error or applied:false as failure.
Apply Tool Tests
src/openhuman/tools/impl/system/update_apply_tests.rs
Tests validate metadata (name, Dangerous), JSON schema requiring boolean user_confirmed, description content (HIGH IMPACT, ask_user_clarification, update_check reference), and execution gating/precedence across Supervised vs ReadOnly.
Tool Registration & Orchestrator Config
src/openhuman/tools/ops.rs, src/openhuman/agent/agents/orchestrator/agent.toml
Registers UpdateCheckTool and UpdateApplyTool in all_tools_with_runtime (apply constructed with security) and adds both to the orchestrator's named tools list with docs and config.update.rpc_mutations_enabled gating.
User Tool Filtering
src/openhuman/tools/user_filter.rs
Maps UI update toggle to update_check and update_apply so user preference filtering can hide them.
Security Export
src/openhuman/security/mod.rs
Re-exports policy::ToolOperation for tool gating usage.
Test Serialization Locking
src/openhuman/update/ops.rs
Serializes tests touching OPENHUMAN_WORKSPACE using TEST_ENV_LOCK to avoid cross-test races when loading mutation policy for update_apply.

Sequence Diagram

sequenceDiagram
  participant Orchestrator as Orchestrator/LLM
  participant CheckTool as UpdateCheckTool
  participant ApplyTool as UpdateApplyTool
  participant Policy as SecurityPolicy
  participant RPC as UpdateRPC

  Orchestrator->>CheckTool: execute()
  CheckTool->>RPC: update_check()
  RPC-->>CheckTool: {value, logs}
  CheckTool-->>Orchestrator: ToolResult (success/error)

  Orchestrator->>ApplyTool: execute({user_confirmed:true})
  ApplyTool->>ApplyTool: validate consent (user_confirmed)
  ApplyTool->>Policy: enforce_tool_operation(Act)
  alt policy allows
    ApplyTool->>RPC: update_run()
    RPC-->>ApplyTool: {value, logs}
    ApplyTool-->>Orchestrator: ToolResult (success/error)
  else policy blocks
    ApplyTool-->>Orchestrator: ToolResult error
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • senamakel

Poem

🐰 I hop to check the release with care,

"Am I current?" I whisper in the air.
A nod confirms, the apply may start,
Policy guards each cautious part.
A rabbit hum: safe updates, heart to heart.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 70.97% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: exposing update_check and update_apply tools to the orchestrator.
Linked Issues check ✅ Passed The PR comprehensively meets all objectives from #1435: exposes update tools via Tool implementations reusing src/openhuman/update/, registers them in tools::ops and orchestrator config, enforces user confirmation and policy gates, updates catalog documentation, achieves ≥80% diff coverage, and includes comprehensive tests.
Out of Scope Changes check ✅ Passed All changes are directly scoped to implementing the two update tools and their integration: tool implementations, registration, configuration, tests, and documentation updates. No unrelated refactoring or feature scope creep detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/openhuman/about_app/catalog.rs (1)

915-934: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Privacy metadata for update capabilities is inconsistent with the updated behavior.

The descriptions now state these flows hit GitHub Releases (and update.apply downloads artifacts), but privacy still reports DIAGNOSTICS_TO_BACKEND / None. That under-reports outbound destinations in the capability catalog.

Suggested fix
     Capability {
         id: "update.check",
@@
-        privacy: DIAGNOSTICS_TO_BACKEND,
+        privacy: Some(CapabilityPrivacy {
+            leaves_device: true,
+            data_kind: PrivacyDataKind::Metadata,
+            destinations: &["GitHub Releases"],
+        }),
     },
     Capability {
         id: "update.apply",
@@
-        privacy: None,
+        privacy: Some(CapabilityPrivacy {
+            leaves_device: true,
+            data_kind: PrivacyDataKind::Metadata,
+            destinations: &["GitHub Releases"],
+        }),
     },

As per coding guidelines: "Update src/openhuman/about_app/ in the same work when a change adds, removes, renames, or materially changes a user-facing feature so the runtime capability catalog remains the source of truth".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/about_app/catalog.rs` around lines 915 - 934, The capability
privacy metadata for the update capabilities is incorrect: update.check and
update.apply now contact GitHub Releases but still use DIAGNOSTICS_TO_BACKEND /
None; change the privacy fields on the Capability entries with id "update.check"
and "update.apply" in catalog.rs to reflect outbound network calls to a
third‑party (e.g., set to the project’s convention for external requests such as
OUTBOUND_THIRD_PARTY or a privacy enum that captures outbound requests and
include host metadata "github.com" if supported) so the catalog accurately shows
these flows; keep descriptions and gating text unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/openhuman/about_app/catalog.rs`:
- Around line 915-934: The capability privacy metadata for the update
capabilities is incorrect: update.check and update.apply now contact GitHub
Releases but still use DIAGNOSTICS_TO_BACKEND / None; change the privacy fields
on the Capability entries with id "update.check" and "update.apply" in
catalog.rs to reflect outbound network calls to a third‑party (e.g., set to the
project’s convention for external requests such as OUTBOUND_THIRD_PARTY or a
privacy enum that captures outbound requests and include host metadata
"github.com" if supported) so the catalog accurately shows these flows; keep
descriptions and gating text unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2299c0ab-8645-42f0-b1ad-768449a070fb

📥 Commits

Reviewing files that changed from the base of the PR and between 838e6fc and f1566ec.

📒 Files selected for processing (9)
  • src/openhuman/about_app/catalog.rs
  • src/openhuman/agent/agents/orchestrator/agent.toml
  • src/openhuman/tools/impl/system/mod.rs
  • src/openhuman/tools/impl/system/update_apply.rs
  • src/openhuman/tools/impl/system/update_apply_tests.rs
  • src/openhuman/tools/impl/system/update_check.rs
  • src/openhuman/tools/impl/system/update_check_tests.rs
  • src/openhuman/tools/ops.rs
  • src/openhuman/tools/user_filter.rs

coderabbitai[bot]
coderabbitai Bot previously approved these changes May 11, 2026
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR #1473 — feat(orchestrator): expose update_check + update_apply tools

Walkthrough

This PR wires two new orchestrator tools — update_check and update_apply — over the existing src/openhuman/update/rpc surface. The design is correct: it does not duplicate any release-URL resolution, version-comparison, or asset-validation logic; all of that stays in the update domain. The update_apply tool layers three independent gates (user consent arg, autonomy policy, RPC mutation policy) which is the right level of caution for a tool that replaces a binary on disk. Tests cover the gate-ordering, schema shape, and the read-only-session path. The implementation is lean, readable, and follows the module layout rules.

Two issues are worth addressing before merge: the tool bypasses SecurityPolicy::enforce_tool_operation in favour of a hand-rolled duplicate, and update_check is missing an exit log so failures are invisible until the underlying RPC logs kick in. The remaining items below are minor polish.

Changes

File Summary
src/openhuman/about_app/catalog.rs Extends update.check / update.apply capability descriptions to mention the new in-chat path and consent gate
src/openhuman/agent/agents/orchestrator/agent.toml Registers update_check and update_apply in the orchestrator's [tools] named block
src/openhuman/tools/impl/system/mod.rs Declares and re-exports the two new modules
src/openhuman/tools/impl/system/update_apply.rs NEW: update_apply tool — consent + autonomy gates, delegates to update::rpc::update_run
src/openhuman/tools/impl/system/update_apply_tests.rs NEW: 7 tests covering name/permission, schema shape, missing consent, false consent, read-only block, gate ordering
src/openhuman/tools/impl/system/update_check.rs NEW: update_check tool — read-only pass-through to update::rpc::update_check
src/openhuman/tools/impl/system/update_check_tests.rs NEW: 4 tests covering name/permission, closed schema, description content, Default impl
src/openhuman/tools/ops.rs Appends UpdateCheckTool and UpdateApplyTool to the all_tools_with_runtime registry
src/openhuman/tools/user_filter.rs Adds ("update", &["update_check", "update_apply"]) to the filterable tool map

Actionable comments (3)

⚠️ Major

1. src/openhuman/tools/impl/system/update_apply.rs:40-52 — hand-rolled autonomy gate duplicates SecurityPolicy::enforce_tool_operation

require_write_access calls can_act() and then record_action() manually. Every other act-level tool in this codebase (memory_store, memory_forget, composio.execute, delegate) uses the shared SecurityPolicy::enforce_tool_operation(ToolOperation::Act, …) helper, which also produces structured log output at [openhuman:policy]. The hand-rolled path does log, but the messages differ from the pattern established elsewhere — inconsistent error strings make grepping harder when debugging production issues.

The two approaches are semantically identical today, but if enforce_tool_operation ever gains additional checks (e.g. a token-budget gate or a supervised-approval path) this tool will silently miss them.

Suggested change:

// before (update_apply.rs:40-52)
fn require_write_access(&self) -> Option<ToolResult> {
    if !self.security.can_act() {
        return Some(ToolResult::error(
            "update_apply blocked: autonomy is read-only — confirm with the user and \
             raise autonomy before retrying",
        ));
    }
    if !self.security.record_action() {
        return Some(ToolResult::error(
            "update_apply blocked: autonomy rate limit exceeded",
        ));
    }
    None
}

// after — delegate to the shared enforcer
fn require_write_access(&self) -> Option<ToolResult> {
    self.security
        .enforce_tool_operation(
            crate::openhuman::security::ToolOperation::Act,
            "update_apply",
        )
        .err()
        .map(ToolResult::error)
}

This keeps the existing test coverage valid (the gate still rejects read-only sessions) while staying in sync with the rest of the codebase.


💡 Refactor / suggestion

2. src/openhuman/tools/impl/system/update_check.rs:58-71 — no exit log; outcome (success vs error) is invisible at the tool layer

update_check logs entry but not exit. When the underlying GitHub request fails, the only trace is a debug line inside the for log in &outcome.logs loop — which is at target "update_check" and won't show up in a standard [update_check] grep. A reader tailing logs to understand why the orchestrator reported "no update available" has no fast path from the tool frame to the outcome.

Compare update_apply, which at least logs the two blocked paths explicitly. update_check should log an exit line that mentions the outcome (update available / up-to-date / error):

// after execute() in update_check.rs
async fn execute(&self, _args: Value) -> anyhow::Result<ToolResult> {
    tracing::debug!("[update_check] execute start");
    let outcome = update::rpc::update_check().await;
    let body = serde_json::to_string_pretty(&outcome.value)?;
    for log in &outcome.logs {
        tracing::debug!(target: "update_check", "{log}");
    }
    let is_error = outcome.value.get("error").is_some();
    tracing::debug!(
        is_error,
        "[update_check] execute done"
    );
    Ok(if is_error {
        ToolResult::error(body)
    } else {
        ToolResult::success(body)
    })
}

3. src/openhuman/tools/impl/system/update_apply.rs:125 — error detection by key presence is fragile

let is_error = outcome.value.get("error").is_some();

update::rpc::update_run returns a well-typed RpcOutcome<Value> whose .value is produced by serde_json::to_value(&result). The UpdateRunResult struct does not have an "error" field — errors are signalled by returning a different shape (json!({ "error": … })). That means this check will silently return ToolResult::success for an UpdateRunResult where applied: false and message describes a failure (e.g. the "no platform asset" path, or the "download/stage failed" path). The user sees a green tick in the conversation even though nothing was installed.

update_check.rs:65 has the exact same pattern.

The cleanest fix is to check outcome.value["applied"] for update_run and look for the absence of the error key being a soft-success. But until RpcOutcome carries an explicit status field, the safest short-term approach is also checking applied:

// update_apply.rs — replace the is_error check
let is_error = outcome.value.get("error").is_some()
    || outcome.value.get("applied").and_then(Value::as_bool) == Some(false);

For update_check the semantics are different (a "no update" result is a success), so no change needed there, but it is worth documenting that assumption with a brief comment so future readers don't apply the same fix incorrectly.


Nitpicks (2)

  • src/openhuman/tools/impl/system/update_check.rs:59 — the entry log uses tracing::debug! while the blocked paths in update_apply use tracing::warn!. Fine as-is (check is read-only), but the pattern is slightly inconsistent. Consider tracing::debug! for entry and tracing::info! for the exit summary so the tool shows up in INFO logs alongside the underlying log::info! calls in update::rpc.
  • src/openhuman/tools/impl/system/update_apply_tests.rs:1-4use std::sync::Arc; is imported at the top of the test file while Arc is already in scope from super::* via the production module. Not a problem, just redundant; cargo check won't complain because it just aliases it.

Questions for the author (1)

  • src/openhuman/tools/impl/system/update_apply.rs:120update_run stages the binary and then publishes a self-restart event. From the agent's perspective, the core process exits shortly after this call returns. Does the orchestrator's tool runner (or the core_rpc_relay relay in the Tauri shell) handle a mid-conversation disconnect gracefully, or does the user see an error? If the connection drops before the LLM can emit a "restarting…" message, the UX could be confusing. Not a blocker — just worth a comment in the tool description or a follow-up issue.

Outside the diff

While reading update::rpc::update_run I noticed the already_current_result and missing_asset_result helper paths both return applied: false without an "error" key — meaning the fragile is_error check in update_apply.rs (item 3 above) will surface both of these as apparent successes to the LLM. Worth fixing in this PR rather than deferring.


Verified / looks good

  • New files are correctly placed in src/openhuman/tools/impl/system/ — no standalone .rs at the root of src/openhuman/.
  • PermissionLevel::ReadOnly for update_check and PermissionLevel::Dangerous for update_apply are appropriate.
  • No bare .unwrap() in production code paths (only in test helpers via expect).
  • No PII or secrets logged — the entry log in update_apply prints the full args JSON, but the only field is the user_confirmed boolean, which is safe.
  • user_filter.rs mapping groups both tools under a single "update" toggle — correct, and the comment explains the rationale.
  • Test gate-ordering (consent_check_runs_before_autonomy_check) is particularly well thought-out.
  • CI is green and coverage gate passes.

Reply with one of:

  • apply all — apply every suggestion above
  • apply 1,2,3 — apply specific numbered items
  • apply 1 — just the enforce_tool_operation refactor
  • skip — review only, no changes

I will not change any code until you confirm.

Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — project-specific findings

Overall this is a clean, well-structured PR — thin wrappers over the existing RPC surface, correct permission levels, no duplicated release logic, good test coverage including gate-ordering verification. Three findings worth considering:

[minor] Missing exit log in update_check

update_check.rs:58-71 — there's an entry log ([update_check] execute start) but no exit log. When the GitHub request fails, the only trace lands in the outcome.logs loop under a different tracing target — invisible to a standard [update_check] grep. Same applies to update_apply on the success/error branch.

// Suggested addition after the outcome.logs loop in both tools:
tracing::debug!(is_error, "[update_check] execute done");

Per CLAUDE.md: "Log entry/exit, branches, external calls, retries/timeouts, state transitions, errors."

[minor] Soft-failure detection via outcome.value.get("error")

update_apply.rs:125 / update_check.rs:65 — error detection checks for an "error" key, but update_run's "already current" and "no platform asset" paths return applied: false with a message field and no "error" key. The tool will report ToolResult::success even though nothing was installed.

Consider also checking for applied: false or treating the absence of an explicit success indicator as a distinguishable outcome (not necessarily an error, but worth surfacing clearly to the LLM so it doesn't tell the user "update complete" when nothing happened).

[question] Self-restart mid-conversation

When update_run triggers a self_replace restart, the core process exits. Does the Tauri core_rpc_relay handle the disconnect gracefully, or will the user see an unhandled RPC error in the conversation? Not a blocker for this PR — the behavior exists today via the Settings path — but worth a note in the tool description or a follow-up issue since the in-chat path makes it more likely users will be mid-conversation when it happens.


Nit: update_apply_tests.rs:3use std::sync::Arc is redundant since super::* already brings it into scope.

LGTM with the minor logging/detection items above. Nice work keeping the tool thin and reusing the existing update domain end-to-end.

obchain added 4 commits May 11, 2026 19:24
Hand-rolling `can_act` + `record_action` inside `UpdateApplyTool` left
the tool's autonomy + rate-limit handling out of step with every other
act-level tool (memory_store, memory_forget, composio.execute,
delegate) — different error strings made grepping production logs
harder, and any future gate added to `SecurityPolicy::enforce_tool_operation`
(supervised-approval, token budget, etc.) would have silently bypassed
this tool. Route through the shared enforcer with `ToolOperation::Act`
and re-export `ToolOperation` from `openhuman::security` so callers
don't have to reach into `policy`. Test updated to assert against the
canonical `read-only mode` / `update_apply` phrasing from the shared
path.
`update_run`'s soft-failure paths ("already current", "no platform
asset for this target", "download/stage failed") return
`applied: false` with a descriptive `message` and no `error` key. The
old `outcome.value.get("error").is_some()` check thus mapped every
one of those to `ToolResult::success`, so the LLM would tell the user
"update applied" even though nothing was installed. Read `applied`
explicitly and treat any non-applied outcome as an error so the
orchestrator surfaces the real status. Adds an
`[update_apply] execute done` debug log carrying the applied /
has_error_key / is_error flags so the resolution is visible without
re-running.
…e_check]`

`update_check` already logged on entry but not on exit, so when the
underlying release-feed RPC errored the only trace lived on the
`update_check` target inside the `outcome.logs` loop — invisible to a
plain `[update_check]` grep at the tool layer. Add a `[update_check]
execute done` line that surfaces `is_error` and the response body
size, and explain inline why the `is_error` check intentionally only
trips on an explicit `error` key (read-only checks must keep treating
"no update available" as a happy answer, unlike `update_apply`).
`update.check` was tagged `DIAGNOSTICS_TO_BACKEND` and `update.apply`
was tagged `None`, but neither flow talks to the OpenHuman backend —
both hit GitHub Releases directly to discover release metadata and
fetch the platform asset. That under-reported the outbound
destination in the capability catalog, which exists to be the source
of truth for "where does my data go" surfaces.

Introduce a `GITHUB_RELEASES_METADATA` constant
(`leaves_device: true`, `data_kind: Metadata`,
`destinations: ["GitHub Releases"]`) and apply it to both update
capabilities so the catalog reflects the real network destination.
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 11, 2026
@obchain
Copy link
Copy Markdown
Contributor Author

obchain commented May 11, 2026

Thanks for the review — pushed fixes for everything actionable.

  • Autonomy gate now delegates to SecurityPolicy::enforce_tool_operation(Act, ...) instead of the hand-rolled can_act + record_action pair (8a9fd0db). Test updated to assert on the shared read-only mode message.
  • is_error for update_apply now also trips on applied: false, so soft failures like "already current" or "no platform asset" no longer surface as green ticks (b7137863).
  • Added an [update_check] execute done exit log with is_error + body size (207bba0a). Kept its check narrower than apply's on purpose — "no update available" is still a happy answer.
  • Capability privacy for update.check / update.apply now points at GitHub Releases via a new GITHUB_RELEASES_METADATA constant (abc62361), addressing the catalog under-reporting.

Skipped the use std::sync::Arc nit — Arc isn't actually re-exported from super::* here, so dropping it would break compile. Self-restart UX worth a follow-up issue rather than scope creep on this PR.

Copy link
Copy Markdown
Member

@senamakel senamakel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one failing test do have a look

`update_apply_rejects_when_rpc_mutations_disabled` failed in CI
because two sibling tests (`update_apply_rejects_non_github_url_before_network_call`,
`update_apply_rejects_unsafe_asset_name`) called `update_apply` without
holding `TEST_ENV_LOCK`. `update_apply` reads the mutation-policy
config via `OPENHUMAN_WORKSPACE`, which is process-global, so a
sibling running on another thread could clobber the env var between
the disabled test's `WorkspaceEnvGuard::set` and the policy load
inside `update_apply`. The disabled test then loaded a default policy
(where `rpc_mutations_enabled = true`), the gate passed, and its
`outcome.value["error"].contains("rpc_mutations_enabled=false")`
assertion failed.

Take `TEST_ENV_LOCK` in both sibling tests so all three serialise on
the same mutex. The validation-only tests still skip mock setup; they
just no longer race against the policy-gated case.
@obchain
Copy link
Copy Markdown
Contributor Author

obchain commented May 11, 2026

@senamakel CI flake fixed in 8d8c0c8

@obchain obchain requested a review from senamakel May 11, 2026 20:05
@senamakel senamakel merged commit 018619d into tinyhumansai:main May 12, 2026
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Give the orchestrator agent tools to check and apply core updates (with explicit user gate)

3 participants