feat(orchestrator): expose update_check + update_apply tools (#1435)#1473
Conversation
Wires the existing `openhuman.update_*` RPC surface as orchestrator tools so the user can ask "am I up to date" / "update OpenHuman" in chat. Both tools delegate to `update::rpc::*` — no duplicate release URL, asset validation, or staging logic. `update_apply` is gated on explicit user consent (`user_confirmed: true`) and a tool-level autonomy check before reaching the existing `rpc_mutations_enabled` policy. Closes tinyhumansai#1435.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds orchestrator-callable update tools: read-only ChangesOrchestrator Update Tools
Sequence DiagramsequenceDiagram
participant Orchestrator as Orchestrator/LLM
participant CheckTool as UpdateCheckTool
participant ApplyTool as UpdateApplyTool
participant Policy as SecurityPolicy
participant RPC as UpdateRPC
Orchestrator->>CheckTool: execute()
CheckTool->>RPC: update_check()
RPC-->>CheckTool: {value, logs}
CheckTool-->>Orchestrator: ToolResult (success/error)
Orchestrator->>ApplyTool: execute({user_confirmed:true})
ApplyTool->>ApplyTool: validate consent (user_confirmed)
ApplyTool->>Policy: enforce_tool_operation(Act)
alt policy allows
ApplyTool->>RPC: update_run()
RPC-->>ApplyTool: {value, logs}
ApplyTool-->>Orchestrator: ToolResult (success/error)
else policy blocks
ApplyTool-->>Orchestrator: ToolResult error
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/openhuman/about_app/catalog.rs (1)
915-934:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPrivacy metadata for update capabilities is inconsistent with the updated behavior.
The descriptions now state these flows hit GitHub Releases (and
update.applydownloads artifacts), butprivacystill reportsDIAGNOSTICS_TO_BACKEND/None. That under-reports outbound destinations in the capability catalog.Suggested fix
Capability { id: "update.check", @@ - privacy: DIAGNOSTICS_TO_BACKEND, + privacy: Some(CapabilityPrivacy { + leaves_device: true, + data_kind: PrivacyDataKind::Metadata, + destinations: &["GitHub Releases"], + }), }, Capability { id: "update.apply", @@ - privacy: None, + privacy: Some(CapabilityPrivacy { + leaves_device: true, + data_kind: PrivacyDataKind::Metadata, + destinations: &["GitHub Releases"], + }), },As per coding guidelines: "Update src/openhuman/about_app/ in the same work when a change adds, removes, renames, or materially changes a user-facing feature so the runtime capability catalog remains the source of truth".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/about_app/catalog.rs` around lines 915 - 934, The capability privacy metadata for the update capabilities is incorrect: update.check and update.apply now contact GitHub Releases but still use DIAGNOSTICS_TO_BACKEND / None; change the privacy fields on the Capability entries with id "update.check" and "update.apply" in catalog.rs to reflect outbound network calls to a third‑party (e.g., set to the project’s convention for external requests such as OUTBOUND_THIRD_PARTY or a privacy enum that captures outbound requests and include host metadata "github.com" if supported) so the catalog accurately shows these flows; keep descriptions and gating text unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/openhuman/about_app/catalog.rs`:
- Around line 915-934: The capability privacy metadata for the update
capabilities is incorrect: update.check and update.apply now contact GitHub
Releases but still use DIAGNOSTICS_TO_BACKEND / None; change the privacy fields
on the Capability entries with id "update.check" and "update.apply" in
catalog.rs to reflect outbound network calls to a third‑party (e.g., set to the
project’s convention for external requests such as OUTBOUND_THIRD_PARTY or a
privacy enum that captures outbound requests and include host metadata
"github.com" if supported) so the catalog accurately shows these flows; keep
descriptions and gating text unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2299c0ab-8645-42f0-b1ad-768449a070fb
📒 Files selected for processing (9)
src/openhuman/about_app/catalog.rssrc/openhuman/agent/agents/orchestrator/agent.tomlsrc/openhuman/tools/impl/system/mod.rssrc/openhuman/tools/impl/system/update_apply.rssrc/openhuman/tools/impl/system/update_apply_tests.rssrc/openhuman/tools/impl/system/update_check.rssrc/openhuman/tools/impl/system/update_check_tests.rssrc/openhuman/tools/ops.rssrc/openhuman/tools/user_filter.rs
graycyrus
left a comment
There was a problem hiding this comment.
PR #1473 — feat(orchestrator): expose update_check + update_apply tools
Walkthrough
This PR wires two new orchestrator tools — update_check and update_apply — over the existing src/openhuman/update/rpc surface. The design is correct: it does not duplicate any release-URL resolution, version-comparison, or asset-validation logic; all of that stays in the update domain. The update_apply tool layers three independent gates (user consent arg, autonomy policy, RPC mutation policy) which is the right level of caution for a tool that replaces a binary on disk. Tests cover the gate-ordering, schema shape, and the read-only-session path. The implementation is lean, readable, and follows the module layout rules.
Two issues are worth addressing before merge: the tool bypasses SecurityPolicy::enforce_tool_operation in favour of a hand-rolled duplicate, and update_check is missing an exit log so failures are invisible until the underlying RPC logs kick in. The remaining items below are minor polish.
Changes
| File | Summary |
|---|---|
src/openhuman/about_app/catalog.rs |
Extends update.check / update.apply capability descriptions to mention the new in-chat path and consent gate |
src/openhuman/agent/agents/orchestrator/agent.toml |
Registers update_check and update_apply in the orchestrator's [tools] named block |
src/openhuman/tools/impl/system/mod.rs |
Declares and re-exports the two new modules |
src/openhuman/tools/impl/system/update_apply.rs |
NEW: update_apply tool — consent + autonomy gates, delegates to update::rpc::update_run |
src/openhuman/tools/impl/system/update_apply_tests.rs |
NEW: 7 tests covering name/permission, schema shape, missing consent, false consent, read-only block, gate ordering |
src/openhuman/tools/impl/system/update_check.rs |
NEW: update_check tool — read-only pass-through to update::rpc::update_check |
src/openhuman/tools/impl/system/update_check_tests.rs |
NEW: 4 tests covering name/permission, closed schema, description content, Default impl |
src/openhuman/tools/ops.rs |
Appends UpdateCheckTool and UpdateApplyTool to the all_tools_with_runtime registry |
src/openhuman/tools/user_filter.rs |
Adds ("update", &["update_check", "update_apply"]) to the filterable tool map |
Actionable comments (3)
⚠️ Major
1. src/openhuman/tools/impl/system/update_apply.rs:40-52 — hand-rolled autonomy gate duplicates SecurityPolicy::enforce_tool_operation
require_write_access calls can_act() and then record_action() manually. Every other act-level tool in this codebase (memory_store, memory_forget, composio.execute, delegate) uses the shared SecurityPolicy::enforce_tool_operation(ToolOperation::Act, …) helper, which also produces structured log output at [openhuman:policy]. The hand-rolled path does log, but the messages differ from the pattern established elsewhere — inconsistent error strings make grepping harder when debugging production issues.
The two approaches are semantically identical today, but if enforce_tool_operation ever gains additional checks (e.g. a token-budget gate or a supervised-approval path) this tool will silently miss them.
Suggested change:
// before (update_apply.rs:40-52)
fn require_write_access(&self) -> Option<ToolResult> {
if !self.security.can_act() {
return Some(ToolResult::error(
"update_apply blocked: autonomy is read-only — confirm with the user and \
raise autonomy before retrying",
));
}
if !self.security.record_action() {
return Some(ToolResult::error(
"update_apply blocked: autonomy rate limit exceeded",
));
}
None
}
// after — delegate to the shared enforcer
fn require_write_access(&self) -> Option<ToolResult> {
self.security
.enforce_tool_operation(
crate::openhuman::security::ToolOperation::Act,
"update_apply",
)
.err()
.map(ToolResult::error)
}This keeps the existing test coverage valid (the gate still rejects read-only sessions) while staying in sync with the rest of the codebase.
💡 Refactor / suggestion
2. src/openhuman/tools/impl/system/update_check.rs:58-71 — no exit log; outcome (success vs error) is invisible at the tool layer
update_check logs entry but not exit. When the underlying GitHub request fails, the only trace is a debug line inside the for log in &outcome.logs loop — which is at target "update_check" and won't show up in a standard [update_check] grep. A reader tailing logs to understand why the orchestrator reported "no update available" has no fast path from the tool frame to the outcome.
Compare update_apply, which at least logs the two blocked paths explicitly. update_check should log an exit line that mentions the outcome (update available / up-to-date / error):
// after execute() in update_check.rs
async fn execute(&self, _args: Value) -> anyhow::Result<ToolResult> {
tracing::debug!("[update_check] execute start");
let outcome = update::rpc::update_check().await;
let body = serde_json::to_string_pretty(&outcome.value)?;
for log in &outcome.logs {
tracing::debug!(target: "update_check", "{log}");
}
let is_error = outcome.value.get("error").is_some();
tracing::debug!(
is_error,
"[update_check] execute done"
);
Ok(if is_error {
ToolResult::error(body)
} else {
ToolResult::success(body)
})
}3. src/openhuman/tools/impl/system/update_apply.rs:125 — error detection by key presence is fragile
let is_error = outcome.value.get("error").is_some();update::rpc::update_run returns a well-typed RpcOutcome<Value> whose .value is produced by serde_json::to_value(&result). The UpdateRunResult struct does not have an "error" field — errors are signalled by returning a different shape (json!({ "error": … })). That means this check will silently return ToolResult::success for an UpdateRunResult where applied: false and message describes a failure (e.g. the "no platform asset" path, or the "download/stage failed" path). The user sees a green tick in the conversation even though nothing was installed.
update_check.rs:65 has the exact same pattern.
The cleanest fix is to check outcome.value["applied"] for update_run and look for the absence of the error key being a soft-success. But until RpcOutcome carries an explicit status field, the safest short-term approach is also checking applied:
// update_apply.rs — replace the is_error check
let is_error = outcome.value.get("error").is_some()
|| outcome.value.get("applied").and_then(Value::as_bool) == Some(false);For update_check the semantics are different (a "no update" result is a success), so no change needed there, but it is worth documenting that assumption with a brief comment so future readers don't apply the same fix incorrectly.
Nitpicks (2)
src/openhuman/tools/impl/system/update_check.rs:59— the entry log usestracing::debug!while the blocked paths inupdate_applyusetracing::warn!. Fine as-is (check is read-only), but the pattern is slightly inconsistent. Considertracing::debug!for entry andtracing::info!for the exit summary so the tool shows up inINFOlogs alongside the underlyinglog::info!calls inupdate::rpc.src/openhuman/tools/impl/system/update_apply_tests.rs:1-4—use std::sync::Arc;is imported at the top of the test file whileArcis already in scope fromsuper::*via the production module. Not a problem, just redundant;cargo checkwon't complain because it just aliases it.
Questions for the author (1)
src/openhuman/tools/impl/system/update_apply.rs:120—update_runstages the binary and then publishes a self-restart event. From the agent's perspective, the core process exits shortly after this call returns. Does the orchestrator's tool runner (or thecore_rpc_relayrelay in the Tauri shell) handle a mid-conversation disconnect gracefully, or does the user see an error? If the connection drops before the LLM can emit a "restarting…" message, the UX could be confusing. Not a blocker — just worth a comment in the tool description or a follow-up issue.
Outside the diff
While reading update::rpc::update_run I noticed the already_current_result and missing_asset_result helper paths both return applied: false without an "error" key — meaning the fragile is_error check in update_apply.rs (item 3 above) will surface both of these as apparent successes to the LLM. Worth fixing in this PR rather than deferring.
Verified / looks good
- New files are correctly placed in
src/openhuman/tools/impl/system/— no standalone.rsat the root ofsrc/openhuman/. PermissionLevel::ReadOnlyforupdate_checkandPermissionLevel::Dangerousforupdate_applyare appropriate.- No bare
.unwrap()in production code paths (only in test helpers viaexpect). - No PII or secrets logged — the entry log in
update_applyprints the fullargsJSON, but the only field is theuser_confirmedboolean, which is safe. user_filter.rsmapping groups both tools under a single"update"toggle — correct, and the comment explains the rationale.- Test gate-ordering (
consent_check_runs_before_autonomy_check) is particularly well thought-out. - CI is green and coverage gate passes.
Reply with one of:
apply all— apply every suggestion aboveapply 1,2,3— apply specific numbered itemsapply 1— just theenforce_tool_operationrefactorskip— review only, no changes
I will not change any code until you confirm.
graycyrus
left a comment
There was a problem hiding this comment.
Review — project-specific findings
Overall this is a clean, well-structured PR — thin wrappers over the existing RPC surface, correct permission levels, no duplicated release logic, good test coverage including gate-ordering verification. Three findings worth considering:
[minor] Missing exit log in update_check
update_check.rs:58-71 — there's an entry log ([update_check] execute start) but no exit log. When the GitHub request fails, the only trace lands in the outcome.logs loop under a different tracing target — invisible to a standard [update_check] grep. Same applies to update_apply on the success/error branch.
// Suggested addition after the outcome.logs loop in both tools:
tracing::debug!(is_error, "[update_check] execute done");Per CLAUDE.md: "Log entry/exit, branches, external calls, retries/timeouts, state transitions, errors."
[minor] Soft-failure detection via outcome.value.get("error")
update_apply.rs:125 / update_check.rs:65 — error detection checks for an "error" key, but update_run's "already current" and "no platform asset" paths return applied: false with a message field and no "error" key. The tool will report ToolResult::success even though nothing was installed.
Consider also checking for applied: false or treating the absence of an explicit success indicator as a distinguishable outcome (not necessarily an error, but worth surfacing clearly to the LLM so it doesn't tell the user "update complete" when nothing happened).
[question] Self-restart mid-conversation
When update_run triggers a self_replace restart, the core process exits. Does the Tauri core_rpc_relay handle the disconnect gracefully, or will the user see an unhandled RPC error in the conversation? Not a blocker for this PR — the behavior exists today via the Settings path — but worth a note in the tool description or a follow-up issue since the in-chat path makes it more likely users will be mid-conversation when it happens.
Nit: update_apply_tests.rs:3 — use std::sync::Arc is redundant since super::* already brings it into scope.
LGTM with the minor logging/detection items above. Nice work keeping the tool thin and reusing the existing update domain end-to-end.
Hand-rolling `can_act` + `record_action` inside `UpdateApplyTool` left the tool's autonomy + rate-limit handling out of step with every other act-level tool (memory_store, memory_forget, composio.execute, delegate) — different error strings made grepping production logs harder, and any future gate added to `SecurityPolicy::enforce_tool_operation` (supervised-approval, token budget, etc.) would have silently bypassed this tool. Route through the shared enforcer with `ToolOperation::Act` and re-export `ToolOperation` from `openhuman::security` so callers don't have to reach into `policy`. Test updated to assert against the canonical `read-only mode` / `update_apply` phrasing from the shared path.
`update_run`'s soft-failure paths ("already current", "no platform
asset for this target", "download/stage failed") return
`applied: false` with a descriptive `message` and no `error` key. The
old `outcome.value.get("error").is_some()` check thus mapped every
one of those to `ToolResult::success`, so the LLM would tell the user
"update applied" even though nothing was installed. Read `applied`
explicitly and treat any non-applied outcome as an error so the
orchestrator surfaces the real status. Adds an
`[update_apply] execute done` debug log carrying the applied /
has_error_key / is_error flags so the resolution is visible without
re-running.
…e_check]` `update_check` already logged on entry but not on exit, so when the underlying release-feed RPC errored the only trace lived on the `update_check` target inside the `outcome.logs` loop — invisible to a plain `[update_check]` grep at the tool layer. Add a `[update_check] execute done` line that surfaces `is_error` and the response body size, and explain inline why the `is_error` check intentionally only trips on an explicit `error` key (read-only checks must keep treating "no update available" as a happy answer, unlike `update_apply`).
`update.check` was tagged `DIAGNOSTICS_TO_BACKEND` and `update.apply` was tagged `None`, but neither flow talks to the OpenHuman backend — both hit GitHub Releases directly to discover release metadata and fetch the platform asset. That under-reported the outbound destination in the capability catalog, which exists to be the source of truth for "where does my data go" surfaces. Introduce a `GITHUB_RELEASES_METADATA` constant (`leaves_device: true`, `data_kind: Metadata`, `destinations: ["GitHub Releases"]`) and apply it to both update capabilities so the catalog reflects the real network destination.
|
Thanks for the review — pushed fixes for everything actionable.
Skipped the |
senamakel
left a comment
There was a problem hiding this comment.
one failing test do have a look
`update_apply_rejects_when_rpc_mutations_disabled` failed in CI
because two sibling tests (`update_apply_rejects_non_github_url_before_network_call`,
`update_apply_rejects_unsafe_asset_name`) called `update_apply` without
holding `TEST_ENV_LOCK`. `update_apply` reads the mutation-policy
config via `OPENHUMAN_WORKSPACE`, which is process-global, so a
sibling running on another thread could clobber the env var between
the disabled test's `WorkspaceEnvGuard::set` and the policy load
inside `update_apply`. The disabled test then loaded a default policy
(where `rpc_mutations_enabled = true`), the gate passed, and its
`outcome.value["error"].contains("rpc_mutations_enabled=false")`
assertion failed.
Take `TEST_ENV_LOCK` in both sibling tests so all three serialise on
the same mutex. The validation-only tests still skip mock setup; they
just no longer race against the policy-gated case.
|
@senamakel CI flake fixed in 8d8c0c8 |
Summary
update_checkandupdate_applyso the user can ask "am I up to date?" / "update OpenHuman" in chat instead of opening Settings → Developer Options.openhuman.update_*RPC surface — no duplicate release URL, asset validation, or staging logic; staging stays insrc/openhuman/update/.update_applyis gated three ways: tool-leveluser_confirmed: truearg (LLM must explicitly callask_user_clarificationfirst), tool-level autonomy check (SecurityPolicy::can_act), and the existingconfig.update.rpc_mutations_enabledpolicy gate insideupdate_run.Problem
src/openhuman/update/already implementsversion,check,apply, andrunas JSON-RPC controllers, and the frontend uses them through the Settings → Developer Options UI. The orchestrator's tool list (src/openhuman/agent/agents/orchestrator/agent.toml) carried no update tools, so the agent could not initiate a check or staged upgrade in chat even though every primitive existed at the core layer.Solution
src/openhuman/tools/impl/system/update_check.rs— read-only, closed object schema, callsupdate::rpc::update_check.PermissionLevel::ReadOnly.src/openhuman/tools/impl/system/update_apply.rs—PermissionLevel::Dangerous, requiresuser_confirmed: true, runs the autonomy gate (SecurityPolicy::can_act+record_action), then delegates toupdate::rpc::update_run. The downstream RPC re-checksconfig.update.rpc_mutations_enabledand applies the configuredrestart_strategy.tools::ops::all_tools_with_runtimeso they ship with the default registry.[tools] namedblock (the only agent that needs them today).("update", &["update_check", "update_apply"])row totools::user_filterso the onboarding tool toggles can opt them in/out.update.check/update.applycapability entries inabout_app::catalogto mention the in-chat path and the consent gate, so the capability surface stays the source of truth.The
update_applytool description tells the LLM: confirm viaask_user_clarification, then call again withuser_confirmed: true. That keeps the consent decision visible in the conversation transcript instead of buried in a tool argument.Submission Checklist
diff-cover) meet the gate enforced by.github/workflows/coverage.yml. 11 new tests cover both tools' name/permission/schema, the consent gate (missing arg, false arg), the autonomy gate (read-only mode), and the gate ordering (consent first).## Relateddocs/RELEASE-MANUAL-SMOKE.md) — N/A: orchestrator-only change, Settings flow unchanged.Closes #NNNin the## RelatedsectionImpact
update_checkdoes the same single GitHub Releases request that the existing RPC has always done.Related
update.check,update.apply(descriptions extended inabout_app::catalog).AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
Summary by CodeRabbit
New Features
Tests