[CLI] no in-place sandbox restart or reconcile command; operators must kubectl delete pod after every CR edit


`nemoclaw <name>` subcommands cover `connect`, `status`, `logs`, `snapshot create/restore/list`, `rebuild`, `destroy`, `skill install`, `policy-add/list/remove`. None of them reconcile the current CR into the running pod.

`rebuild` is the closest, but it is documented as "Upgrade sandbox to current agent version": it is heavyweight, touches the image, and is not intended for "the pod spec just changed, please roll the pod." `destroy` removes the sandbox entirely, which is too destructive for this purpose. `snapshot restore` rebuilds from a snapshot, not from the live CR.

Result: when the Sandbox CR spec changes (for example, after a `hostAliases` patch that has no CLI surface; see companion issue), the only way to apply the change to the running pod is:

```bash
docker exec openshell-cluster-nemoclaw kubectl -n openshell delete pod my-assistant
# wait for the operator to recreate the pod from the updated CR
```
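For anyone scripting the workaround in the meantime, a minimal sketch that adds a readiness wait might look like the following. The helper names `kc` and `refresh_pod` are hypothetical, and the sketch assumes the operator recreates the pod under the same name; the cluster container and namespace defaults are taken from the command above.

```bash
# Sketch of the manual workaround plus a readiness wait.
# Assumptions: the operator recreates the pod under the same name, and
# the defaults below match the cluster container and namespace in use.
CLUSTER="${CLUSTER:-openshell-cluster-nemoclaw}"
NAMESPACE="${NAMESPACE:-openshell}"

# Run kubectl inside the cluster container (hypothetical helper name).
kc() {
  docker exec "$CLUSTER" kubectl -n "$NAMESPACE" "$@"
}

# Delete the pod, poll until a replacement exists, then wait for Ready.
# Each phase gets the full timeout, so the worst case is about 2x.
refresh_pod() {
  pod="$1"
  timeout="${2:-300}"
  kc delete pod "$pod" || return 1
  waited=0
  until kc get pod "$pod" >/dev/null 2>&1; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 2
    waited=$((waited + 2))
  done
  kc wait --for=condition=Ready "pod/$pod" --timeout="${timeout}s"
}
```

Calling `refresh_pod my-assistant 300` would then return non-zero if the pod is not recreated or does not become Ready in time. The existence poll is there because `kubectl wait` fails immediately when the named pod does not exist yet.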

### Environment

- NemoClaw v0.0.17, OpenShell v0.0.26
- Sandbox: `my-assistant` on DietPi2 (x86_64, 64GB, MicroK8s)
- Context: added a new `hostAliases` entry to the CR; needed to force the operator to re-roll the pod

### Expected behaviour

```bash
nemoclaw my-assistant restart [--wait]
# or
nemoclaw my-assistant reconcile [--wait]
```

Either command should delete and recreate the pod (preserving the PVC / memory), wait for it to become Ready, and exit 0 once the new pod is serving. Snapshots remain the heavyweight backup / rollback primitive.
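From a caller's point of view, the requested contract could be consumed roughly as follows. This is only an illustration: `restart --wait` is the command proposed in this issue and does not exist in the CLI today, and `reroll_sandbox` is a hypothetical wrapper name.

```bash
# Caller-side sketch for the PROPOSED command; `restart --wait` does not
# exist in the CLI today, so this only illustrates the requested contract.
reroll_sandbox() {
  name="$1"
  if nemoclaw "$name" restart --wait; then
    echo "sandbox $name is serving again"
  else
    rc=$?   # exit status of the failed nemoclaw call
    echo "sandbox $name did not become Ready (exit $rc)" >&2
    return "$rc"
  fi
}
```

The point is that a single exit code would be enough for scripts to decide whether to continue, with no knowledge of the cluster name or namespace.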

### Actual behaviour

No such command. Operators who need an in-place pod refresh must `kubectl delete pod` inside the k3s container, which (a) the docs discourage and (b) requires knowing the cluster name and the namespace layout.

### Impact

- CR-edit workflows end with a kubectl detour on the docker-exec path.
- Scripting a "hot-reload policy + DNS" workflow has no supported entrypoint.
- Diagnostic workflows (restart the pod, watch the fresh boot logs) currently require several manual steps.

### Suggested fix

Add `nemoclaw <name> restart` that:

1. Deletes the pod via the operator's authority, not direct kubectl.
2. Waits for the operator to recreate it.
3. Waits for the new pod to reach `Ready`.
4. Optionally prints the boot log tail.
5. Exits non-zero if anything times out.

Accept `--wait-timeout <seconds>` with a sensible default (300s).
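The control flow of the steps above can be sketched with a deadline-bounded poll. Everything here is an assumption for illustration rather than NemoClaw code: `wait_until` and `restart_flow` are hypothetical names, and `delete_pod`, `pod_exists`, and `pod_ready` stand in for whatever hooks the real implementation would supply. Step 4 (the optional log tail) is omitted, and the distinct exit code per phase is one possible design, not something the issue prescribes.

```bash
# Generic deadline-bounded poll (hypothetical helper, not NemoClaw code).
# Usage: wait_until <timeout-seconds> <interval-seconds> <command...>
# Returns 0 as soon as the command succeeds, 1 once the deadline passes.
wait_until() {
  timeout="$1"; interval="$2"; shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 0
}

# Steps 1-3 and 5, expressed against three hypothetical hooks that the
# real implementation would supply: delete_pod, pod_exists, pod_ready.
restart_flow() {
  name="$1"; budget="${2:-300}"
  delete_pod "$name" || return 1                         # step 1
  wait_until "$budget" 2 pod_exists "$name" || return 2  # step 2
  wait_until "$budget" 2 pod_ready "$name" || return 3   # step 3
  return 0                                               # step 5: non-zero above
}
```

Returning a different code for each phase would let a calling script tell "the operator never recreated the pod" apart from "the pod came back but never became Ready".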

### Companion issues

Third in a cluster of three CLI-gap filings:

- #2039: `nemoclaw policy-add` has no custom-preset surface
- #2040: no CLI surface for sandbox `hostAliases` management
- this issue: no in-place sandbox restart / reconcile command

All three describe the same pattern: the supported CLI covers the common cases, but custom-infrastructure edits drop below the supported surface. #2039 covers the egress-policy axis, #2040 covers the DNS axis, and this issue covers the reconcile axis. A workflow that edits the Sandbox CR (for example to add a `hostAliases` entry via #2040's workaround) typically needs #2039's policy-edit and this issue's reconcile-pod step to complete end-to-end.

### Notes

Filed on 2026-04-17 alongside [#2039](https://github.com/NVIDIA/NemoClaw/issues/2039) and [#2040](https://github.com/NVIDIA/NemoClaw/issues/2040).


[CLI] no in-place sandbox restart or reconcile command; operators must kubectl delete pod after every CR edit #2041
