English | 中文
A deterministic Android control layer for AI agents.
Autofish gives your agent a stable, explicit, and observable interface to Android devices. It does not plan tasks, orchestrate workflows, or make decisions — that is your agent's job. Autofish handles the last mile: seeing the screen, executing actions, verifying outcomes, and recovering from failures.
human → agent → autofish → Android device
↑ |
└── observe / verify ───┘
Multi-agent orchestration is a rapidly evolving discipline. How agents coordinate, plan, retry, and delegate will converge on best practices and dedicated platforms — much like microservice orchestration converged on Kubernetes. Device control is a different concern and should not be entangled with it.
When planning and execution are fused, the agent loses authority over what actually happens, failures get absorbed instead of surfaced, and adopting a better orchestration strategy means rewriting the entire stack.
Autofish keeps the boundary clean: device control is infrastructure, not application logic. Your agent — or whatever orchestration layer emerges — keeps full authority over planning and decision-making. Autofish provides deterministic device operations, explicit failure signals, and a queryable execution record. When the orchestration story matures, Autofish stays the same.
| Principle | What it means in practice |
|---|---|
| Single responsibility | Android observation and control. No planning, no LLM calls, no workflow engine. |
| Agent-first interface | The CLI and service API are shaped for machine consumption: structured output, deterministic exit codes, composable commands. |
| Deterministic surface | Commands do exactly what they say. No implicit retries, no "smart" heuristics behind the scenes. |
| Closed-loop discipline | Every action must be preceded by observation and followed by verification. The system enforces this through its memory model. |
| Graceful degradation | Multiple execution paths coexist. When preferred capabilities are unavailable, fallback paths keep the system operational. |
| Observable execution | Every action, state transition, and recovery is recorded with structured context — queryable by the agent mid-run. |
Autofish has two components:
Android Service — A foreground service running on the device, exposing an HTTP API. It handles screenshots, UI tree inspection, tap/swipe/text input execution, and on-screen overlay annotations.
CLI (af) — A Rust binary that talks to the service over HTTP. It provides the command interface, manages local tool memory (SQLite), and handles artifact storage. Distributed via npm (@memohjs/af).
af observe page --field screen --field refs --max-rows 80
af act tap --by ref --value @n3
af verify text-contains --text "Wi-Fi"
af memory save --app com.android.settings --topic "nav/home-to-wifi" \
--content "Tap @n3 (Wi-Fi) from main settings page"The CLI maintains a local SQLite database that tracks:
- Events — Every
act,verify, andrecoverwith page fingerprint, status, failure cause, and duration. - Transitions — Automatically closed act → verify pairs with success/failure counters.
- Recovery strategies — Which recovery steps worked for which failure causes.
- Agent notes — Append-only knowledge your agent writes and queries.
- Quickstart — From zero to first command
- CLI Reference — All commands, flags, and output formats
- Architecture — Service internals, refs design, API contracts
- CLI Memory — Data model, recording rules, observation cache
- Changelog — Release history and breaking changes
Requirements: JDK 17, Android SDK API 36, Rust toolchain, just.
just build
just install