
ACP: permission replies dropped (tool call hangs) after session/load with a different cwd #31964

@ivanmisuno

Description

After loading an existing ACP session with a different cwd than the one it was created
with (session/new at dir A → session/load same sessionId at dir B), the agent enters a
split-brain state:

  1. Prompts still execute in the context rooted at the creation cwd (dir A) — e.g. a
    read of a file inside the loaded cwd (dir B) triggers an external_directory
    permission request, which only makes sense relative to dir A.
  2. The client's session/request_permission reply is silently dropped: the response is
    delivered on the wire (correct JSON-RPC id, correct
    {"outcome": {"outcome": "selected", "optionId": "always"}} shape), but opencode never
    processes it, the tool call never resolves, and session/prompt hangs indefinitely.
    No error is surfaced anywhere. Replying "once" instead of "always" behaves identically.

The identical ask→reply exchange works fine (reply processed in ~10 ms, tool runs, turn
completes) when the session is used at the cwd it was created with — so this is specific to
the load-with-changed-cwd path, not to permission handling in general.

ACP session/load accepting a cwd parameter suggests rebinding is intended to be
supported; if it isn't, an error from session/load would also be fine — anything but a
silent hang.

Where we believe it breaks (from reading the v1.17.3 source, commit 8c80113):

  • packages/opencode/src/acp/session.ts — ACP-side load stores the session with the new
    cwd, but the server-side session keeps executing in the instance rooted at its creation
    directory.
  • packages/opencode/src/acp/permission.ts: Handler.process() forwards the client's
    reply via sdk.permission.reply({requestID, reply, directory: session.cwd}), i.e. the
    new cwd.
  • packages/sdk/js/src/v2/gen/sdk.gen.ts: permission.reply sends directory as a query
    param, which workspace-routing / instance-context middleware resolve to the new
    cwd's instance.
  • packages/opencode/src/permission/index.ts — pending permissions live in per-instance
    InstanceState; the reply lands in the wrong instance, pending.get(requestID) misses,
    and Permission.NotFoundError is raised…
  • …and swallowed: Handler.handle() in acp/permission.ts wraps processing in
    .catch(() => {}), so the failure is invisible and the asking fiber's Deferred waits
    forever.

Suggested fix directions (any one of these would resolve it; listed by how targeted they
are):

  1. Reply with the directory the ask originated in rather than the ACP session's current
    cwd — e.g. surface directory on the permission.asked event properties (the server-side
    bus envelope already carries it) and use it in Handler.reply(). This is exactly what
    merged PR #30851 ("fix(tui): route permission replies to session directory") did for the
    TUI prompt — the ACP handler needs the analogous change.
  2. Resolve /permission/:requestID/reply across instances — request IDs are unique, so the
    directory-scoped lookup is only load-bearing in the broken case.
  3. Make ACP session/load actually rebind the server-side session to the new cwd, or reject
    the call loudly when it can't.
  4. Independent hardening: don't .catch(() => {}) around permission processing — log and
    reject the pending tool call so a routing failure fails the turn instead of hanging it.
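
Directions 1 and 4 can be sketched together in Python. This only illustrates the idea under assumptions: PermissionRouter, handle_reply and the tool_call future are hypothetical names, and the real change would live in the TypeScript ACP handler rather than in anything shaped like this.

```python
import asyncio
import logging

log = logging.getLogger("acp.permission")


class PermissionRouter:
    """Hypothetical router that remembers the directory each ask came from."""

    def __init__(self):
        self.pending = {}  # (directory, requestID) -> Future awaited by the ask
        self.origin = {}   # requestID -> directory the ask was raised in

    def ask(self, directory, request_id):
        fut = asyncio.get_running_loop().create_future()
        self.pending[(directory, request_id)] = fut
        self.origin[request_id] = directory
        return fut

    def reply(self, request_id, answer, session_cwd):
        # Direction 1: prefer the ask's own directory over the session's cwd.
        directory = self.origin.get(request_id, session_cwd)
        fut = self.pending.pop((directory, request_id), None)
        if fut is None:
            raise LookupError(f"no pending permission {request_id!r} in {directory!r}")
        self.origin.pop(request_id, None)
        fut.set_result(answer)


def handle_reply(router, tool_call, request_id, answer, session_cwd):
    """Direction 4: log a routing failure and fail the tool call, never swallow it."""
    try:
        router.reply(request_id, answer, session_cwd)
    except LookupError as err:
        log.error("permission reply could not be routed: %s", err)
        if not tool_call.done():
            tool_call.set_exception(err)


async def demo():
    loop = asyncio.get_running_loop()
    router = PermissionRouter()

    # Ask raised in dir A, reply arrives with the session's current cwd (dir B).
    asked = router.ask("/dir-a", "req-0")
    handle_reply(router, loop.create_future(), "req-0", "always", "/dir-b")
    routed = await asyncio.wait_for(asked, timeout=0.1)

    # A reply that cannot be routed fails its tool call instead of hanging it.
    failed_call = loop.create_future()
    handle_reply(router, failed_call, "req-unknown", "always", "/dir-b")
    return routed, type(failed_call.exception()).__name__


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With the origin directory recorded at ask time, the session's current cwd is only a fallback, and a miss becomes a visible tool-call failure.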

Related issues (same root-cause class — permission reply landing on a different
instance/directory than the pending ask — on other surfaces; none covers the ACP path):

Plugins

None

OpenCode version

1.17.3 (Homebrew). Worked on pre-ACP-rewrite versions (e.g. 1.2.x); first broken version not bisected — suspected introduced with the ACP "next" implementation promotion (#29929).

Steps to reproduce

Self-contained ACP client (Python 3, stdlib only). It spawns opencode acp, creates a
session in temp dir A, loads it with cwd = dir B, then prompts a read of a file inside B and
auto-replies "always" to the permission request:

#!/usr/bin/env python3
"""Repro: opencode ACP drops permission replies after session/load with a changed cwd.

Usage: python3 repro.py        # broken: new(A) + load(B)  -> hangs
       python3 repro.py direct # control: new(B) only      -> completes
"""
import asyncio, json, os, sys, tempfile
from pathlib import Path

OPENCODE = os.environ.get("OPENCODE_BIN", "opencode")  # beware stale installs on PATH
MODEL = "opencode-go/minimax-m3"  # any working model


async def main():
    direct = len(sys.argv) > 1 and sys.argv[1] == "direct"
    # realpath matters on macOS: /var/folders -> /private/var/folders
    dir_a = os.path.realpath(tempfile.mkdtemp(prefix="repro-a-"))
    dir_b = os.path.realpath(tempfile.mkdtemp(prefix="repro-b-"))
    target = Path(dir_b) / "docs" / "prd.md"
    target.parent.mkdir()
    target.write_text("# Hello\n")

    proc = await asyncio.create_subprocess_exec(
        OPENCODE, "acp",
        stdin=asyncio.subprocess.PIPE, stdout=asyncio.subprocess.PIPE,
    )
    pending = {}

    async def send(msg):
        print("SEND", json.dumps(msg)[:300])
        proc.stdin.write((json.dumps(msg) + "\n").encode())
        await proc.stdin.drain()

    async def request(rid, method, params):
        fut = asyncio.get_event_loop().create_future()
        pending[rid] = fut
        await send({"jsonrpc": "2.0", "id": rid, "method": method, "params": params})
        return await fut

    async def read_loop():
        while line := await proc.stdout.readline():
            msg = json.loads(line)
            print("RECV", json.dumps(msg)[:300])
            if "method" in msg and "id" in msg:  # agent->client request
                if msg["method"] == "session/request_permission":
                    await send({"jsonrpc": "2.0", "id": msg["id"], "result":
                                {"outcome": {"outcome": "selected", "optionId": "always"}}})
            elif "id" in msg and msg["id"] in pending:
                pending.pop(msg["id"]).set_result(msg)

    asyncio.ensure_future(read_loop())

    await request(1, "initialize", {"protocolVersion": 1,
        "clientInfo": {"name": "repro", "title": "Repro", "version": "1.0"},
        "clientCapabilities": {"fs": {"readTextFile": False, "writeTextFile": False},
                               "terminal": False}})
    if direct:
        r = await request(2, "session/new", {"cwd": dir_b, "mcpServers": []})
        sid = r["result"]["sessionId"]
    else:
        r = await request(2, "session/new", {"cwd": dir_a, "mcpServers": []})
        sid = r["result"]["sessionId"]
        await request(3, "session/load", {"sessionId": sid, "cwd": dir_b, "mcpServers": []})
    await request(4, "session/set_config_option",
                  {"sessionId": sid, "configId": "model", "value": MODEL})
    try:
        await asyncio.wait_for(request(5, "session/prompt", {"sessionId": sid, "prompt":
            [{"type": "text", "text": f"Use the read tool to read {target} and reply with "
                                      "its first line. Do nothing else."}]}), 120)
        print("PROMPT COMPLETED")
    except asyncio.TimeoutError:
        print("!!! PROMPT TIMED OUT — hang reproduced")
    proc.terminate()


asyncio.run(main())

Observed transcript of the broken run (trimmed; exactly this script against opencode 1.17.3,
dirs realpath'd so the symlink artifact below is not a factor):

SEND session/new        cwd=/private/var/.../T/repro-a-peg8w3dl             -> ses_1474bbb77ffe...
SEND session/load       sessionId=ses_1474bbb77ffe...  cwd=/private/var/.../T/repro-b-km2vipwu   -> ok
SEND session/prompt     "read /private/var/.../T/repro-b-km2vipwu/docs/prd.md"
RECV tool_call          read (pending), then in_progress
RECV session/request_permission  id=0  title=external_directory
                        filepath=/private/var/.../T/repro-b-km2vipwu/docs/prd.md
                        # NB: the file is INSIDE the loaded cwd — "external" only
                        # relative to the session's CREATION cwd (dir A)
SEND {"jsonrpc":"2.0","id":0,"result":{"outcome":{"outcome":"selected","optionId":"always"}}}
... nothing. 120 s later the client gives up; the tool never executes, the prompt never returns.

Control (direct mode, session created at dir B): the same read asks no permission at all
and the prompt completes. Pointing the read at a file outside dir B instead (one-line
change) produces the same external_directory ask, and the "always" reply is processed in
~10 ms with the tool then running — so the ask→reply mechanism itself is healthy; only the
load-with-changed-cwd path misroutes the reply.

Also observed in our real client (Zed-style ACP integration) on macOS and in a linux/amd64
container — same wire pattern, including the case of two parallel reads producing permission
requests id 0 and id 1, both answered, both dropped.

Unrelated minor observation noticed while reducing this: if the session/new cwd is
passed un-realpath'd through a symlink (macOS /var/folders/...), reads of in-project files
via their realpath also trigger external_directory — path containment appears to compare a
realpath'd instance directory against the raw tool path. Looks like the macOS flavor of
#27601 (external_directory not resolving symlinked directories).
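
The suspected containment mismatch can be illustrated in isolation. The sketch below is not opencode's check: contains and contains_resolved are hypothetical helpers showing how a lexical prefix comparison fails when one side goes through a symlink and the other does not, and how resolving both sides first avoids that.

```python
import os
import tempfile


def contains(root, path):
    """Lexical check: True when path is root itself or lies beneath it."""
    root = os.path.normpath(root)
    path = os.path.normpath(path)
    return os.path.commonpath([root, path]) == root


def contains_resolved(root, path):
    """Same check after resolving symlinks on both sides."""
    return contains(os.path.realpath(root), os.path.realpath(path))


def demo():
    base = os.path.realpath(tempfile.mkdtemp(prefix="contain-"))
    real = os.path.join(base, "real")
    os.makedirs(os.path.join(real, "docs"))
    link = os.path.join(base, "link")
    os.symlink(real, link)  # "link" plays the role of the symlinked project dir

    file_via_realpath = os.path.join(real, "docs", "prd.md")
    mixed = contains(link, file_via_realpath)              # one side unresolved
    resolved = contains_resolved(link, file_via_realpath)  # both sides resolved
    return mixed, resolved


if __name__ == "__main__":
    print(demo())
```

The mixed comparison reports the file as outside the project even though it is the same file on disk, which matches the spurious external_directory ask described above.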

Screenshot and/or share link

N/A — headless ACP client; full wire transcripts above.

Operating System

macOS 26 (Darwin 25.5.0), arm64; also reproduced in a linux/amd64 container (Cloud Run).

Terminal

N/A (programmatic ACP client over stdio).
