AgentBoundary v0.1 conformance evaluation of AGT — pre-publication review

Hi @imran-siddique and AGT team —

Following up on the conversation in #302 about PolicyDecision schema interop and drop-in compatibility across backends (APS, AGT, YAML), I wanted to share a piece I've been building in parallel and give the AGT team a 7-day right-to-respond window before publication.

I run JamJet Labs and have been authoring an open spec for AI-action receipts called **AgentBoundary** (`jamjet-labs/agentboundary`, v0.1 stable + v0.2-alpha draft). Where the PolicyDecision interop work focuses on the *decision* surface, AgentBoundary focuses on the downstream *receipt* surface — a portable, tamper-evident JSON record of what the action turned out to be, that a third party can verify without trusting the runtime.

I built a 40-scenario conformance suite and graded four prominent agent-governance products against it, including AGT. AGT scored highest of the four.

**What I did:**

- Read `docs/specs/AUDIT-COMPLIANCE-1.0.md` and `docs/ARCHITECTURE.md`
- Built an adapter at [`adapters/microsoft-agt/`](https://github.com/jamjet-labs/agentboundary/tree/main/adapters/microsoft-agt) that translates AGT `AuditEntry` (+ optional `DecisionBOM` + workflow approval event) into an AgentBoundary v0.2-alpha receipt
- Ran all 40 conformance scenarios against adapter-translated receipts
- Recorded per-scenario verdicts in [`results.md`](https://github.com/jamjet-labs/agentboundary/blob/main/adapters/microsoft-agt/results.md) and the field-by-field mapping in [`mapping.md`](https://github.com/jamjet-labs/agentboundary/blob/main/adapters/microsoft-agt/mapping.md)
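
To give a feel for the shape of the translation step, here is a minimal sketch rather than the adapter's actual code. It merges the three inputs named above into one receipt-shaped dict. Every key name except `completeness_score` is a placeholder I made up for illustration; the real mapping is the one in `mapping.md`.

```python
from typing import Optional


def translate(audit_entry: dict,
              decision_bom: Optional[dict] = None,
              approval_event: Optional[dict] = None) -> dict:
    """Merge up to three AGT-side records into one receipt-shaped dict.

    Key names such as "action", "timestamp", and "status" are hypothetical
    and are not taken from the AGT or AgentBoundary schemas.
    """
    receipt = {
        "action": audit_entry.get("action"),
        "issued_at": audit_entry.get("timestamp"),
        "decision": None,
        "approval": None,
    }
    if decision_bom is not None:
        receipt["decision"] = {
            "completeness_score": decision_bom.get("completeness_score"),
        }
    if approval_event is not None:
        receipt["approval"] = {"status": approval_event.get("status")}
    # List the receipt fields that stayed empty, so a grader can tell
    # "absent in the source record" apart from "present but false".
    receipt["unmapped"] = sorted(k for k, v in receipt.items() if v is None)
    return receipt


entry_only = translate({"action": "send_email", "timestamp": "2025-01-01T00:00:00Z"})
with_bom = translate({"action": "send_email", "timestamp": "2025-01-01T00:00:00Z"},
                     decision_bom={"completeness_score": 0.8})
```

Tracking unmapped fields explicitly is one way to keep a missing optional input from silently passing a scenario.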

**Headline:**

```
PASS         17
PARTIAL       5
DOCS-ONLY     1
NOT COVERED  15
N/A           2
──────────────
TOTAL        40
```

The 15 NOT COVERED rows reflect AGT-side schema gaps where the `AuditEntry` doesn't carry data AgentBoundary requires:

- No normative `arguments_hash` (mutation defense)
- No approver identity in the audit row (approval-chain verification)
- No policy version field (downgrade defense)
- Single timestamp per entry (no issued-vs-completed split)
- No `environment` field (prod/staging/dev distinction)
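
To illustrate what an `arguments_hash` buys for mutation defense, here is a minimal sketch. It assumes SHA-256 over sorted-key, compact JSON; that canonicalization is my assumption for the example and should not be read as the normative AgentBoundary encoding. The function name and the sample arguments are likewise hypothetical.

```python
import hashlib
import json


def arguments_hash(arguments: dict) -> str:
    """Hash tool-call arguments in a key-order-independent way.

    Assumption: canonical form is JSON with sorted keys and no extra
    whitespace. A real spec may define a different canonicalization.
    """
    canonical = json.dumps(arguments, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


approved = {"to": "acct-123", "amount": 100}
executed = {"amount": 100, "to": "acct-123"}   # same call, different key order
tampered = {"to": "acct-123", "amount": 1000}  # arguments mutated after approval

same_call = arguments_hash(approved) == arguments_hash(executed)
mutation_detected = arguments_hash(approved) != arguments_hash(tampered)
```

A verifier holding only the receipt can then check that the executed arguments match what was approved, without seeing the runtime.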

Each maps to a known design choice on AGT's side: deliberate scoping, not a bug. The framing in my report is that AGT and AgentBoundary's design centres are complementary: runtime enforcement + decision-lineage reconstruction (AGT) vs portable third-party verification of receipts (AgentBoundary). Two different layers; same compliance picture.

**Two things AGT does better than AgentBoundary v0.2-alpha** — both v0.3 adoption candidates on my side:

1. **Merkle chain across actions** (`previous_hash`). v0.2-alpha is singly-linked (`prior_receipt`), which is weaker against attacks that reorder arbitrary entries. AGT's approach is structurally stronger.
2. **`DecisionBOM.completeness_score` with per-`BOMField` reconstruction confidence.** v0.2-alpha has a coarser three-tier provenance enum (`observed` / `inferred` / `synthesized`). A numeric confidence per field is meaningfully richer.
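
For readers who want the reordering point made concrete, here is a minimal sketch of a `previous_hash` chain. It assumes SHA-256 over sorted-key JSON and a hypothetical entry shape with a `payload` field; it is not AGT's actual `AuditEntry` layout or hashing scheme.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder predecessor hash for the first entry


def entry_hash(entry: dict) -> str:
    """SHA-256 over a sorted-key JSON encoding of the whole entry."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Add an entry that commits to the hash of its predecessor."""
    previous = entry_hash(chain[-1]) if chain else GENESIS
    chain.append({"payload": payload, "previous_hash": previous})


def verify(chain: list) -> bool:
    """Walk the chain and check every previous_hash link."""
    expected = GENESIS
    for entry in chain:
        if entry["previous_hash"] != expected:
            return False
        expected = entry_hash(entry)
    return True


chain: list = []
for action in ("read_file", "send_email", "delete_file"):
    append(chain, {"action": action})

reordered = [chain[0], chain[2], chain[1]]  # swap the last two entries
edited = [dict(e) for e in chain]
edited[1] = {"payload": {"action": "noop"},
             "previous_hash": chain[1]["previous_hash"]}
```

Here `verify(chain)` succeeds, while both `verify(reordered)` and `verify(edited)` fail because a later entry no longer matches the hash of its predecessor. Changes to the final entry are only detectable if the head hash is anchored somewhere outside the chain.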

Both feed back into the PolicyDecision interop work cleanly — if AGT entries can map round-trip to AgentBoundary receipts (and vice versa), the receipt format becomes another axis of the drop-in replacement story #302 was about.

**The ask:** if any per-scenario mapping or factual claim is wrong, corrections are welcome via this issue or via PR to `jamjet-labs/agentboundary` within 7 days of this post. After that, the report publishes with the data as currently mapped.

Happy to discuss here or directly. The draft report is private until publication, but I can share §7.4 (the AGT section, ~600 words) in advance if anyone on the team wants a preview.

Thanks for shipping AGT — it raised the bar for everyone in this space.

— Sunil

