Skip to content

fix(parse): AssetResolver/IO input hardening (SEC-3/4/5)#25

Merged
delta9000 merged 2 commits into
mainfrom
sec/io-input-hardening
Jun 27, 2026
Merged

fix(parse): AssetResolver/IO input hardening (SEC-3/4/5)#25
delta9000 merged 2 commits into
mainfrom
sec/io-input-hardening

Conversation

@delta9000

@delta9000 delta9000 commented Jun 27, 2026

Copy link
Copy Markdown
Owner

AssetResolver/IO input-hardening — continues the SEC-1/2 + MEM-1 line. Three parse-layer findings, one commit each, TDD per finding.

  • SEC-5 — gzip decompression-bomb cap. inflateGzip grew its output buffer with an unbounded cap *= 2 loop and seeded the initial reservation from the attacker-controlled gzip ISIZE footer → a tiny bomb or a forged ~4 GiB ISIZE could OOM the process. Added kMaxDecompressedBytes (256 MiB default) + an embedder-tunable maxOut param; clamps the initial reservation and throws when the grow loop would exceed the ceiling. TDD: inflate_bomb_test.
  • SEC-3 — Inline/EXTERNPROTO path-traversal + absolute-file read. The default resolvers built baseUrl + "/" + url and read it verbatim (absolute pass-through, no .. normalization), so url='/etc/passwd' / url='../../etc/passwd' read arbitrary files on load. confineLocalIncludePath (PathConfine.hpp) now rejects absolute urls, canonicalizes, and confines to a root.
  • SEC-4 — defeatable cycle guard. The cycle key was the raw path string, so ./a.x3d vs a.x3d aliased the same file past the guard. The canonical path from SEC-3's helper is now the key, so spelling variants collapse to one.

Confinement policy — configurable root, secure default (ADR-0038)

A strict per-source-subtree confinement broke real content: the Savage example corpus shares prototype libraries via ../../Tools/Authoring/… cross-directory EXTERNPROTOs, which the conformance gate flagged immediately. So the root is a parameter:

  • Default (untrusted input)parseFile(path) confines to the source file's own directory: absolute urls and ../ cross-directory reads are rejected.
  • Trusted content treeparseFile(path, confineRoot) / parseDocument(…, confineRoot) widen the root so ../ references that stay within the tree resolve; absolute paths and escapes above the root stay blocked. The conformance CLI gate parses the corpus this way (root = corpus root), on both validate and convert-roundtrip sides, so the Savage protos resolve and the gate stays green.

The root is set by the outermost parse and held in a thread_local that nested resolver re-parses inherit (one stable root, not narrowed per hop).

SEC-6 (curl SSRF / response caps) is a separate follow-up — it needs the curl/vcpkg build and a security-policy ADR of its own.

Docs (anti-drift, in-diff)

ADR-0038 + coverage.md row/count + mkdocs nav; parse-readers "Input-hardening" section and test list extended.

Verification

mise run ci green (453 ctests, golden/conformance/coverage, cli-gate PASS — no regressions); full ASan/UBSan suite 453/453; strict docs-build green. TDD red→green for each finding (inflate_bomb_test, path_confine_test).

…SEC-5)

parseFile transparently inflates gzip-magic input, but inflateGzip grew its
output buffer with an unbounded `cap *= 2` loop and seeded the initial
reservation from the gzip ISIZE footer — both attacker-controlled. A tiny
"decompression bomb" (small compressed payload that expands to gigabytes) or a
forged ~4 GiB ISIZE could drive unbounded heap growth into an OOM DoS, with no
ceiling anywhere in the parse layer.

Add kMaxDecompressedBytes (RecursionLimits.hpp, default 256 MiB — far above any
legitimate X3D payload) and an inflateGzip `maxOut` parameter defaulting to it
(mirrors MeshBuildOptions.maxWalkVisits so an embedder on a tight memory budget
can lower it). inflate now clamps the initial reservation to the ceiling and
throws when the grow loop would exceed it, instead of doubling toward OOM.

TDD: inflate_bomb_test builds a stored-block gzip and asserts output past the
cap throws while a legitimate stream still inflates; the real gzip round-trip
(reader_test) and the parse suite are unaffected.
…y (SEC-3, SEC-4)

The default localFileProtoResolver / localFileInlineResolver built a path with
`baseUrl + "/" + url` and read it verbatim — an explicit pass-through for
absolute paths and no normalization of `..`. A hostile scene with
`<Inline url='/etc/passwd'/>` or `url='../../../../etc/passwd'` made the parser
read arbitrary files on load (SEC-3). The same resolvers keyed their thread_local
cycle guard on that raw string, so two spellings of one file (`./a.x3d` vs
`a.x3d` vs `sub/../a.x3d`) aliased past the guard and re-expanded (SEC-4).

Add confineLocalIncludePath (runtime/parse/PathConfine.hpp): reject absolute
urls, canonicalize `baseDir/url` with std::filesystem::weakly_canonical
(resolves `.`/`..`/symlinks even when the target doesn't exist), and confine the
result to a root. Both resolvers use its result as the file path AND the
cycle-guard key — the canonical form de-aliases spellings, so SEC-4 falls out of
the SEC-3 fix.

Confinement root is CONFIGURABLE with a secure default (ADR-0038). Real X3D
content uses `../` cross-directory EXTERNPROTO refs (the Savage corpus shares
prototype libraries via ../../Tools/...), so a per-file-subtree confinement would
drop legitimate content — caught by the conformance gate. parseFile(path) and a
direct parseDocument default the root to the source file's own directory (tight,
secure for untrusted input); parseFile(path, confineRoot) / parseDocument(...,
confineRoot) let a tool parsing a TRUSTED tree widen it so `../` within the tree
resolves while absolute paths and escapes above the root stay blocked. The root
is set by the outermost parse and held in a thread_local that nested resolver
re-parses inherit (one stable root, not narrowed per hop).

The conformance CLI gate (tools/x3d-cli/cli_gate.cpp) parses the corpus as a
trusted tree (root = corpus root) on both validate and convert-roundtrip sides,
so the Savage `../../Tools` EXTERNPROTOs resolve symmetrically — gate stays green.

TDD: path_confine_test asserts absolute/escape-above-root rejected, in-tree
includes (incl. `../` under a broad root) resolve, and spelling variants share
one canonical key. Full suite + ASan/UBSan (453 ctests) green.

Docs (in-diff): ADR-0038 + coverage.md row/count + mkdocs nav; parse-readers
Input-hardening section and test list extended (also covers SEC-5).
@delta9000 delta9000 marked this pull request as ready for review June 27, 2026 13:29
@delta9000 delta9000 merged commit 20914af into main Jun 27, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant