fix(parse): AssetResolver/IO input hardening (SEC-3/4/5)#25
Merged
Conversation
…SEC-5) parseFile transparently inflates gzip-magic input, but inflateGzip grew its output buffer with an unbounded `cap *= 2` loop and seeded the initial reservation from the gzip ISIZE footer — both attacker-controlled. A tiny "decompression bomb" (small compressed payload that expands to gigabytes) or a forged ~4 GiB ISIZE could drive unbounded heap growth into an OOM DoS, with no ceiling anywhere in the parse layer. Add kMaxDecompressedBytes (RecursionLimits.hpp, default 256 MiB — far above any legitimate X3D payload) and an inflateGzip `maxOut` parameter defaulting to it (mirrors MeshBuildOptions.maxWalkVisits so an embedder on a tight memory budget can lower it). inflate now clamps the initial reservation to the ceiling and throws when the grow loop would exceed it, instead of doubling toward OOM. TDD: inflate_bomb_test builds a stored-block gzip and asserts output past the cap throws while a legitimate stream still inflates; the real gzip round-trip (reader_test) and the parse suite are unaffected.
…y (SEC-3, SEC-4) The default localFileProtoResolver / localFileInlineResolver built a path with `baseUrl + "/" + url` and read it verbatim — an explicit pass-through for absolute paths and no normalization of `..`. A hostile scene with `<Inline url='/etc/passwd'/>` or `url='../../../../etc/passwd'` made the parser read arbitrary files on load (SEC-3). The same resolvers keyed their thread_local cycle guard on that raw string, so two spellings of one file (`./a.x3d` vs `a.x3d` vs `sub/../a.x3d`) aliased past the guard and re-expanded (SEC-4). Add confineLocalIncludePath (runtime/parse/PathConfine.hpp): reject absolute urls, canonicalize `baseDir/url` with std::filesystem::weakly_canonical (resolves `.`/`..`/symlinks even when the target doesn't exist), and confine the result to a root. Both resolvers use its result as the file path AND the cycle-guard key — the canonical form de-aliases spellings, so SEC-4 falls out of the SEC-3 fix. Confinement root is CONFIGURABLE with a secure default (ADR-0038). Real X3D content uses `../` cross-directory EXTERNPROTO refs (the Savage corpus shares prototype libraries via ../../Tools/...), so a per-file-subtree confinement would drop legitimate content — caught by the conformance gate. parseFile(path) and a direct parseDocument default the root to the source file's own directory (tight, secure for untrusted input); parseFile(path, confineRoot) / parseDocument(..., confineRoot) let a tool parsing a TRUSTED tree widen it so `../` within the tree resolves while absolute paths and escapes above the root stay blocked. The root is set by the outermost parse and held in a thread_local that nested resolver re-parses inherit (one stable root, not narrowed per hop). The conformance CLI gate (tools/x3d-cli/cli_gate.cpp) parses the corpus as a trusted tree (root = corpus root) on both validate and convert-roundtrip sides, so the Savage `../../Tools` EXTERNPROTOs resolve symmetrically — gate stays green. TDD: path_confine_test asserts absolute/escape-above-root rejected, in-tree includes (incl. `../` under a broad root) resolve, and spelling variants share one canonical key. Full suite + ASan/UBSan (453 ctests) green. Docs (in-diff): ADR-0038 + coverage.md row/count + mkdocs nav; parse-readers Input-hardening section and test list extended (also covers SEC-5).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
AssetResolver/IO input-hardening — continues the SEC-1/2 + MEM-1 line. Three parse-layer findings, one commit each, TDD per finding.
inflateGzipgrew its output buffer with an unboundedcap *= 2loop and seeded the initial reservation from the attacker-controlled gzip ISIZE footer → a tiny bomb or a forged ~4 GiB ISIZE could OOM the process. AddedkMaxDecompressedBytes(256 MiB default) + an embedder-tunablemaxOutparam; clamps the initial reservation and throws when the grow loop would exceed the ceiling. TDD:inflate_bomb_test.baseUrl + "/" + urland read it verbatim (absolute pass-through, no..normalization), sourl='/etc/passwd'/url='../../etc/passwd'read arbitrary files on load.confineLocalIncludePath(PathConfine.hpp) now rejects absolute urls, canonicalizes, and confines to a root../a.x3dvsa.x3daliased the same file past the guard. The canonical path from SEC-3's helper is now the key, so spelling variants collapse to one.Confinement policy — configurable root, secure default (ADR-0038)
A strict per-source-subtree confinement broke real content: the Savage example corpus shares prototype libraries via
../../Tools/Authoring/…cross-directory EXTERNPROTOs, which the conformance gate flagged immediately. So the root is a parameter:parseFile(path)confines to the source file's own directory: absolute urls and../cross-directory reads are rejected.parseFile(path, confineRoot)/parseDocument(…, confineRoot)widen the root so../references that stay within the tree resolve; absolute paths and escapes above the root stay blocked. The conformance CLI gate parses the corpus this way (root = corpus root), on both validate and convert-roundtrip sides, so the Savage protos resolve and the gate stays green.The root is set by the outermost parse and held in a
thread_localthat nested resolver re-parses inherit (one stable root, not narrowed per hop).SEC-6 (curl SSRF / response caps) is a separate follow-up — it needs the curl/vcpkg build and a security-policy ADR of its own.
Docs (anti-drift, in-diff)
ADR-0038 + coverage.md row/count + mkdocs nav; parse-readers "Input-hardening" section and test list extended.
Verification
mise run cigreen (453 ctests, golden/conformance/coverage, cli-gate PASS — no regressions); full ASan/UBSan suite 453/453; strictdocs-buildgreen. TDD red→green for each finding (inflate_bomb_test,path_confine_test).