fix: bound graph-walk fan-out against the doubling-DAG DoS (#21)#23
Merged
Conversation
Approved design for the doubling-DAG DoS follow-up: a WalkBudget node-visit cap on the per-path emitters (extractor/pick/light) with a graceful budgetExceeded signal, plus visited-set memoization for the worldOfRec search. Refs #21
… DoS (#21) The MEM-1 cycle+depth guards are crash-proof but not DoS-proof: an acyclic 'doubling DAG' (G0->[G1,G1], G1->[G2,G2], ...) has no back-edge and stays shallow, yet its leaf is reachable via 2^depth distinct paths, so the per-path graph walks fan out exponentially (CPU/memory DoS). The fix differs by walker semantics: - worldOf is a first-found SEARCH -> visited-set memoization. Pruning already-explored subtrees never changes the first-found result but drops O(paths) to O(nodes), so it SUCCEEDS on a legitimately wide DAG. - SceneExtractor::walk, LightSystem::walk and PickSystem::pickNode emit/test PER PATH (intentional for USE-instancing), so they cannot memoize. Add a WalkBudget node-visit cap (RecursionLimits.hpp, default kMaxGraphWalkVisits = 1e6, overridable via MeshBuildOptions.maxWalkVisits). On exhaustion the walk stops and latches a graceful, non-throwing signal: SceneExtractor:: budgetExceeded() (one budget shared across the snapshot's light collection + geometry walk) and PickResult.budgetExceeded. Tests (walker_budget_test.cpp): a 2^45 DAG makes worldOf(absent) terminate instantly (vs non-terminating pre-fix) and worldOf(present) still returns the correct transform; a 2^22 DAG trips the extractor/light/pick budgets and bounds the emitted item count; a small scene never trips it. Codegen-free; golden unaffected. Refs #21
Living docs in the same change as the #21 fix (anti-drift): - ADR-0037: the traversal-budget decision (WalkBudget cap on the per-path emitters + worldOf memoization), extending ADR-0016 from crash-proof to DoS-bounded. Embedder-visible contract -> binding ADR. - coverage.md: ADR-0037 row + count 36->37 (coverage-gate green) + narrative. - mkdocs.yml: ADR-0037 nav entry (docs-build --strict green). - extract.md: budgetExceeded()/maxWalkVisits in the exposed interface + ADR link. Refs #21
…view) Adversarial review caught a real correctness bug: worldOfRec marked a node visited at entry, but a node first reached near the MEM-1 depth cap has its subtree truncated (children at depth>=1000 return without exploring). Marking it then wrongly pruned a SHALLOWER path on which the target IS findable -> worldOf returned identity instead of the true transform (a bounded-wrong camera/viewpoint frame on an adversarial ~1000-deep graph; the prior "provably not under it" comment was false). Fix: propagate an "incomplete" flag and memoize a node ONLY when its subtree was fully explored (no descendant hit the depth cap). The wide-DAG fast path is unchanged (a shallow doubling DAG never truncates). New regression worldof_depth_truncated_subtree_is_not_falsely_memoized pins it. Also from review (minor): correct the extractor-budget test's decorative itemCount assertion -- the doubling DAG reuses node pointers so all paths share one PathKey and emit() interns to <=1 item (light collection, sharing the budget, can consume it first); budgetExceeded() is the load-bearing signal. Clarify the kMaxGraphWalkVisits comment (counts node-visits, shared across light+geometry, not RenderItems). Refs #21
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Bound graph-walk fan-out against the doubling-DAG DoS
Closes #21
Follow-up to the MEM-1 hardening (#20): the cycle+depth guards made the
extractor/pick/light walks crash-proof, but not DoS-proof against a
legitimate acyclic graph that fans out exponentially.
A "doubling DAG" —
G0 children [G1,G1],G1 children [G2,G2], … ~30 deep, aShape at the leaf — has no containment back-edge (so
breakContainmentCyclesleaves it intact) and stays far under the depth cap, yet the leaf is reachable
via
2^30distinct paths. Per-path traversal then explodes into a CPU/memory DoS.The fix differs by walker semantics
Per-path emission is intentional for USE-instancing (a node USE'd under N
placements legitimately yields N RenderItems/LightDescs/pick candidates, each in
its own world frame), so the emitters cannot be memoized — they must be
bounded.
worldOfis the exception: a first-found search, fixed bymemoization.
SceneExtractor::walkWalkBudgetnode-visit cap →budgetExceeded()LightSystem::walkbudgetExceeded()PickSystem::pickNodeWalkBudget(cap viapickClosest(..., maxVisits)) →PickResult.budgetExceededPickSystem::worldOfRecWalkBudget+kMaxGraphWalkVisits(default1'000'000, overridable viaMeshBuildOptions.maxWalkVisits) live inruntime/RecursionLimits.hpp. Overflowis a graceful, non-throwing signal (the extractor/pick are read paths), so a
capped walk returns a partial result plus the flag for an embedder to react.
Definition of Done
worldOf(absent)on a 2⁴⁵ DAG is a runtime RED (non-terminatingpre-fix; instant post-memoization). The budget cases assert
budgetExceeded+bounded
itemCount(); a small scene never trips the cap.mise run cigreen (build/golden/conformance/coverage/cli-gate).extends ADR-0016 from crash-proof to DoS-bounded),
coverage.mdrow + count,mkdocs.ymlnav,extract.mdinterface note. Design spec underdocs/superpowers/specs/.NOTICEunchanged.Code review
A fresh-context adversarial review verified the budget machinery (latch/off-by-one,
reset placement, shared-budget threading, MEM-1 ordering) and caught one real
correctness bug, now fixed:
worldOfRecmarked a nodevisitedat entry, buta node first reached near the MEM-1 depth cap has its subtree truncated — marking
it then wrongly pruned a shallower path on which the target was findable, returning
identity. The fix memoizes a node only when its subtree was fully explored (an
incompleteflag propagates depth-truncation up), pinned by a new regression(
worldof_depth_truncated_subtree_is_not_falsely_memoized). Two review minors alsoapplied: the extractor test's decorative
itemCountassertion was corrected (thedoubling DAG reuses node pointers → one interned
PathKey;budgetExceeded()isthe load-bearing signal), and the cap comment now states it counts node-visits
(shared across light + geometry), not RenderItems.
New test
runtime/scene/tests/walker_budget_test.cpp— worldOf memoization (absentterminates / present correct / depth-truncated subtree not falsely memoized),
extractor + light + pick budget trips on a doubling DAG, and a no-trip guard for
a legitimate small scene.