
Audit static Regex API usage against the bounded-regex policy #4072

Description

@Widthdom

Summary

Dogfood review found 397 Regex. API hits across 90 production files. Many are likely calls to the repository's bounded wrapper or to generated/precompiled patterns, but the count is high enough to warrant auditing whether any static/raw BCL regex paths bypass the timeout policy.

Evidence

Dogfood command:

dotnet ./src/CodeIndex/bin/Debug/net8.0/cdidx.dll search --recipe dogfood-risk-patterns/static-regex-api --path src/ --exclude-tests --count-by file --limit 140

Top files:

  • SqlReferenceExtractor: 34
  • LanguageReferenceExtractionSupport: 33
  • SymbolExtractor: 25
  • PythonReferenceExtractor: 18
  • RReferenceExtractor: 16
  • SymbolExtractor.JavaScriptTypeScriptSupport: 16
  • SymbolExtractor.CSharpScanner: 13
  • PhpReferenceExtractor: 12
  • RustReferenceExtractor: 11
  • SymbolExtractor.Markup: 10
  • ReferenceExtractor.Core: 9
  • CSharpReferenceExtractor.Support: 8
  • ReferenceExtractor, SymbolExtractor.Go, SymbolExtractor.Php: 7 each

Related positive evidence:

Audit goals

  • Distinguish wrapper-backed CodeIndex.Indexer.BoundedRegex calls from raw BCL static calls.
  • Confirm all untrusted or large-input regex matching has an explicit timeout or shared bounded policy.
  • Confirm generated/precompiled or fixed small-input patterns are documented as safe.
  • Add a command or analyzer-friendly convention if the alias makes audits ambiguous.

Acceptance criteria

  • Static regex hits are classified as bounded-wrapper, raw BCL with timeout, generated/precompiled, trusted small input, or fix-needed.
  • Fix-needed raw BCL calls are moved to bounded helpers or explicit timeout overloads.
  • Tests cover at least one timeout/large-input behavior path if code changes are needed.
  • Documentation or code comments clarify the intended bounded-regex convention.
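
A first pass over the hits could be scripted roughly as below. This is a minimal sketch, not the repository's recipe: it assumes the alias appears as `using Regex = CodeIndex.Indexer.BoundedRegex;`, that each call fits on one line, and that reviewers tag exceptions with a hypothetical `// regex-audit: <category>` comment. The function name `classify_file` is also made up for illustration.

```python
import re

# ASSUMPTION: the wrapper is aliased in this exact form; adjust to the real convention.
ALIAS = re.compile(
    r"^\s*using\s+Regex\s*=\s*CodeIndex\.Indexer\.BoundedRegex\s*;", re.MULTILINE
)

# Static calls on either the (possibly aliased) Regex name or the wrapper itself.
# A fully qualified System.Text.RegularExpressions prefix always means raw BCL.
STATIC_CALL = re.compile(
    r"(?P<qualifier>\bSystem\.Text\.RegularExpressions\.)?"
    r"\b(?P<receiver>BoundedRegex|Regex)\."
    r"(?P<method>IsMatch|Match|Matches|Replace|Split)\s*\("
)

# HYPOTHETICAL marker comment for categories a script cannot infer on its own.
MARKER = re.compile(
    r"//\s*regex-audit:\s*(generated/precompiled|trusted small input)"
)


def classify_file(source: str) -> list[tuple[int, str]]:
    """Return (line number, category) for each static regex call in a C# file."""
    aliased = ALIAS.search(source) is not None
    results = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        call = STATIC_CALL.search(line)
        if call is None:
            continue
        marker = MARKER.search(line)
        if marker:
            label = marker.group(1)
        elif call.group("receiver") == "BoundedRegex" or (
            aliased and call.group("qualifier") is None
        ):
            label = "bounded-wrapper"
        elif "TimeSpan" in line or "matchTimeout" in line:
            # Heuristic only: a timeout argument on a later line is missed.
            label = "raw BCL with timeout"
        else:
            label = "fix-needed"
        results.append((lineno, label))
    return results
```

Anything labelled fix-needed by a heuristic like this still needs a human look, since multi-line calls and instance-based patterns fall outside what it can see.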
