Skip to content

Replace Queue with mutable doubly-linked EventList#3331

Merged
nojaf merged 38 commits into
fsprojects:mainfrom
nojaf:rethink-events
Apr 3, 2026
Merged

Replace Queue with mutable doubly-linked EventList#3331
nojaf merged 38 commits into
fsprojects:mainfrom
nojaf:rethink-events

Conversation

@nojaf

@nojaf nojaf commented Apr 3, 2026

Copy link
Copy Markdown
Contributor

Rethink writer events: mutable doubly-linked list with trivia-aware formatting

Summary

This PR replaces the immutable Queue<WriterEvent> with a mutable doubly-linked list (EventList) and introduces trivia-aware indentation/unindentation. This architectural change enables in-place event splicing, which fundamentally fixes a long-standing class of bugs where comments near closing brackets, at the end of expressions, or after control flow constructs would lose their indentation.

Motivation

The previous architecture had two core problems:

  1. The trivia-before-unindent problem: When leaveNode emits a trailing comment followed by WriteLineBecauseOfTrivia, the subsequent unindent changes the indent level after the trivia newline has already used the old level. The closing bracket then lands at the wrong indentation. This affected ~45 call sites of expressionExceedsPageWidth and every use of indentSepNlnUnindent.

  2. Speculative execution with rollback: Functions like expressionExceedsPageWidth, colWithNlnWhenItemIsMultiline, and genMultilineInfixExpr try a short layout, check if it fits, and fall back to a long layout. With immutable queues this worked via separate copies. With a shared mutable list, these paths need explicit backup/rollback -- which is now both cheaper (O(1) truncation vs O(k) re-creation) and more explicit.

Key changes

New data structures:

  • EventList.fs: Mutable doubly-linked list with Append, InsertBefore, InsertAfter, Remove, CreateBackupPoint/RollbackTo, CurrentLineContent, ToSeq/ToRevSeq
  • WriterModel.Lines: string list replaced by WriterModel.LineCount: int -- metadata-only tracking, no string building during formatting
  • dump walks the DLL with a StringBuilder for deferred string materialization

New WriterEvent cases:

  • WriteTrivia of trivia: string -- distinguishes trivia text (comments, directives, XML docs) from code text, replacing fragile string-prefix checks
  • Start / Placeholder -- DLL position markers for future colWithNlnWhenItemIsMultiline rework

Trivia-aware indentation:

  • unindentWithTriviaAwareness -- splices UnIndentBy before trailing trivia newlines so closing brackets use the reduced indent level
  • indentSepNlnWithTriviaAwareness -- splices IndentBy before trailing trivia so comments between a function and its indented arguments appear at the correct indent
  • indentSepNlnUnindent now uses both, making all 66 call sites automatically trivia-aware
  • LongExpressionLayout DU (IndentAndUnindent, DoubleIndentAndUnindent, NewlineOnly) replaces 4 opaque Context -> Context parameters in expressionExceedsPageWidth

Trivia assignment improvements:

  • findNodeBeforeWithMatchingColumn -- column-aware trivia assignment for indented comments between sibling nodes
  • Closing delimiter heuristic -- when predecessor and successor are at the same column, prefer the predecessor only when the successor is a closing delimiter (], }, |}, ), |))
  • CommentOnSingleLineWithLeadingNewlines -- promotes adjacent Newline + CommentOnSingleLine trivia into a single unit so blank lines before comments are preserved
  • promoteNewlinesBeforeComments pre-processing step with adjacency checking

Speculative formatting fixes:

  • shortExpressionWithFallback and expressionExceedsPageWidth use CreateBackupPoint/RollbackTo on the fallback path
  • colWithNlnWhenItemIsMultiline replay path uses rollback before replaying
  • genMultilineInfixExpr match clause speculation uses rollback
  • WithDummy encapsulates the full backup/rollback lifecycle -- callers just pass the probe function

Separator improvements:

  • sepNlnUnlessLastEventIsNewline used in colWithNlnWhenItemIsMultiline, genOpenList, genArrayOrList, genMultilineFunctionApplicationArguments, and other places to prevent double newlines when trivia already produced a newline
  • hasBlankLineBeforeLastWrite replaces newlineBetweenLastWriteEvent with a simpler DLL walk that counts WriteLineBecauseOfTrivia alongside WriteLine
  • addFinalNewline uses sepNlnUnlessLastEventIsNewline to prevent trailing blank lines when the file ends with a comment

Test results

2835 passing, 0 failing, 11 skipped (7 pre-existing + 4 edge cases documented in design docs).

65+ new/updated tests including regression tests for all fixed issues and unit tests for EventList, dump, trivia-aware helpers, speculative formatting, and colWithNlnWhenItemIsMultiline.

Benchmark

BenchmarkDotNet v0.14.0, macOS 26.3.1 (25D2128) [Darwin 25.3.0]
Apple M4 Max, 1 CPU, 16 logical and 16 physical cores
.NET SDK 10.0.100
  [Host]     : .NET 10.0.0 (10.0.25.52411), Arm64 RyuJIT AdvSIMD DEBUG
  DefaultJob : .NET 10.0.0 (10.0.25.52411), Arm64 RyuJIT AdvSIMD

Before

Method Mean Error StdDev Rank Gen0 Gen1 Allocated
Format 91.53 ms 0.551 ms 0.488 ms 1 800.0000 400.0000 310.83 MB

After

Method Mean Error StdDev Rank Gen0 Gen1 Allocated
Format 85.87 ms 0.690 ms 0.646 ms 1 1000.0000 400.0000 284.82 MB

Known limitations

  • Trailing trivia inflating expression width: Comments assigned as ContentAfter on the last item make the expression appear multiline in speculative width checks, occasionally forcing unnecessary multiline layout (e.g. Elmish expressions). See Trivia Assignment.
  • Cramped record comment column: In cramped bracket style, comments after the last field appear at the regular indent level rather than the atCurrentColumn field level. Needs investigation of genRecordFieldNameCramped scoping.
  • Comment ordering with #if/#endif: When a comment follows #endif before a closing bracket, findNodeBeforeWithMatchingColumn can match across directive boundaries, reversing source order. See Trivia Assignment.

Fixes #2286
Fixes #1864
Fixes #1716
Fixes #2606
Fixes #2653
Fixes #2482
Fixes #2476
Fixes #2362
Fixes #932
Fixes #1233

nojaf added 30 commits April 1, 2026 13:46
  Introduce a mutable doubly-linked list (EventList) to replace the
  immutable Queue<WriterEvent> for the writer event stream. This is the
  foundation for in-place event splicing and deferred string
  materialization.

  Key changes:
  - EventList.fs: EventNode/EventList with O(1) append, insert, remove,
    backup/rollback, and CurrentLineContent
  - WriterModel.Lines replaced with LineCount (metadata-only tracking)
  - dump walks the DLL with a StringBuilder instead of reading Lines
  - WithDummy now takes a probe function and encapsulates backup/rollback
  - shortExpressionWithFallback/expressionExceedsPageWidth use
    backup/rollback for speculative formatting
  - leadingExpressionIsMultiline/isMultilineItem rewritten with DLL
    cursor walks instead of Queue.skipExists
  - Add Start/Placeholder WriterEvent cases for future
    colWithNlnWhenItemIsMultiline rework
  - Add design documents for the rethink and comment-after problem

  CodePrinterHelperFunctionsTests all pass. The broader test suite will
  be addressed incrementally.
  Add 29 new tests to CodePrinterHelperFunctionsTests covering dump edge
  cases, separator helpers, speculative formatting probes and rollback,
  leading expression inspection, and colWithNlnWhenItemIsMultiline.
  One test is skipped: colWithNlnWhenItemIsMultiline with all single-line
  items exposes a replay bug where optimistic events aren't rolled back.

  Reorganize Context.fsi into logical groups: Types, Core event machinery,
  Indentation, Separators, Conditionals and combinators, Collection
  traversal, Option handling, Speculative formatting, Leading expression
  inspection, Multiline item handling, WriteBeforeNewline-aware helpers,
  and Stroustrup-specific.
…ultilineInfixExpr

  With the mutable EventList, speculative formatting paths that re-run
  an expression on the original context must roll back the optimistic
  events first. Two places were missing this:

  - colWithNlnWhenItemIsMultiline: when both current and previous items
    are single-line, the optimistic blank-line events were still in the
    DLL before replaying with just the regular separator.

  - genMultilineInfixExpr: when a match LHS has a multiline last clause,
    the speculative genExpr events were still in the DLL before
    re-formatting with autoParenthesisIfExpressionExceedsPageWidth.
  expressionExceedsPageWidth wrappers with a LongExpressionLayout DU
  (IndentAndUnindent, DoubleIndentAndUnindent, NewlineOnly).

  For layouts that involve unindent, the new expressionExceedsPageWidthWithLayout
  function splices UnIndentBy before trailing trivia (comments followed by
  WriteLineBecauseOfTrivia) instead of appending it after. This ensures
  the trivia newline uses the reduced indent level — the foundation for
  fixing the comment-before-closing-bracket problem (issue 1233).

  Also removes unused dumpAndContinue and whenShortIndent.
  and directives. This replaces fragile string-prefix checks ("//", "(*")
  with a typed event, simplifying CommentOrDefineEvent to just match
  WriteTrivia. genTrivia and genXml now emit writeTrivia instead of !-.

  Add findTrailingTriviaNewline which walks backward from the DLL tail
  past restore/unindent events to find trailing trivia, and
  unindentWithTriviaAwareness which splices UnIndentBy before the trivia
  newline so closing brackets land at the correct indent level.

  Update trivia assignment (Trivia.fs) with findNodeBeforeWithMatchingColumn
  to attach indented comments as ContentAfter of the preceding node at the
  same column, fixing comment placement before closing brackets (issue 1233).
  Add ▼/▲ markers to Oak ToString for trivia debugging.

  Fixes comment before closing list bracket (3079), single-line and
  multiline block comments before closing brackets, hash directives before
  closing brackets, and same-column comment assignment between bindings.
…re closing brackets

  indentSepNlnUnindent now uses unindentWithTriviaAwareness instead of
  plain unindent. Since it has 66 call sites in CodePrinter, this makes
  every indent+content+unindent block automatically splice the unindent
  before trailing trivia — fixing the comment-before-closing-bracket
  problem at the root rather than per call site.

  Trivia assignment (Trivia.fs) now prefers the preceding content node
  over a same-column leaf successor (closing brackets like |}, ], )).
  This ensures comments attach as ContentAfter of the last field/item
  rather than ContentBefore of the delimiter.

  Split the composite test 2566 into 4 focused tests. Also apply
  unindentWithTriviaAwareness in genMultilineRecordCopyExpr and use
  sepNlnUnlessLastEventIsNewline in genMultilineRecord to avoid double
  newlines after trivia.
  The trivia-aware unindent already produces a newline at the reduced
  indent level. The sepNln before the closing paren in
  genMultilineFunctionApplicationArguments was adding a second newline,
  creating a blank line. Changed to sepNlnUnlessLastEventIsNewline.

  Updated expected output for tuple comment test — the trailing comment
  stays at content indent (column 4) rather than dropping to column 0.
  When a single-line item has trailing trivia (e.g. a comment), the trivia
  produces a WriteLineBecauseOfTrivia. The replay path in
  colWithNlnWhenItemIsMultiline was adding a separator newline on top of
  that, creating an unwanted blank line.

  Two fixes:
  - Rename newlineBetweenLastWriteEvent to hasBlankLineBeforeLastWrite and
    count WriteLineBecauseOfTrivia alongside WriteLine when detecting
    existing blank lines.
  - Use sepNlnUnlessLastEventIsNewline in the replay path so the separator
    is skipped when trailing trivia already produced a newline.
  The comment is now correctly assigned as ContentAfter on the expression
  inside the lambda body, keeping it at the lambda's indent level instead
  of dropping to column 0 before the closing parenthesis.
  Trailing comments are now ContentAfter on the last list item instead of
  ContentBefore on the closing bracket. This causes some Elmish
  expressions to go multiline where they previously fit on one line,
  because the trailing trivia inflates the expression's apparent width.

  The formatted output is valid and idempotent, but more verbose than
  ideal. Documented as an open problem in writer-events-rethink.md with
  possible solution directions.
  Add let binding example alongside the Elmish case to illustrate how
  trailing ContentAfter trivia forces single-line expressions into
  multiline layout. Frame as an accepted trade-off: indentation
  correctness before closing brackets outweighs occasional verbosity.
  When a binding has trailing trivia (e.g. a comment), the trivia
  newline combined with the separator newline and any ContentBefore
  Newline trivia on the next item created too many newlines, growing
  on each format pass.

  Use sepNlnUnlessLastEventIsNewline in both the optimistic path of
  colWithNlnWhenItemIsMultiline and the plain col path in
  colWithNlnWhenItemIsMultilineUsingConfig so separators respect
  trivia newlines that already exist.
  When both predecessor and successor are at the same column as a
  comment, use the node's range size as a tiebreaker: the larger node
  gets the comment (ContentAfter), equal-sized nodes default to the
  successor (ContentBefore).

  This replaces the leaf-node check (Array.isEmpty children) which was
  too broad — it assigned comments to operators in infix expressions
  instead of to the next operand. The size heuristic correctly
  distinguishes content nodes (e.g. "someItem") from delimiters (e.g.
  "]") without checking specific token text.
  The node size comparison (line count, then width) incorrectly assigned
  comments to larger predecessors in cases like type arguments where
  System.DateTime array (21 chars) beat int (3 chars). The comment
  should stay as ContentBefore of the next sibling in those cases.

  Now only prefer the predecessor when the successor is a closing
  delimiter (], }, |}, ), |)). Use a Set for easy expansion. This
  correctly handles list/record brackets without affecting infix
  expressions, sibling bindings, or type argument lists.
  Replace manual indent +> sepNln +> expr +> unindent +> sepNln with
  indentSepNlnUnindent +> sepNlnUnlessLastEventIsNewline in both
  TryFinally and TryWith expressions. This ensures trailing comments
  inside the try body don't produce a blank line before the with keyword.
  When trailing trivia (e.g. a comment) sits between a function
  expression and its indented arguments, the trivia newline and the
  indent's sepNln produced a double newline. indentSepNlnWithTriviaAwareness
  splices IndentBy before the trivia block so the comment appears at the
  indented level, using the trivia's own newline instead of adding another.

  indentSepNlnUnindent now uses both trivia-aware helpers:
  indentSepNlnWithTriviaAwareness on the leading side and
  unindentWithTriviaAwareness on the trailing side.
…ssue

  When a line comment sits after #endif but before the closing bracket,
  findNodeBeforeWithMatchingColumn matches across directive boundaries,
  reversing the source order of trivia. After formatting moves directives
  to column 0, the column-matching breaks on the second pass.

  Added per-define tests with formatSourceStringWithDefines to make the
  problem visible per combination. All three ignored with detailed
  explanation of the directive boundary problem.
  When a blank line preceded an indented comment (column > 0), the
  Newline trivia at column 0 was assigned to a different node than the
  comment, causing the blank line to be lost after formatting.

  Introduce CommentOnSingleLineWithLeadingNewlines trivia content that
  combines adjacent Newline trivia with the following comment into a
  single unit. The promotion happens in promoteNewlinesBeforeComments
  before trivia assignment, ensuring both are assigned to the same node.

  An adjacency check prevents distant Newline trivia from being
  incorrectly accumulated into the count.
…tests

  addFinalNewline didn't account for trivia newlines hidden behind
  restore/unindent events at the DLL tail. Changed to use
  sepNlnUnlessLastEventIsNewline so the final newline is skipped when
  trivia already produced one.

  Added regression tests for fsprojects#2653 (commented-out match case retains
  indentation) and improved test names for fsprojects#2606 and fsprojects#1716.
  Use sepNlnUnlessLastEventIsNewline for record field separators and
  before the closing brace in Stroustrup record type definitions. This
  prevents double newlines when the last field has trailing comment
  trivia. Removed unnecessary atCurrentColumn wrapper around fields.
Comment thread src/Fantomas.Core/Context.fs Fixed
@nojaf nojaf merged commit 4071780 into fsprojects:main Apr 3, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment