feat(word): add comment range highlighting with hover tooltip in HTML preview #77
xiaopenyoua wants to merge 704 commits into
…only keys
R12-2 (fuzzer, MEDIUM): sheet-level sort dispatch early-returned when rows.Count == 0, so `sort=XFE asc` / `sort=AAAA asc` on an empty sheet silently returned "Updated" instead of rejecting the invalid column. Move the empty-sheet no-op inside SortRangeRows so column validation runs first, and tighten the XFD-overflow check to fire on any length (was >= 4), catching 3-letter overflows like XFE/ZZZ.
R12-3 (fuzzer, LOW): `sort=asc` (column letter forgotten) produced a misleading "Sort column ASC is outside the range A:B". Reject ASC/DESC as column tokens up-front with a targeted "direction keyword, not a column letter" error.
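The two checks above can be sketched as follows. This is a minimal Python illustration, not the project's code: `validate_sort_column` and its error wording are hypothetical names, and the only facts assumed are that XFD (index 16384) is the last worksheet column and that ASC/DESC are direction keywords.

```python
import re

MAX_COLUMN = 16384  # XFD, the last column of a worksheet
DIRECTION_KEYWORDS = {"ASC", "DESC"}


def column_index(letters: str) -> int:
    """Convert column letters (A, Z, AA, XFD) to a 1-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def validate_sort_column(token: str) -> int:
    """Return the 1-based column index or raise ValueError (hypothetical helper)."""
    upper = token.strip().upper()
    # Reject direction keywords before treating the token as column letters.
    if upper in DIRECTION_KEYWORDS:
        raise ValueError(f"'{token}' is a direction keyword, not a column letter")
    if not re.fullmatch(r"[A-Z]+", upper):
        raise ValueError(f"'{token}' is not a valid column reference")
    index = column_index(upper)
    # Compare the numeric index, not the string length, so 3-letter
    # overflows such as XFE or ZZZ are caught as well as AAAA.
    if index > MAX_COLUMN:
        raise ValueError(f"Column {upper} is beyond the last column XFD")
    return index
```

Because validation does not depend on the row count, it can run before any empty-sheet shortcut.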
2-series structure (clusteredColumn + paretoLine overlay) matching MSO's cx:chart format. PreparePareto pre-sorts descending; secondary percentage axis (0-100%) for the cumulative line. DetectExtendedChartType handles both OfficeCli- and MSO-authored forms. Bump version to 1.0.48.
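A minimal sketch of the data preparation described above: sort descending, then compute the cumulative percentage for the overlay line. The function name `prepare_pareto` is only borrowed for illustration; the real PreparePareto signature is not shown in this log.

```python
def prepare_pareto(categories, values):
    """Sort descending by value and return (categories, values, cumulative %)."""
    pairs = sorted(zip(categories, values), key=lambda p: p[1], reverse=True)
    total = sum(value for _, value in pairs)
    running = 0.0
    cumulative = []
    for _, value in pairs:
        running += value
        # Percentages feed the secondary 0-100% axis.
        cumulative.append(100.0 * running / total if total else 0.0)
    return [c for c, _ in pairs], [v for _, v in pairs], cumulative
```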
Add mergefield as a first-class field type.
Usage: officecli add doc.docx "/body/p[1]" --type mergefield --prop fieldName=CustomerName
Placeholder text defaults to «fieldName» format (e.g. «CustomerName»).
https://claude.ai/code/session_013XdLypgxPSbNA428pzDXB3
- REF: cross-reference bookmark text (--prop bookmarkName, hyperlink)
- PAGEREF: cross-reference bookmark page number
- SEQ: auto-numbering sequences (--prop identifier=Figure/Table)
- IF: conditional field (--prop expression, trueText, falseText)
https://claude.ai/code/session_013XdLypgxPSbNA428pzDXB3
Zero-param: SECTIONPAGES, SECTION, CREATEDATE, SAVEDATE, PRINTDATE, EDITTIME, LASTSAVEDBY, NUMWORDS, NUMCHARS, REVNUM, TEMPLATE, COMMENTS, KEYWORDS
Parameterized: NOTEREF (bookmarkName), STYLEREF (styleName), DOCPROPERTY (propertyName)
https://claude.ai/code/session_013XdLypgxPSbNA428pzDXB3
…ontextualSpacing
- FontMetricsReader: include hhea lineGap in ratio for accurate line height
- @font-face: add ascent-override/descent-override/line-gap-override
- Heading line-height uses font metrics ratio instead of "normal"
- Paragraph spacing collapse: subtract prev spaceAfter from spaceBefore
- contextualSpacing: suppress spacing between same-style adjacent paragraphs
- docGrid type=lines: snap line-height to linePitch multiples
- Support contextualSpacing property in set handler (paragraph + style)
…age width
Tables with no explicit <w:tblW> were rendered as width:100%, filling the full page even when the <w:tblGrid> specified narrower column widths. Native Word auto-fits such tables to content, so compute the width from the gridCol sum instead. Use max-width for auto layout (allows shrink), width for fixed layout. Also handles tblW type=pct (percentage).
The 'Created: ... (resident started)' message now suggests running officecli close when done, so agents/users can release the file lock immediately instead of waiting for the 60s idle timeout.
…etter-spacing, effect props
Render-comparison testing against native Word found several run-level properties silently dropped or collapsed in HTML preview:
- Double strikethrough rendered identical to single (both as text-decoration:line-through). Now adds text-decoration-style:double.
- Underline style variants (double/wave/dotted/dash/thick/*Heavy) all collapsed to plain single underline. Mapped each to CSS text-decoration-style and text-decoration-thickness.
- w:spacing (character spacing) was ignored. Emit letter-spacing in pt.
- Paragraph-add shortcut silently dropped outline/shadow/emboss/imprint/vanish/rtl/noproof; only the run-add path honored them. Mirrored the 7 missing handlers in the paragraph branch.
- MergeRunProperties never merged Spacing or the 6 effect props, so even when written to XML they were dropped during effective-props resolution and never reached the HTML renderer.
…collapse
w:tab chars previously all rendered as a single em-space regardless of paragraph tab stops, making 'Left\tCenter\tRight' visually collapse to three adjacent words. Now:
- Track per-paragraph tab index in render context
- For each tab, look up the Nth declared tab stop and emit an inline-block span with width equal to the distance from the previous stop position
- Honor dot/hyphen/underscore leaders on positional stops via CSS border-bottom patterns
- Fallback to 36pt (0.5in) when no stops are defined
TOC-style right-aligned dot-leader tabs still flow through the existing dot-leader class path.
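The width rule above can be sketched as below. `tab_width_pt` is a hypothetical helper, and falling back to the default once the declared stops run out is an assumption; the log only states the fallback for paragraphs with no stops at all.

```python
def tab_width_pt(tab_index, stops_pt, default_pt=36.0):
    """Width in points of the span emitted for the Nth tab (0-based) of a paragraph.

    stops_pt: declared tab stop positions in points, ascending.
    """
    if tab_index < len(stops_pt):
        previous = stops_pt[tab_index - 1] if tab_index > 0 else 0.0
        # Distance from the previous stop (or the paragraph start).
        return stops_pt[tab_index] - previous
    # Assumption: no matching stop, so use the 36pt (0.5in) fallback.
    return default_pt
```

For stops at 72pt, 216pt and 432pt, the three tabs in 'Left\tCenter\tRight...' get spans of 72pt, 144pt and 216pt.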
Section <w:cols w:num="N"/> was previously ignored in the HTML preview: all content rendered single-column regardless of the declared column count. Now emit CSS on .page-body:
- column-count:N for num > 1
- column-rule:1px solid for w:sep="true"
- column-gap:Xpt from w:space (twips → pt)
Line-numbering (w:lnNumType) still TODO; it requires per-line markers.
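A minimal sketch of the mapping, assuming 20 twips per point. `columns_css` is a hypothetical name, and emitting nothing at all for a single column is an assumption.

```python
def columns_css(num, sep=False, space_twips=None):
    """Build the CSS declarations for a section's w:cols element."""
    if num <= 1:
        return ""  # single column: nothing to emit (assumption)
    parts = [f"column-count:{num}"]
    if sep:
        parts.append("column-rule:1px solid")
    if space_twips is not None:
        # 1 pt = 20 twips
        parts.append(f"column-gap:{space_twips / 20:g}pt")
    return ";".join(parts)
```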
…oWrap
Render-comparison testing found several cell/revision rendering gaps:
- Tracked insertions (<w:ins>) previously rendered as plain text, losing the author annotation. Now wrap in a .track-ins span with underline + green color, with the author name in a tooltip.
- Tracked deletions (<w:del>) were dropped entirely, leaving the reviewer unable to see what was removed. Now render the deleted text inside a .track-del span with strikethrough + red color.
- Cell <w:textDirection> btLr/tbRl was ignored: text stayed horizontal where Word rotates 90°. Emit CSS writing-mode:vertical-rl; btLr adds a 180° rotation to flip the reading direction.
- Cell <w:noWrap/> was dropped; now emits white-space:nowrap so cell content doesn't wrap.
Two more render gaps caught by comparison testing:
- <w:fldChar><w:ffData><w:ffCheckBox> form field checkboxes were dropped entirely in the preview. Now emit ☑ (checked) or ☐ (unchecked) based on w:default or w:checked state, matching Word's native glyph in read-only previews.
- <w:w val="N"/> character horizontal scale (narrower/wider glyph rendering) was ignored. Emit CSS transform:scaleX(N/100) with display:inline-block so the scaled width is actually reserved.
- MergeRunProperties also merges CharacterScale now, matching the pattern already used for Spacing, so style-inherited scale reaches the renderer.
Deferred (complex, need dedicated work): numFmt variants beyond decimal/lowerLetter/lowerRoman; header/footer titlePg+evenOdd; right-aligned tab with non-dot leader; contextualSpacing boundary.
Round 12 comparison found four picture-level visual effects that were silently dropped in the HTML preview:
- a:xfrm rot (rotation in 60000ths of a degree): now emits CSS transform:rotate(Xdeg) on the <img>
- a:xfrm flipH/flipV: now emits transform:scaleX(-1) / scaleY(-1), combined with rotate when both present
- a:ln (picture outline): now emits CSS border with width converted from EMU to px and srgbClr mapped to a hex color
- a:effectLst a:outerShdw: now emits box-shadow with offset/blur computed from dir (degrees) and dist/blurRad (EMU)
Existing crop (a:srcRect) handling is preserved and effects are composed through both the cropped and uncropped image render paths.
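The unit conversions behind these effects can be sketched as follows. The helper names are hypothetical; the assumptions are 9525 EMU per CSS pixel (914400 EMU per inch at 96 px per inch) and a shadow `dir` stored in 60000ths of a degree like `rot`.

```python
import math

EMU_PER_PX = 9525  # 914400 EMU per inch / 96 px per inch


def picture_transform(rot=0, flip_h=False, flip_v=False):
    """Compose the CSS transform value for a:xfrm rot / flipH / flipV."""
    parts = []
    if rot:
        parts.append(f"rotate({rot / 60000:g}deg)")  # 60000ths of a degree
    if flip_h:
        parts.append("scaleX(-1)")
    if flip_v:
        parts.append("scaleY(-1)")
    return " ".join(parts)


def outer_shadow_offset_px(dir_60k, dist_emu):
    """Return the (x, y) box-shadow offset in px from direction and distance."""
    angle = math.radians(dir_60k / 60000)
    dist_px = dist_emu / EMU_PER_PX
    return round(dist_px * math.cos(angle), 2), round(dist_px * math.sin(angle), 2)
```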
…geometry, gradient fill
Round 13 comparison found five shape rendering gaps:
- a:xfrm rot on standalone shapes was only applied when the shape lived inside a wpg:wgp group; inline shapes rendered upright regardless. Rotation now applies in both code paths.
- wps:bodyPr anchor=ctr/b vertical text alignment only worked for group members; standalone shapes ignored it. Now applied in both paths.
- prstGeom prst=ellipse/oval rendered as a solid rectangle. Emit border-radius:50% so the shape reads as an oval; prst=roundRect gets a 12px radius approximation.
- a:gradFill (solid gradient) was dropped, so the shape appeared with no background. Now emit CSS linear-gradient from gsLst stops (pos in 1/1000-percent) with angle converted from OOXML 60000ths to CSS deg.
Deferred: exotic prstGeom (line, arrow, callout) need SVG authoring, documented in KNOWN_ISSUES.md as a future pass.
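A rough sketch of the gradient conversion. `gradient_css` is a hypothetical helper, and the +90° offset is an assumption based on OOXML measuring 0° as left-to-right while CSS measures 0deg as bottom-to-top; the log does not spell out the formula the renderer uses.

```python
def gradient_css(stops, angle_60k=0):
    """Build a CSS linear-gradient from gsLst stops.

    stops: iterable of (pos, hex_color) with pos in 1/1000 percent.
    angle_60k: a:lin ang in 60000ths of a degree.
    """
    css_stops = ", ".join(
        f"#{color} {pos / 1000:g}%" for pos, color in sorted(stops)
    )
    # Assumption: shift by 90 degrees to move from the OOXML to the CSS angle origin.
    css_angle = (angle_60k / 60000 + 90) % 360
    return f"linear-gradient({css_angle:g}deg, {css_stops})"
```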
Round 15 comparison found that w:tab leader="middleDot" fell through to no leader fill. Native Word renders middleDot as evenly-spaced centered dots between tab stops; the closest CSS approximation is a 2px dotted border which browsers render as a coarser dot pattern visually distinct from the 1px "dot" leader. Drop cap float works in CSS (see XML output) but is blocked by .page-body flex-column layout; logged in KNOWN_ISSUES #7c for a follow-up refactor.
…bidi
Round 16 surfaced four i18n rendering gaps:
- w:em (dot/comma/circle/underDot): now emits CSS text-emphasis-style with correct position (over for dot/comma/circle, under for underDot) and webkit prefix for broader browser support. Previously silently dropped.
- w:ruby (furigana): now emits <ruby>base<rt>annotation</rt></ruby>. Previously the whole ruby run was dropped, leaving only surrounding labels.
- w:bidi at paragraph level: now emits direction:rtl. Previously the paragraph ignored the hint and relied on content-level detection.
- w:rtl at run level: changed unicode-bidi from bidi-override to embed. Override disables Unicode BiDi shaping; for Arabic, that reversed characters within a word and broke contextual ligatures. embed preserves algorithmic shaping while still flowing RTL.
- MergeRunProperties now merges Emphasis so style-inherited em isn't dropped during effective-property resolution.
Deferred (#5 in KNOWN_ISSUES): per-script font chain from rFonts ascii/hAnsi/eastAsia/cs, which needs per-run glyph range detection.
…e NUMPAGES
Round 17 comparison surfaced header/footer rendering gaps:
- HeaderPart/FooterPart content only iterated <w:p> children, silently dropping <w:tbl>, so layout tables commonly used for 3-column headers/footers rendered empty.
- Paragraphs were filtered if they had no text, losing image-only paragraphs (logos, watermarks). Replaced the filter with a check that considers tables, drawings, and field characters as content.
- Footer NUMPAGES field was substituted with the cached "1" instead of the actual rendered page count. Added a second placeholder (<!--NUM_PAGES-->) that gets replaced with pageList.Count per page.
Deferred (logged in KNOWN_ISSUES #17+): VML watermark rendering (v:pict/v:textpath), chart legends/data labels. The chart SVG renderer emits geometry but not metadata overlays.
Round 19 comparison found that paragraph alignment w:jc="distribute" rendered as plain text-align:justify, leaving the last/only line unstretched. Native Word spreads every line (including single-line paragraphs) to full width with inter-character spacing. Pair text-align:justify with text-align-last:justify + text-justify:inter-character so the last line also stretches. w:jc="both" retains the plain-justify behavior (last line flows normally).
…(1/8 pt) Round 20 comparison caught asymmetric cell borders rendering with compressed widths. Root cause: OOXML border sz attribute is in 1/8 of a point (8 = 1pt, 24 = 3pt, etc.), but the renderer was dividing by 8 and emitting the result as px. At default 96 DPI that under-rendered 3pt borders as 3px ≈ 2.25pt — visually thin and inconsistent with Word's native rendering. Switch the output unit to pt so declared 1pt / 2pt / 3pt / 4pt borders render at their intended sizes. The double-border minimum threshold was also updated to the pt-equivalent (2.25pt / ≈3px) so double-line style still renders two visible strokes.
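The arithmetic can be checked with a small sketch. The helper names are hypothetical; the only facts used are that `sz` is in eighths of a point and that one CSS px is 0.75pt at 96 DPI.

```python
def border_width_css(sz_eighths):
    """Convert an OOXML border sz (1/8 pt) to a CSS width in pt."""
    return f"{sz_eighths / 8:g}pt"


def px_to_pt(px):
    """At 96 DPI one CSS pixel is 72/96 = 0.75 pt."""
    return px * 72 / 96
```

With the old behavior, sz=24 became 3px, which `px_to_pt(3)` shows is only 2.25pt; emitting `3pt` directly avoids the shrinkage.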
… theme part is missing
Round 21 comparison caught all w:themeColor references resolving to no color (runs rendered black). Root cause: blank documents created via BlankDocCreator have no <a:theme> part, and GetThemeColors returned an empty dictionary. Word itself falls back to the built-in Office palette for missing themes; the preview now does too.
- Added OfficeDefaultThemeColors dictionary with accent1-6, dark1/2, light1/2, hyperlink, followedHyperlink and their aliases (dk1/dk2/lt1/lt2/tx1/tx2/text1/text2/background1/background2).
- GetThemeColors fills in any missing standard names after the theme part is consulted, so explicit themes override but unset slots still resolve.
- Run color emit path refactored to call ResolveRunColor for consistency with conditional-format and border color paths, giving a single source of truth for themeColor + themeTint/Shade resolution.
Fixes themeColor on text, table shading (via existing ResolveShadingFill path), and borders (via existing RenderBorderCss) in one shot since all three consult GetThemeColors.
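The fill-in order described above (theme part first, defaults only for unset slots) might look like this. The hex values are the default Office theme palette as recalled here, not values taken from this log, and the alias table is a partial illustration.

```python
# Assumed values of the built-in Office palette (not quoted from the source).
OFFICE_DEFAULT_THEME_COLORS = {
    "dark1": "000000", "light1": "FFFFFF", "dark2": "44546A", "light2": "E7E6E6",
    "accent1": "4472C4", "accent2": "ED7D31", "accent3": "A5A5A5",
    "accent4": "FFC000", "accent5": "5B9BD5", "accent6": "70AD47",
    "hyperlink": "0563C1", "followedHyperlink": "954F72",
}
ALIASES = {
    "dk1": "dark1", "tx1": "dark1", "text1": "dark1",
    "lt1": "light1", "background1": "light1",
    "dk2": "dark2", "tx2": "dark2", "text2": "dark2",
    "lt2": "light2", "background2": "light2",
}


def get_theme_colors(theme_part_colors):
    """Explicit theme colors win; defaults fill only the missing slots."""
    colors = dict(theme_part_colors)
    for name, value in OFFICE_DEFAULT_THEME_COLORS.items():
        colors.setdefault(name, value)
    for alias, target in ALIASES.items():
        colors.setdefault(alias, colors[target])
    return colors
```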
Round 22 comparison found runs with <w:vanish/> inherited from a character style rendered as visible text (commonly gray) in the HTML preview. Native Word omits vanished content from the default view. Short-circuit RenderRunHtml when the effective run properties carry vanish or specVanish so hidden text doesn't leak into the preview.
…tings
Round 24 comparison caught two settings.xml inputs the HTML preview ignored:
- w:defaultTabStop set to e.g. 360 twips (0.25in) was overridden by a hardcoded 36pt (0.5in) fallback, so tab columns in documents tuned for tighter grids came out twice as wide as Word rendered them. Now read the setting when no paragraph/style tab stops apply.
- w:autoHyphenation was silently dropped. Documents with long words wrapped to the next line without hyphenation, producing a ragged right edge that diverged from Word's justified/hyphenated output. Emit CSS hyphens:auto + -webkit-hyphens:auto on .page-body so the browser uses its language-specific hyphenation dictionaries.
…ark strings aren't lost Full VML geometry rendering is deferred (KNOWN_ISSUES #7e), but text stored inside <v:pict> — WordArt via v:textpath@string and classic watermarks via v:textbox/w:txbxContent — silently disappeared from the preview, taking document information with them. Emit any extracted text inside a <span class="vml-fallback"> italic-gray placeholder so the reader still sees "DRAFT", "WordArt Sample", etc. Proper geometry rendering (rect/oval/line fill/stroke, rotation, absolute positioning) remains deferred.
Round 27 comparison caught SVG images rendering as blank slots in the HTML preview. Office 2019+ stores vector images as a PNG fallback in <a:blip r:embed> plus the actual SVG in an a:extLst extension (asvg:svgBlip r:embed). Many authoring tools emit a 1×1 transparent PNG as the fallback, so the preview showed nothing even though the document had a valid SVG. When the blip contains an asvg:svgBlip child, use its rel id to locate the SVG part instead of the fallback PNG. Embedded as image/svg+xml data URI, the SVG renders in all modern browsers identical to Word.
Round 29 comparison found firstCol tblStylePr applied even when the table had <w:tblLook w:firstColumn="0"/>. Root cause: ParseTableLook used the legacy val hex bitmask whenever it was present, bypassing individual attrs entirely. Per ECMA-376 §17.7.6.7 individual attrs supersede val. When any of firstRow/lastRow/firstColumn/lastColumn/noHBand/noVBand attrs are authored on <w:tblLook>, use them exclusively (so an attr with value="0" turns the bit OFF even if val would set it). Fall back to val only when no individual attrs are present.
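The precedence rule can be sketched as below. `parse_table_look` is a hypothetical stand-in for ParseTableLook; the bit values follow the legacy tblLook bitmask (0x0020 first row through 0x0400 no column banding), and treating "1"/"true"/"on" as the set of truthy attribute values is an assumption.

```python
ATTR_BITS = {
    "firstRow": 0x0020, "lastRow": 0x0040,
    "firstColumn": 0x0080, "lastColumn": 0x0100,
    "noHBand": 0x0200, "noVBand": 0x0400,
}
TRUTHY = {"1", "true", "on"}


def parse_table_look(attrs):
    """attrs: dict of the w:tblLook attributes (without the w: prefix)."""
    individual = {k: v for k, v in attrs.items() if k in ATTR_BITS}
    if individual:
        # Any individual attr present: use them exclusively, so "0" turns
        # the flag off even when val would set it.
        return {name: individual.get(name, "0") in TRUTHY for name in ATTR_BITS}
    mask = int(attrs.get("val", "0"), 16)
    return {name: bool(mask & bit) for name, bit in ATTR_BITS.items()}
```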
Bugs fixed:
- AddRun / AddPicture / AddOle / AddHyperlink / AddComment / AddBookmark / AddBreak / AddField / AddSdt: insert-by-index was off by one on paragraphs with pPr (any paragraph with alignment/style/indent). All paragraph-child inserts now route through a new pPr-aware InsertIntoParagraph helper so new elements never precede pPr.
- AddBreak / AddField / AddSdt / AddFormField / AddChart: when parent was /body the handlers ignored --index / --after / --before entirely. Now honored via new InsertAtIndexOrAppend helper that also preserves the sectPr-last-child invariant.
- AddFormField: the wrapper paragraph was appended with raw AppendChild, landing after sectPr. Fixed.
- AddSection / AddToc / AddFootnote / AddEndnote: ignored --index. Now honored. AddToc: title is positioned adjacent to the TOC paragraph instead of being appended at end-of-body when --index is set.
- AddAtFindPosition: the inline variant passed a run-count index to the re-entered Add(), which downstream handlers consumed as ChildElements index, causing --before find:<text> to insert before pPr (schema invalid). Converted to ChildElements index before re-entry.
- CopyFrom (--from clone): silently ignored --after / --before and bypassed parent/child validation (allowing body-in-body, p-in-p, styles-with-non-style etc.). Now resolves anchors via ResolveAnchorPosition, rejects self/ancestor clones, routes through ValidateParentChild, handles find: anchors, and uses the pPr-aware helper when target is a paragraph.
Validation / error surfacing:
- New ValidateParentChild gate rejects schema-invalid parent/child combos: paragraph-in-paragraph, table-in-paragraph, anything under /body/sectPr, non-row children under w:tbl, non-cell children under w:tr, non-style children under /styles, and direct children (other than sdtPr/sdtContent) under w:sdt / w:sdtRun.
- ParsePath now rejects multi-predicate (p[1][2]), empty predicate (p[]), trailing junk after ], trailing slash, and empty path segments.
- --index is rejected early for negative values; leading/trailing whitespace on --index is rejected (was silently trimmed).
- Empty find:"" pattern now errors cleanly instead of leaking an Array-dimensions-exceeded .NET exception.
- ArgumentOutOfRangeException and other raw internal exceptions are wrapped into clean ArgumentException messages.
AddSdt result-path: when parent is under /header[N] or /footer[N], the returned path now stays under that root instead of being rewritten as /body/..., so downstream --after chains keep resolving.
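The off-by-one on paragraphs with pPr comes down to an index offset. A minimal sketch over a list of child names, with `insert_into_paragraph` as a hypothetical stand-in for InsertIntoParagraph and the assumption that the caller's index counts content children only:

```python
def insert_into_paragraph(children, new_child, index=None):
    """Insert new_child at a content index, never ahead of a leading pPr."""
    if index is not None and index < 0:
        raise ValueError("--index must not be negative")
    # pPr, when present, must stay the first child of the paragraph.
    offset = 1 if children and children[0] == "pPr" else 0
    content_count = len(children) - offset
    if index is None or index >= content_count:
        return children + [new_child]
    position = index + offset
    return children[:position] + [new_child] + children[position:]
```

Without the offset, index 0 on a styled paragraph would place the new element before pPr, which is schema-invalid.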
- ValidateParentChild now rejects schema-invalid clones via --from:
- run/hyperlink as direct body children (must live inside a paragraph)
- style outside /styles
- cell inside cell / run inside cell (TableCell accepts only block-level)
- raw <w:sectPr> cloned into /body (singleton; kept distinct from
`--type section` which creates a paragraph-level section break)
- AddBookmark: body-level inserts now go through InsertAtIndexOrAppend so
--index / --after / --before are honored and the sectPr-last-child
invariant is preserved. Return path emits /bookmark[@name=…] which is
actually resolvable by NavigateToElement (bookmarkStart has no local
name `bookmark`).
- Navigation: /footnote[@footnoteId=N] and /endnote[@endnoteId=N] are now
accepted as add/get parents, routing to the corresponding footnote/endnote
body element. Also added a generic @name= matcher for bookmarkStart so
the new bookmark return path resolves.
- ParsePath predicate parsing tightened: only positive integer `[N]`,
`last()`, or `@ident=bare-or-double-quoted-value` are accepted.
Previously `/body/p[XYZ]`, `/body/p[@=X]`, `/body/p[@paraid]`,
`/body/p[@w:paraId="X"]` silently resolved to the first element.
- CopyFrom: cloned paragraphs now regenerate bookmarkStart/End ids and
names so duplicated bookmarks in a clone don't collide with the
original or each other. MapLocalNameToAddType keeps "sectpr" distinct
from "section" so the two ValidateParentChild rules don't conflate.
- SDT validation hint and query bookmark path updated to match the new
selector grammar.
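The tightened predicate grammar can be approximated with one regular expression. This is a sketch, not the ParsePath implementation: `parse_segment` is hypothetical, and restricting attribute identifiers to letters, digits and underscores (so a prefixed name such as `w:paraId` is rejected) is an assumption about what "ident" means.

```python
import re

PREDICATE = re.compile(
    r'\[(?:([1-9][0-9]*)'                 # positive integer index
    r'|(last\(\))'                        # last()
    r'|@([A-Za-z_][A-Za-z0-9_]*)='        # @ident=
    r'("[^"]*"|[^\]\["\s]+)'              # double-quoted or bare value
    r')\]'
)


def parse_segment(segment):
    """Parse one path segment into (name, predicate) or raise ValueError."""
    match = re.fullmatch(r"([A-Za-z_][A-Za-z0-9_]*)(\[.*\])?", segment)
    if not match:
        raise ValueError(f"Malformed path segment: {segment!r}")
    name, raw = match.group(1), match.group(2)
    if raw is None:
        return name, None
    pred = PREDICATE.fullmatch(raw)  # fullmatch rejects p[1][2] and trailing junk
    if not pred:
        raise ValueError(f"Malformed path segment: {segment!r}")
    if pred.group(1):
        return name, ("index", int(pred.group(1)))
    if pred.group(2):
        return name, ("last", None)
    return name, ("attr", (pred.group(3), pred.group(4).strip('"')))
```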
- ValidateParentChild: reject raw <w:sectPr> cloned directly into a paragraph via --from. (--type section still creates a legitimate pPr-wrapped break as before.)
- AddBookmark: reject duplicate bookmark names with a clear error so a second add doesn't silently clash with an existing name and leave the /bookmark[@name=X] lookup pointing at the wrong bookmark.
- AddBookmark: when parent is /body and --prop text=... is supplied, wrap bookmarkStart + new Run + bookmarkEnd inside a fresh <w:p> so no bare <w:r> lands as a direct body child (runs must live inside paragraphs).
- NavigateToElement: resolve /footnote[@footnoteId=N], /footnote[N], /endnote[@endnoteId=N], /endnote[N] at the root, mirroring the AddParentResolver. Paths returned by `add` inside a footnote/endnote now round-trip through `get` and work as --after/--before anchors.
- Query: canonicalize emitted paths by stripping the `/document[1]/body[1]` prefix to `/body` so tr/tc/sectPr/ins/del/drawing/commentRange* paths returned by `query` resolve via `get` and are usable as anchors.
- CopyFrom now rejects cloning <w:footnote>, <w:endnote>, <w:comment> as inline content. These elements live in dedicated parts (footnotes.xml, endnotes.xml, comments.xml); inserting them under <w:p> or <w:body> via --from produced schema-invalid OOXML. Users should add via `--type footnote/endnote/comment --prop text=...` which creates the reference + part entry correctly.
- CopyFrom now rejects cloning <w:bookmarkStart>/<w:bookmarkEnd> as a standalone element. Bookmarks span content via a start/end pair with matching @id; cloning just the start (as the virtual /bookmark[@name=X] selector resolves to) produced a never-closed bookmark in the target. Error redirects users to clone the containing paragraph or range.
- CopyFrom now regenerates w:id on <w:ins>/<w:del> revision elements in the clone (both self and descendants), mirroring the existing paraId and bookmark-id regeneration. Duplicate revision ids after clone previously failed semantic validation.
- `add --type ins|del|moveTo|moveFrom` used to fall through to AddDefault, which wrote the --prop key=value pairs as unnamespaced attributes, never emitted the required w:id/w:author/w:date, and silently destroyed the paragraph's existing runs when --index was omitted. Tracked-change authoring is outside the add command's scope; reject with a clear error pointing users at the normal inline add flow (mirrors the footnote / endnote / comment rejections already in place).
- `AddParagraph` now accepts `ilvl` as an alias for `numlevel` when building <w:numPr>, so `add --type paragraph --prop numId=1 --prop ilvl=2` produces the <w:ilvl> child the user expects. Previously ilvl was silently dropped even though `set --prop ilvl=N` on the same paragraph worked.
…after sectPr
- AddParagraph: hoisted <w:ilvl> handling out of the numId branch so --prop ilvl=N alone emits <w:numPr><w:ilvl/></w:numPr> consistently with `set --prop ilvl=N`. Added range checks: numId must be >= 0, ilvl must be in [0,8]; out-of-range values now throw instead of silently producing schema-invalid OOXML.
- NavigateToElement: top-level /section[N] is now a resolvable anchor; it maps to the Nth paragraph in /body whose <w:pPr> carries a <w:sectPr>. Previously `add --type section` returned a /section[N] path that subsequent --after/--before could not resolve.
- ResolveAnchorPosition: reject --after <body-level sectPr> with a clear error. Body-level <w:sectPr> must remain the last child of <w:body>, so "after sectPr" has no valid placement; silently substituting --before semantics was confusing. Paragraph-level sectPr (inside w:pPr) is unaffected.
- `--after find:<substring>` / `--before find:<substring>` for block types (paragraph, table, section, toc, ...) no longer splits the matched paragraph. The previous behavior inherited AddInlineAtSplitPoint's character-offset splitting and produced two paragraph fragments with the new block wedged between them, which was schema-valid but semantically destructive. For non-inline types we now resolve to the containing paragraph and insert the new block as a sibling. Inline types (run/pagebreak/bookmark/field/inline sdt) keep splitting, which is still the correct semantics.
- NavigateToElement now resolves top-level `/formfield[N]` to the paragraph containing the Nth form field's begin-run. Previously `add --type formfield` returned `/formfield[N]` but that path only worked for `get` (special-case); `--after /formfield[N]` failed with "Anchor element not found". Same class as the earlier `/section[N]` fix.
Extends the round 8/9 fix pattern: `add` emits these synthetic paths as the new element's identity, but `--after`/`--before` previously rejected them because NavigateToElement had no routing for these roots.
- /chart[N] resolves to the body paragraph containing the Nth w:drawing chart, mirroring GetAllWordCharts' document-order walk.
- /toc[N] resolves to the Nth body paragraph carrying a TOC field, mirroring AddToc's counting.
- /watermark is a positional no-op in ResolveAnchorPosition (watermarks live in header parts, no body sibling exists). --after /watermark appends, --before /watermark prepends, so the round-trip stays usable.
…watermark-absent anchor
- AddChart (both standard + extended-chart branches) now emits the return path using the same document-order traversal the resolver uses. The old insertion-counter approach produced a /chart[N] that the resolver could not map back after --before/--after inserted anywhere but the end.
- AddSection return path now mirrors the NavigateToElement /section[N] walker, computing the new section break's document-order position rather than counting insertion events. Fixes --before /section[1] reporting /section[3] when the new section is actually /section[1].
- AddToc rejects header/footer parents with a clear error (TOC field code references body-level headings and is not meaningful in a header/footer part). Prevents the previous /toc[0] return-path contract violation.
- ResolveAnchorPosition /watermark handler now errors "Anchor element not found: /watermark" when the document has no watermark. The round 10 no-op behavior was meant for docs that do have one; silently appending when none exists was a contract violation.
…of-range
- AddToc return path now mirrors the round 11 AddChart/AddSection fix: computes the new tocPara's position in the doc-order TOC list via FindIndex(ReferenceEquals) instead of a total count. --before /toc[1] now correctly returns /toc[1] rather than /toc[last].
- ResolveAnchorPosition /watermark handler now captures the optional index from /watermark[N]. If the index is less than 1 or greater than the watermark count (0 or 1 in practice), throw "Anchor element not found" to mirror /chart[99] behavior. Bare /watermark keeps its positional-hint no-op when a watermark exists; errors cleanly when absent.
…l count
Extends the round 11/12 sweep that corrected AddChart/AddSection/AddToc return-path numbering. Two handlers were missed:
- AddTable used parent.Elements<Table>().Count(), so --before /body/tbl[1] returned /body/tbl[3] even though the resolver places the new table at /body/tbl[1] in document order.
- AddHyperlink used the same pattern against hlPara.Elements<Hyperlink>().
Both now compute the new element's FindIndex(ReferenceEquals) in its parent's child collection, matching what NavigateToElement reports.
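The difference between counting siblings and locating the new element by identity can be shown in a few lines. This is a Python analogue of the FindIndex(ReferenceEquals) idea, with `document_order_index` as a hypothetical name.

```python
def document_order_index(siblings, new_element):
    """Return the 1-based position of new_element among same-type siblings.

    Uses identity (`is`), not equality, so two equal-looking elements
    are never confused with each other.
    """
    for position, element in enumerate(siblings, start=1):
        if element is new_element:
            return position
    raise ValueError("element is not among the given siblings")


# Two existing tables, then a new one inserted before the first.
existing_a, existing_b, new_table = object(), object(), object()
tables = [new_table, existing_a, existing_b]
old_style_index = len(tables)                        # 3: the buggy "total count" path
new_style_index = document_order_index(tables, new_table)  # 1: matches the resolver
```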
Two linked fixes for display-mode equation adds:
- Add.Text.cs: AddEquation display-mode previously computed the return path as /body/oMathPara[total-count], which (a) didn't match the doc-order resolver used by NavigateToElement and (b) pointed at the wrong element after --before / --after inserts. Now counts the insertTarget's direct children in the same sequence the resolver walks (bare M.Paragraph + IsOMathParaWrapperParagraph wrappers), stopping at the newly-inserted wrapper paragraph.
- Navigation.cs ResolveAnchorIndex: /body/oMathPara[N] resolves to the inner M.Paragraph, which isn't a direct body child; its wrapper w:p is. Added retargeting so that when the resolved anchor's parent is a pure oMathPara-wrapper paragraph listed as a sibling under the resolution parent, the anchor is hoisted to the wrapper for IndexOf lookup.
Restores round-trip: /body/oMathPara[N] is now usable as --after / --before anchor for follow-up adds.
Cloning a paragraph containing a chart or picture via --from produced a
duplicate <wp:docPr> id, which fails OOXML semantic validation ('id'
should have unique value). The existing id-fixup sweep in CopyFrom
already regenerated paraId, textId, bookmark ids, and revision ids;
extended it to walk Descendants<DW.DocProperties>() on the clone and
reassign Id via the doc's existing docPr sequence.
Bookmarks and legacy form fields accept a --prop name=... value that is later referenced via selectors like bookmarkStart[@name=X]. When the name contains '/', '[' or ']', the selector grammar cannot parse it as a literal and the created element becomes unaddressable — users could not get, set, or remove their own bookmark after creating it. AddBookmark and AddFormField now validate the name at input, throwing a clear error listing the offending characters so callers either escape them upstream or pick a different name. OOXML itself doesn't mandate such a restriction, but officecli's own selector grammar has no escape syntax for these chars, so rejecting at the input boundary is the pragmatic fix.
…ph listing
- AddBookmark now rejects additional bookmark-name characters (whitespace, leading '@', quotes) that the selector predicate parser can't handle as bare attribute values. The round-17 fix covered /, [, ] but left these other unaddressable-name footguns.
- AddParagraph and AddRun route --prop text=... through a new AppendTextWithBreaks helper that tokenizes on \n/\r\n/\r and \t, emitting alternating <w:t> + <w:br/> + <w:tab/> children. Literal newlines and tabs were previously embedded inside <w:t> verbatim; Word and LibreOffice collapse both to a single space on render, so the characters silently disappeared in the finished document. Also normalizes OpenXml SDK's ' /' self-closing form to '/' in the final document.xml for canonical output.
- Navigation's Body lister now enumerates children with the same p[N] vs oMathPara[M] bucketing the resolver uses. Previously the lister counted the equation wrapper paragraph under /body/p[N] while the resolver counted it under /body/oMathPara[M], leaving equation paragraphs reported but unaddressable via the emitted p[N] path.
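The tokenization step can be sketched as follows. `tokenize_text` is a hypothetical stand-in for the splitting half of AppendTextWithBreaks; it returns (kind, text) pairs rather than building XML elements.

```python
import re


def tokenize_text(text):
    """Split text into ("t", str), ("br", None) and ("tab", None) tokens."""
    tokens = []
    # \r\n is listed first so it is consumed as a single line break.
    for piece in re.split(r"(\r\n|\r|\n|\t)", text):
        if piece == "":
            continue
        if piece == "\t":
            tokens.append(("tab", None))
        elif piece in ("\r\n", "\r", "\n"):
            tokens.append(("br", None))
        else:
            tokens.append(("t", piece))
    return tokens
```

Each "t" token would become a <w:t>, each "br" a <w:br/>, and each "tab" a <w:tab/> inside the run.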
- AddFormField name validation extended to match AddBookmark's post-R18 rules: rejects whitespace, leading '@' or '\'', embedded '"', and duplicate names within the document. Form fields embed a BookmarkStart/End pair with the same name, so the weaker earlier validation produced unaddressable or duplicate bookmarks.
- CopyFrom rejects <m:oMathPara> and <m:oMath> as clone sources. Previously --from /body/oMathPara[N] cloned the bare math element into the target, producing schema-invalid OOXML (body cannot contain oMathPara directly). Users should clone the containing paragraph (/body/p[N]) instead, the same pattern as the R6 rejections for footnote/endnote/comment/bookmarkStart.
Two invocations of 'add /styles --type style --prop name=DupStyle' previously both succeeded, leaving two <w:style w:styleId=DupStyle> entries in styles.xml and triggering the OOXML semantic check 'styleId should have unique value'. Now:
- If --prop id=<explicit> collides with an existing styleId, throw an ArgumentException pointing the caller at a unique id/name.
- If only --prop name=<value> was given (id derived implicitly), auto-suffix the derived id (DupStyle, DupStyle2, DupStyle3, ...) so the styles part stays schema-valid without forcing the caller to pre-check.
Parallels the R18 bookmark dup-name rejection and R19 formfield dup-name rejection; styles had been the remaining handler without duplicate-id protection.
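The two branches can be sketched as below. `resolve_style_id` is a hypothetical helper, and deriving the implicit id directly from the name (without any normalization) is a simplifying assumption.

```python
def resolve_style_id(existing_ids, name=None, explicit_id=None):
    """Pick a styleId that does not collide with existing_ids."""
    if explicit_id is not None:
        # Explicit ids are the caller's responsibility: reject collisions.
        if explicit_id in existing_ids:
            raise ValueError(f"styleId '{explicit_id}' already exists; choose a unique id/name")
        return explicit_id
    if name is None:
        raise ValueError("either name or explicit_id is required")
    derived = name  # assumption: the derived id equals the name
    if derived not in existing_ids:
        return derived
    suffix = 2
    while f"{derived}{suffix}" in existing_ids:
        suffix += 1
    return f"{derived}{suffix}"
```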
…ction
Two successive 'add /body --type header --prop kind=default' calls used to silently produce a sectPr with two <w:headerReference type=default> entries pointing at different header parts. OOXML allows at most one reference per type per section (default | first | even). Before appending a new reference, check the section's existing references for a matching type and throw ArgumentException pointing the caller at 'remove the existing one first or use --prop type=<first|even>'. Mirrors the R18/R19/R20 dup-rejection pattern.
…ToString
The round-21 dup-reference rejection interpolated preHeaderType /
preFooterType directly into the error message. These are
HeaderFooterValues values whose default ToString emits
'HeaderFooterValues { }', which is unhelpful for users. Added a small
HeaderFooterTypeName helper that maps the three possible values back
to 'default', 'first', or 'even' and passes the result through the
message templates.
Generalize ApplySlideBackground to accept SlidePart, SlideLayoutPart, or SlideMasterPart — all three share the same p:bg/p:bgPr schema. Query for /slidemaster[N] and /slidelayout[N] now also reports Format["background"], so Set/Get round-trips across all three container types. Supports all existing background values (solid, gradient, image, none) on masters and layouts without new syntax.
Get and Set both called ParsePath before reaching the /formfield[N|name] regex dispatch. ParsePath's generic predicate validator only accepts positive-integer / last() / [@attr=value], so the documented /formfield[name] form (used by formfield-by-name lookup) was rejected with 'Malformed path segment' before the special-case router could fire. Move the /formfield[...] branch above ParsePath in both WordHandler.Query and WordHandler.Set. Remove the now-dead duplicate blocks further down.
…Child AddPicture and AddOle already have explicit TableCell-parent branches (Add.Media.cs) that wrap the inline run in a Paragraph before appending, satisfying the OOXML block-only rule for <w:tc>. But ValidateParentChild rejected picture/ole under TableCell up front, making that wrap code unreachable. Whitelist picture/image/img/ole/oleobject/object/embed in the TableCell branch so the wrap helpers can actually run.
`add --type table --parent /body/p[N] --after find:X` previously:
1. ValidateParentChild rejected 'block type under paragraph' up front.
2. Even if that was bypassed, AddAtFindPosition's block branch silently degraded to 'insert as sibling of the paragraph' (commit e846b16), so the table landed at the end of the whole paragraph, ignoring the caller's find: position when the anchor sat mid-paragraph.
Neither matched Word's native 'cursor mid-sentence → Insert → Table' behavior nor the literal semantics of --after find:X.
- ValidateParentChild now takes the InsertPosition and lets block-type adds through under a paragraph parent when a find: anchor is present; error message points the non-find: case at /body.
- AddAtFindPosition's block branch now:
  * inserts as a sibling when the anchor lands on a paragraph boundary (splitPoint == 0 or == total length), with no destructive split;
  * calls the new SplitParagraphAtOffset helper when the anchor is mid-paragraph, producing head paragraph + new block + tail paragraph, with pPr cloned onto the tail so style/numbering/heading are preserved on both halves.
Also reverses the 'do NOT split' comment introduced in e846b16: the destructive-split concern only applies when the system autonomously decides to split; when the caller explicitly names a mid-paragraph anchor, honoring their position is the correct behavior (matching Word's native Enter-key split on paragraph properties).
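The boundary-versus-split behavior described above can be illustrated with a toy model. This Python sketch is not the project's SplitParagraphAtOffset: the `Paragraph` class and `insert_block_at_offset` are hypothetical, runs are modeled as plain strings, and placing the block before the paragraph at offset 0 and after it at the full length is an assumption.

```python
from copy import deepcopy
from dataclasses import dataclass


@dataclass
class Paragraph:
    ppr: dict   # stand-in for paragraph properties (style, numbering, ...)
    runs: list  # stand-in for runs: a list of text strings


def insert_block_at_offset(body, index, offset, block):
    """Insert `block` at character `offset` of the paragraph at body[index]."""
    para = body[index]
    total = sum(len(text) for text in para.runs)

    # Paragraph boundary: insert as a sibling, no split.
    if offset <= 0:
        body.insert(index, block)
        return
    if offset >= total:
        body.insert(index + 1, block)
        return

    # Mid-paragraph: distribute runs into head and tail, cutting one run if needed.
    head, tail, remaining = [], [], offset
    for text in para.runs:
        if remaining == 0:
            tail.append(text)
        elif remaining >= len(text):
            head.append(text)
            remaining -= len(text)
        else:
            head.append(text[:remaining])
            tail.append(text[remaining:])
            remaining = 0

    # The tail gets its own copy of the properties so both halves keep the style.
    body[index:index + 1] = [
        Paragraph(para.ppr, head),
        block,
        Paragraph(deepcopy(para.ppr), tail),
    ]


body = [Paragraph({"style": "Heading1"}, ["Hello ", "world"])]
insert_block_at_offset(body, 0, 8, "TABLE")
# body is now: Paragraph(["Hello ", "wo"]), "TABLE", Paragraph(["rld"])
```

Copying the properties rather than sharing them means a later edit to one half does not silently change the other.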
Add background.mode (stretch/tile/center), background.alpha (0..100), and background.scale (1..500) as canonical dot-keys paired with background=image:. Stretch stays the default so bare background=image:/path behaves identically.
- tile: <a:tile sx=sy=scale*1000 algn=tl flip=none>
- center: <a:tile sx=sy=100000 algn=ctr> (LibreOffice NO_REPEAT convention)
- alpha: <a:alphaModFix amt=alpha*1000> inside <a:blip>
Get round-trips only non-default values: no background.mode for stretch, no background.alpha for opaque. Works on /slide[N], /slidemaster[N], and /slidelayout[N]. background.mode/alpha/scale without a paired background key throws; invalid mode/alpha/scale ranges throw.
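The unit conversions above (percent values multiplied by 1000) can be checked with a short Python sketch. The function names are hypothetical and the default `scale=100` is an assumption; only the attribute values and ranges come from the commit message.

```python
from typing import Optional


def tile_attributes(mode: str, scale: int = 100) -> Optional[dict]:
    """Return the <a:tile> attributes for a background mode, or None for stretch."""
    if mode == "stretch":
        return None  # default mode: no <a:tile> element is involved
    if mode == "center":
        return {"sx": 100000, "sy": 100000, "algn": "ctr"}
    if mode == "tile":
        if not 1 <= scale <= 500:
            raise ValueError("background.scale must be in 1..500")
        return {"sx": scale * 1000, "sy": scale * 1000, "algn": "tl", "flip": "none"}
    raise ValueError(f"unknown background.mode: {mode!r}")


def alpha_mod_fix_amount(alpha: int) -> int:
    """Convert background.alpha (0..100 percent) to the amt value of <a:alphaModFix>."""
    if not 0 <= alpha <= 100:
        raise ValueError("background.alpha must be in 0..100")
    return alpha * 1000


half_size_tile = tile_attributes("tile", scale=50)   # sx = sy = 50000
forty_percent = alpha_mod_fix_amount(40)             # amt = 40000
```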
PR3 lifts two known limitations of the background feature:
1. background.mode/alpha/scale can now be set without re-supplying background=image:<path>. The existing Blip.Embed rel is preserved so the image part is neither duplicated nor orphaned. Mutating alpha/mode against a solid/gradient background or no background throws a clear error directing the user to set background=image:<path> first.
2. Get now accepts /slidemaster[N]/slidelayout[M] in addition to bare /slidelayout[N], so Set and Get are symmetric. The nested path is returned verbatim in the node's Path field.
… preview
- Track CommentRangeStart/End to open/close highlight spans
- New GetCommentDisplayHtml renders comment author, date, initials as tooltip
- CSS hover rule shows tooltip overlay on highlighted text
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
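Thanks for the contribution! The direction of this feature is right and the implementation is mostly there. Two things need to be addressed before merging:
1. Please add a screenshot (CONTRIBUTING Rule 2)
This is a purely visual feature, and CONTRIBUTING.md Rule 2 requires a feature PR to include at least one screenshot showing the result. Hover over highlighted text in the HTML preview, capture the tooltip, and paste the screenshot into the PR description.
2. Comments that span paragraphs break the HTML
OOXML allows `CommentRangeStart/End` to span paragraphs (for example, a comment that starts at a word in paragraph 1 and ends at a word in paragraph 3). In the current implementation, `commentDepth` is a local variable inside the `RenderParagraphContentHtml` method, so when a range spans paragraphs:
Suggested fix: promote `commentDepth` to a renderer-level field (member variable). When a paragraph finishes rendering with `commentDepth > 0`, emit a closing `</span>`; when the next paragraph starts rendering with `commentDepth > 0`, first emit an opening `<span class="comment-highlight">` (without the tooltip, since the tooltip only needs to appear once, in the paragraph where the range starts).
Repro: in Word, select text that spans two paragraphs, right-click to add a comment, save, then view the HTML with `officecli watch`; the DOM comes out broken.
The rest (performance optimization with a Dictionary cache, CSS layering, tests, and so on) is cleaned up by a maintainer after merge under the CONTRIBUTING conventions, so you don't need to handle it. Once the two points above are addressed, I'll merge.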
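The suggested fix can be sketched in Python as a rough model of the idea, not the project's C# renderer. `CommentHighlightRenderer` and the token format are hypothetical, the tooltip markup is omitted, and closing one `</span>` per open range (rather than exactly one) is an assumption made to cover nested ranges.

```python
import html


class CommentHighlightRenderer:
    OPEN = '<span class="comment-highlight">'
    CLOSE = "</span>"

    def __init__(self):
        # Renderer-level state: survives from one paragraph to the next.
        self.comment_depth = 0

    def render_paragraph(self, tokens):
        """tokens: list of ("text", str), ("start", id) or ("end", id) tuples."""
        out = ["<p>"]
        # Reopen highlights (without tooltip) for ranges carried over from earlier paragraphs.
        out.append(self.OPEN * self.comment_depth)
        for kind, value in tokens:
            if kind == "text":
                out.append(html.escape(value))
            elif kind == "start":
                self.comment_depth += 1
                out.append(self.OPEN)
            elif kind == "end" and self.comment_depth > 0:
                self.comment_depth -= 1
                out.append(self.CLOSE)
        # Close anything still open so each <p> is well-formed on its own.
        out.append(self.CLOSE * self.comment_depth)
        out.append("</p>")
        return "".join(out)


renderer = CommentHighlightRenderer()
first = renderer.render_paragraph([("text", "a "), ("start", 0), ("text", "b")])
second = renderer.render_paragraph([("text", "c"), ("end", 0), ("text", " d")])
# first  == '<p>a <span class="comment-highlight">b</span></p>'
# second == '<p><span class="comment-highlight">c</span> d</p>'
```

A depth counter handles nested ranges; ranges that overlap without nesting would need per-comment tracking, which this sketch does not attempt.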
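@goworm You gave an example of a comment that spans paragraphs. I happen to need comments that span paragraphs or runs, but I couldn't find how to do that with the CLI. From the code, it looks like it isn't supported.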
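Also, a comment's commentReference is wrapped in its own run. When you get a paragraph, that run is ignored, but when you insert a new run into that paragraph after the comment, the returned run id does not skip the run that holds the commentReference. Sigh. In short, adding comments to a docx is really complicated.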
Summary
Verification
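Build
dotnet build src/officecli/officecli.csproj --no-restore
Generate a Word document with a comment and preview the HTML
officecli blank test.docx
Add a comment in Word and save, then preview with the following command
officecli watch test.docx
Hover over the yellow highlighted text to see the tooltip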