
edges table stores wikilinks as undirected (bidirectional insert) — asymmetric-link detection broken #36

@jdubdevs

Summary

When source file A contains a wikilink [[B]], the indexer inserts BOTH (from_file=A, to_file=B, edge_type='wikilink') AND (from_file=B, to_file=A, edge_type='wikilink') into edges — even when B's content has zero wikilinks to A. This makes any directional / asymmetric query against edges unreliable.

Environment

  • engraph 1.6.1
  • macOS 26 / Apple Silicon
  • Vault: 1,646 markdown files; 10,266 edges in stored graph

Reproduction

In my vault, CLAUDE.md contains [[synthesis/positions/coaching-approach]]. The target file synthesis/positions/coaching-approach.md does NOT contain any wikilink to CLAUDE.md:

grep -c "\[\[CLAUDE" synthesis/positions/coaching-approach.md
# Returns: 0

But engraph stores edges in BOTH directions:

SELECT f1.path AS from_file, f2.path AS to_file, e.edge_type
FROM edges e
JOIN files f1 ON e.from_file = f1.id
JOIN files f2 ON e.to_file = f2.id
WHERE (f1.path LIKE '%coaching-approach%' AND f2.path = 'CLAUDE.md')
   OR (f1.path = 'CLAUDE.md' AND f2.path LIKE '%coaching-approach%');

Returns two rows:

CLAUDE.md                                    | synthesis/positions/coaching-approach.md | wikilink
synthesis/positions/coaching-approach.md     | CLAUDE.md                                | wikilink

The second row shouldn't exist — coaching-approach.md has no wikilink to CLAUDE.md.

Broader pattern: querying outgoing edges from any file returns its actual wikilink targets PLUS every file that links INTO it.

Expected behavior

For wikilink [[B]] in source file A, the indexer should insert one directed edge: (from_file=A, to_file=B, edge_type='wikilink'). The reverse direction should only exist if B's content actually contains a wikilink to A. This preserves directed-graph semantics — essential for any "what does X reference" / "what references X" / asymmetric-link query.
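The expected behavior can be sketched as follows. This is a minimal illustration, not engraph's actual indexer: it assumes a SQLite-style store, and apart from the column names and the UNIQUE constraint quoted in this issue, everything in it (the `files` layout, the `WIKILINK` regex, `file_id`, `index_file`) is hypothetical.

```python
import re
import sqlite3

# Hypothetical minimal schema: only from_file / to_file / edge_type and the
# UNIQUE constraint come from the issue; the rest is assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL);
    CREATE TABLE edges (
        from_file INTEGER NOT NULL REFERENCES files(id),
        to_file   INTEGER NOT NULL REFERENCES files(id),
        edge_type TEXT NOT NULL,
        UNIQUE(from_file, to_file, edge_type)
    );
""")

# Capture the link target, stopping before any |alias or #heading suffix.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def file_id(path):
    """Return the id for path, creating the files row if needed."""
    conn.execute("INSERT OR IGNORE INTO files (path) VALUES (?)", (path,))
    return conn.execute(
        "SELECT id FROM files WHERE path = ?", (path,)
    ).fetchone()[0]

def index_file(path, content):
    """Insert exactly one directed edge per wikilink found in content."""
    src = file_id(path)
    for target in WIKILINK.findall(content):
        dst = file_id(target.strip() + ".md")
        # Source -> target only; no mirrored insert for the reverse direction.
        conn.execute(
            "INSERT OR IGNORE INTO edges VALUES (?, ?, 'wikilink')", (src, dst)
        )

index_file("CLAUDE.md", "See [[synthesis/positions/coaching-approach]].")
index_file("synthesis/positions/coaching-approach.md", "No links back here.")

rows = conn.execute("""
    SELECT f1.path, f2.path FROM edges e
    JOIN files f1 ON e.from_file = f1.id
    JOIN files f2 ON e.to_file = f2.id
""").fetchall()
print(rows)
```

With directed inserts, the only stored edge runs from CLAUDE.md to coaching-approach.md; the reverse row from the reproduction above would appear only if the target file itself contained a link back.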

Actual behavior

Both directions are inserted. The schema's UNIQUE(from_file, to_file, edge_type) constraint prevents duplicate insertion of the same directed edge, but it cannot help here: the indexer inserts two distinct directed edges (A→B AND B→A) per wikilink found.

Impact

  • "What does file X reference?" returns false positives (every file that links INTO X also appears as outgoing)
  • Asymmetric-link detection (NOT EXISTS (reverse edge)) is structurally always-false — useless via engraph
  • Outgoing-edge counts are inflated by inbound references
  • Directional graph-traversal queries follow reverse edges that have no basis in file content, so their results cannot be trusted
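To illustrate the asymmetric-link point, here is a small sketch that runs the same NOT EXISTS query against two toy edge sets: one stored as directed edges, one stored with the mirrored rows described in this issue. The table is a stripped-down stand-in (integer ids only) and the helper `asymmetric_links` is hypothetical; it is not taken from engraph.

```python
import sqlite3

# An edge is asymmetric when no reverse edge of the same type exists.
ASYMMETRIC = """
    SELECT e.from_file, e.to_file FROM edges e
    WHERE e.edge_type = 'wikilink'
      AND NOT EXISTS (
          SELECT 1 FROM edges r
          WHERE r.from_file = e.to_file
            AND r.to_file   = e.from_file
            AND r.edge_type = e.edge_type)
"""

def asymmetric_links(edge_rows):
    """Load (from_file, to_file) pairs into a scratch DB and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edges (from_file INTEGER, to_file INTEGER, "
        "edge_type TEXT, UNIQUE(from_file, to_file, edge_type))"
    )
    conn.executemany("INSERT INTO edges VALUES (?, ?, 'wikilink')", edge_rows)
    return conn.execute(ASYMMETRIC).fetchall()

# File 1 links to file 2; file 2 does not link back.
directed = asymmetric_links([(1, 2)])          # one row per real wikilink
mirrored = asymmetric_links([(1, 2), (2, 1)])  # storage pattern reported here

print(directed)
print(mirrored)
```

Under directed storage the query reports the one-way link; under mirrored storage every edge has a reverse partner by construction, so the result is empty regardless of what the files contain.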

Suggested investigation

In the wikilink-extraction → edge-insert code path, is there a second insert call for the reverse direction? This might be intentional design (treating Obsidian's auto-backlink display as symmetric edges) but it breaks the documented schema's from_file / to_file directionality.

If bidirectional storage is intentional (matching Obsidian's graph-view semantics), one fix shape: add a separate wikilink_undirected edge_type for that interpretation while keeping wikilink as strictly directional from source-content. This preserves both consumers.
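One possible shape for that split, sketched under assumptions: `wikilink_undirected` is only the name proposed in this issue, `add_wikilink` is a hypothetical helper, and the table is a simplified stand-in for the real schema. Each wikilink produces one directed `wikilink` row plus a symmetric pair of `wikilink_undirected` rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE edges (from_file INTEGER, to_file INTEGER, "
    "edge_type TEXT, UNIQUE(from_file, to_file, edge_type))"
)

def add_wikilink(src, dst):
    """Store the directed link plus a symmetric pair under a separate type."""
    rows = [
        (src, dst, "wikilink"),             # strictly source -> target
        (src, dst, "wikilink_undirected"),  # graph-view style neighbourhood
        (dst, src, "wikilink_undirected"),
    ]
    # The UNIQUE constraint makes repeated calls idempotent.
    conn.executemany("INSERT OR IGNORE INTO edges VALUES (?, ?, ?)", rows)

add_wikilink(1, 2)  # file 1 contains a wikilink to file 2

# What does file 2 reference? Nothing: it has no outgoing directed edge.
outgoing = conn.execute(
    "SELECT to_file FROM edges WHERE from_file = ? AND edge_type = 'wikilink'",
    (2,),
).fetchall()

# Which files is file 2 connected to in either direction?
neighbours = conn.execute(
    "SELECT to_file FROM edges "
    "WHERE from_file = ? AND edge_type = 'wikilink_undirected'",
    (2,),
).fetchall()

print(outgoing, neighbours)
```

The trade-off is roughly three rows per link instead of one; an alternative would be to store only directed edges and derive the undirected view at query time with a UNION over both directions.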

I'm planning to fork + investigate the source for a fix. Happy to send a PR if the cause is tractable.
