
VLE cache: replace snapshot invalidation with per-graph #2376

Open
jrgemignani wants to merge 1 commit into apache:master from jrgemignani:update_vle_cache_and_hash


@jrgemignani
Contributor

Replace AGE's snapshot-based VLE cache invalidation with per-graph monotonic version counters in shared memory. The old code compared PostgreSQL's global xmin/xmax/curcid, causing false cache invalidation whenever ANY transaction ran on the server — even unrelated ones. This forced a full hash table rebuild (~138s at SF3) on every VLE query in any multi-connection environment.

The fix uses three invalidation paths with automatic detection:

  • DSM (PG 17+): GetNamedDSMSegment — works without shared_preload_libraries
  • SHMEM (PG <17): shmem_request/startup hooks — needs shared_preload_libraries
  • SNAPSHOT: fallback to original behavior when shared memory unavailable
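
The detection order above can be sketched as a small decision function. This is only an illustration: the enum, the function name, and the `preloaded` flag are hypothetical and do not come from the PR, and part of the real selection happens at compile time rather than at run time. The sketch also simplifies the fallback rule to "PG <17 without preload"; the PR falls back whenever shared memory is unavailable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: the PR does not show its actual enum or function. */
typedef enum {
    DEMO_INVAL_DSM,      /* PG 17+: named DSM segment, no preload needed */
    DEMO_INVAL_SHMEM,    /* PG <17: shmem hooks, needs shared_preload_libraries */
    DEMO_INVAL_SNAPSHOT  /* fallback: original snapshot comparison */
} DemoInvalMode;

/*
 * pg_version_num follows the PG_VERSION_NUM convention (170000 means 17.0).
 * preloaded says whether the library was loaded through
 * shared_preload_libraries, so that its shmem hooks had a chance to run.
 */
DemoInvalMode
demo_pick_inval_mode(int pg_version_num, bool preloaded)
{
    assert(pg_version_num > 0);

    if (pg_version_num >= 170000)
        return DEMO_INVAL_DSM;
    if (preloaded)
        return DEMO_INVAL_SHMEM;
    return DEMO_INVAL_SNAPSHOT;
}
```

For example, `demo_pick_inval_mode(160004, false)` yields `DEMO_INVAL_SNAPSHOT`, which corresponds to a PG 16 server where AGE is loaded on demand.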

Version counter increment points:

  • Cypher CREATE/DELETE/SET/MERGE via executor hooks
  • SQL INSERT/UPDATE/DELETE via auto-installed per-table triggers
  • TRUNCATE via ProcessUtility hook interception
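
A minimal sketch of the idea, assuming one counter per graph and a per-backend cache that remembers the counter value it was built against. All struct and function names here are hypothetical, and C11 atomics stand in for whatever primitives the PR uses in shared memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; one of these would live in shared memory per graph. */
typedef struct {
    uint32_t graph_oid;
    _Atomic uint64_t version;    /* only ever increases */
} DemoGraphVersion;

/* Hypothetical per-backend cache header. */
typedef struct {
    uint32_t graph_oid;
    uint64_t built_at_version;   /* counter value when the cache was built */
    bool     valid;
} DemoVleCache;

/* Called after any mutation of the graph (Cypher, SQL trigger, TRUNCATE). */
void demo_increment_graph_version(DemoGraphVersion *gv)
{
    atomic_fetch_add(&gv->version, 1);
}

/* Record that the cache was (re)built against the current counter value. */
void demo_cache_mark_built(DemoVleCache *cache, DemoGraphVersion *gv)
{
    cache->graph_oid = gv->graph_oid;
    cache->built_at_version = atomic_load(&gv->version);
    cache->valid = true;
}

/* The cache is reusable only if this graph's counter has not moved. */
bool demo_cache_is_current(const DemoVleCache *cache, DemoGraphVersion *gv)
{
    assert(cache != NULL && gv != NULL);
    return cache->valid
        && cache->graph_oid == gv->graph_oid
        && cache->built_at_version == atomic_load(&gv->version);
}
```

Because the comparison only looks at the counter of the graph being queried, a write to a different graph, or an unrelated transaction elsewhere on the server, leaves the cache usable.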

Additional optimizations:

  • Thin entries: vertex/edge hash table entries store 6-byte TID instead of copied property Datum; properties fetched on demand via heap_fetch only during result construction. Reduces hash table memory by ~77%.
  • Fast path in is_an_edge_match: skip property access for label-only VLE patterns (e.g., [:KNOWS*1..2]), avoiding unnecessary heap_fetch during DFS traversal.
  • Defensive elog(ERROR) on stale TID in lazy property fetch to catch invalidation logic bugs.
  • Trigger install is conditional — checks if the trigger function exists in the catalog before attempting installation, ensuring backward compatibility with older extension SQL versions.
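
To make the "6-byte TID" concrete, here is a rough sketch of what a thin entry could look like. `DemoTid` mirrors the shape of PostgreSQL's `ItemPointerData` (three 16-bit fields); `DemoThinEdgeEntry` and its field names are hypothetical and are not taken from the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the shape of PostgreSQL's ItemPointerData: three 16-bit fields. */
typedef struct {
    uint16_t block_hi;   /* high half of the heap block number */
    uint16_t block_lo;   /* low half of the heap block number */
    uint16_t offset;     /* line pointer index within the block */
} DemoTid;

_Static_assert(sizeof(DemoTid) == 6, "TID stand-in should occupy 6 bytes");

/* Hypothetical thin edge entry: ids plus a TID, no copied property Datum. */
typedef struct {
    int64_t  edge_id;
    int64_t  start_vertex_id;
    int64_t  end_vertex_id;
    uint32_t label_oid;
    DemoTid  tid;        /* where to find the tuple when properties are needed */
} DemoThinEdgeEntry;

/* Pack a 32-bit block number and an offset into the 6-byte form. */
DemoTid demo_tid_make(uint32_t block, uint16_t offset)
{
    DemoTid tid;

    tid.block_hi = (uint16_t) (block >> 16);
    tid.block_lo = (uint16_t) (block & 0xFFFF);
    tid.offset = offset;
    return tid;
}

/* Recover the 32-bit block number from the two halves. */
uint32_t demo_tid_block(DemoTid tid)
{
    return ((uint32_t) tid.block_hi << 16) | tid.block_lo;
}
```

The entry has a fixed size regardless of how large the edge's properties are, which is where the memory saving comes from; the cost is one heap lookup per entry whose properties are actually returned.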

Test results (LDBC SNB benchmark, SF3 — 52.7M edges, 9.3M vertices):

Production simulation (VLE with concurrent background transactions):
Before: 177,188 ms avg per query (full rebuild every time)
After: 15.7 ms avg per query (cache hit)
Speedup: 11,299x

Cold build time:
Before: 186,275 ms
After: 108,955 ms (41% faster — no datumCopy)

LDBC IC1 warm (3-hop VLE, single session):
Before: 219,385 ms
After: 175,249 ms (20% faster — better cache utilization)

Hash table memory (SF3):
Before: ~9 GB
After: ~2.1 GB (77% reduction)
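
As a quick cross-check, the ratios can be recomputed from the rounded figures listed above. The recomputed speedup comes out slightly below the reported 11,299x, which would be consistent with the reported value being derived from unrounded timings; that is an assumption, not something the PR states.

```latex
\begin{align*}
\text{Speedup} &= \frac{177{,}188\ \text{ms}}{15.7\ \text{ms}} \approx 11{,}286\times \\
\text{Cold build reduction} &= 1 - \frac{108{,}955}{186{,}275} \approx 0.415 \quad (\approx 41\%) \\
\text{IC1 warm reduction} &= 1 - \frac{175{,}249}{219{,}385} \approx 0.201 \quad (\approx 20\%) \\
\text{Memory reduction} &= 1 - \frac{2.1\ \text{GB}}{9\ \text{GB}} \approx 0.767 \quad (\approx 77\%)
\end{align*}
```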

New regression tests in age_global_graph.sql verify VLE cache invalidation after CREATE, DELETE, and SET operations, plus thin entry property fetch.

Regression tests: 32/32 pass

Files changed (14):
src/backend/age.c — shmem hook registration (PG <17)
src/backend/catalog/ag_catalog.c — TRUNCATE interception
src/backend/commands/label_commands.c — conditional trigger auto-install on label creation
src/backend/executor/cypher_create.c — increment_graph_version after CREATE
src/backend/executor/cypher_delete.c — increment_graph_version after DELETE
src/backend/executor/cypher_merge.c — increment_graph_version after MERGE
src/backend/executor/cypher_set.c — increment_graph_version after SET
src/backend/utils/adt/age_global_graph.c — version counter, thin entries, trigger fn, lazy fetch
src/backend/utils/adt/age_vle.c — is_an_edge_match fast path
src/include/utils/age_global_graph.h — new declarations
sql/age_main.sql — trigger function registration for next-version SQL
regress/sql/age_global_graph.sql — VLE cache regression tests
regress/expected/age_global_graph.out — expected output for new tests
age--1.7.0--y.y.y.sql — upgrade template: trigger function for existing installs


Copilot AI left a comment

Pull request overview

This PR updates Apache AGE’s VLE (variable-length edge) cache invalidation to be graph-specific, avoiding server-wide false invalidations, and reduces VLE cache memory by switching vertex/edge cache entries to “thin” TID-based storage with lazy property fetch.

Changes:

  • Replace snapshot-based invalidation with per-graph monotonic version counters backed by DSM (PG17+), SHMEM hooks (PG<17 + shared_preload_libraries), or snapshot fallback.
  • Reduce VLE cache memory by storing tuple TIDs in the cache and fetching properties lazily when constructing results.
  • Add a VLE edge-match fast path and new regression tests for invalidation + thin-entry property fetching.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| src/include/utils/age_global_graph.h | Exposes graph version counter APIs and SHMEM init hooks. |
| src/backend/utils/adt/age_vle.c | Adds label-only fast path in edge matching to avoid property access. |
| src/backend/utils/adt/age_global_graph.c | Implements version counters + thin entries + lazy property fetch + trigger function. |
| src/backend/executor/cypher_set.c | Increments per-graph version on SET mutations. |
| src/backend/executor/cypher_merge.c | Increments per-graph version when MERGE creates new paths. |
| src/backend/executor/cypher_delete.c | Increments per-graph version on DELETE mutations. |
| src/backend/executor/cypher_create.c | Increments per-graph version on CREATE mutations. |
| src/backend/commands/label_commands.c | Conditionally installs SQL mutation invalidation triggers on new label tables. |
| src/backend/catalog/ag_catalog.c | Intercepts TRUNCATE to invalidate affected graph caches. |
| src/backend/age.c | Registers SHMEM hooks for PG<17 to enable shared invalidation state. |
| sql/age_main.sql | Registers the trigger function in the extension SQL. |
| regress/sql/age_global_graph.sql | Adds regression coverage for invalidation + thin-entry behavior. |
| regress/expected/age_global_graph.out | Adds expected output for the new regression cases. |
| age--1.7.0--y.y.y.sql | Upgrade template adds the trigger function for existing installs. |

Replace AGE's snapshot-based VLE cache invalidation with per-graph
monotonic version counters in shared memory. The old code compared
PostgreSQL's global xmin/xmax/curcid, causing false cache invalidation
whenever ANY transaction ran on the server — even unrelated ones. This
forced a full hash table rebuild (~138s at SF3) on every VLE query in
any multi-connection environment.

The fix uses three invalidation paths with automatic detection:
- DSM (PG 17+): GetNamedDSMSegment — works without shared_preload_libraries
- SHMEM (PG <17): shmem_request/startup hooks — needs shared_preload_libraries;
  functions conditionally compiled via #if PG_VERSION_NUM < 170000
- SNAPSHOT: fallback to original behavior when shared memory unavailable

Version counter increment points:
- Cypher CREATE/DELETE/SET/MERGE via executor hooks
- SQL INSERT/UPDATE/DELETE via auto-installed per-table triggers
- TRUNCATE via ProcessUtility hook interception

New slot allocation in the version counter array uses pg_write_barrier()
before incrementing num_entries to ensure entry visibility on
weak memory-ordering architectures (e.g., ARM).
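
The publish-after-barrier pattern described here might look roughly like the following. This is a sketch under stated assumptions: C11 fences stand in for `pg_write_barrier()` and its read-side counterpart, writers are assumed to be serialized by a lock that is not shown, and every type and function name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_GRAPHS 64   /* hypothetical capacity */

typedef struct {
    uint32_t graph_oid;
    _Atomic uint64_t version;
} DemoSlot;

typedef struct {
    _Atomic uint32_t num_entries;   /* readers scan slots [0, num_entries) */
    DemoSlot slots[DEMO_MAX_GRAPHS];
} DemoVersionArray;

/*
 * Append a slot. Assumes the caller holds whatever lock serializes writers;
 * the barrier exists for readers that scan without that lock.
 * Returns the slot index, or -1 if the array is full.
 */
int demo_add_slot(DemoVersionArray *arr, uint32_t graph_oid)
{
    uint32_t n;

    assert(arr != NULL);
    n = atomic_load_explicit(&arr->num_entries, memory_order_relaxed);
    if (n >= DEMO_MAX_GRAPHS)
        return -1;

    /* 1. Fill in the new slot while it is still invisible to readers. */
    arr->slots[n].graph_oid = graph_oid;
    atomic_store_explicit(&arr->slots[n].version, 0, memory_order_relaxed);

    /* 2. Stand-in for pg_write_barrier(): slot stores happen before the count. */
    atomic_thread_fence(memory_order_release);

    /* 3. Publish the slot by bumping the count. */
    atomic_store_explicit(&arr->num_entries, n + 1, memory_order_relaxed);
    return (int) n;
}

/* Reader side: load the count, fence, then scan. Returns index or -1. */
int demo_find_slot(DemoVersionArray *arr, uint32_t graph_oid)
{
    uint32_t n;

    assert(arr != NULL);
    n = atomic_load_explicit(&arr->num_entries, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);

    for (uint32_t i = 0; i < n; i++)
        if (arr->slots[i].graph_oid == graph_oid)
            return (int) i;
    return -1;
}
```

Without step 2, a CPU with weak ordering could make the new `num_entries` visible before the slot contents, so a reader might scan a slot whose `graph_oid` has not been written yet.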

Additional optimizations:
- Thin entries: vertex/edge hash table entries store 6-byte TID instead of
  copied property Datum; properties fetched on demand via heap_fetch only
  during result construction. Reduces hash table memory by ~77%.
- Fast path in is_an_edge_match: skip property access for label-only VLE
  patterns (e.g., [:KNOWS*1..2]). When property constraints are present,
  edge properties are fetched once and cached locally to avoid duplicate
  heap access.
- Defensive elog(ERROR) on stale TID in lazy property fetch to catch
  invalidation logic bugs.
- Trigger install is conditional — checks if the trigger function exists
  in the catalog before attempting installation, ensuring backward
  compatibility with older extension SQL versions.
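
The label-only fast path and the fetch-once behavior can be illustrated with a toy model. Nothing here reflects AGE's real types: properties are modeled as a plain string, constraints as substrings, and a counter simulates heap fetches. The `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; AGE's real types are agtype-based. */
typedef struct {
    const char *label;
    const char *properties;    /* what a lazy heap fetch would return */
    int         fetch_count;   /* counts simulated heap fetches */
} DemoEdge;

typedef struct {
    const char *label;              /* required label, e.g. "KNOWS" */
    const char *required_props[2];  /* NULL entries mean "no constraint" */
} DemoPattern;

/* Simulates the on-demand property fetch behind a thin entry. */
const char *demo_fetch_properties(DemoEdge *edge)
{
    edge->fetch_count++;
    return edge->properties;
}

bool demo_is_an_edge_match(DemoEdge *edge, const DemoPattern *pat)
{
    const char *props = NULL;   /* local cache: fetched at most once */

    assert(edge != NULL && pat != NULL);

    if (strcmp(edge->label, pat->label) != 0)
        return false;

    /* Fast path: label-only pattern, never touch the properties. */
    if (pat->required_props[0] == NULL && pat->required_props[1] == NULL)
        return true;

    for (int i = 0; i < 2; i++)
    {
        if (pat->required_props[i] == NULL)
            continue;
        if (props == NULL)
            props = demo_fetch_properties(edge);   /* first constraint only */
        if (strstr(props, pat->required_props[i]) == NULL)
            return false;
    }
    return true;
}
```

With a pattern such as `[:KNOWS*1..2]` the function returns before any fetch; with two property constraints it fetches once and reuses the result for the second check.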

Test results (LDBC SNB benchmark, SF3 — 52.7M edges, 9.3M vertices):

  Production simulation (VLE with concurrent background transactions):
    Before: 177,188 ms avg per query (full rebuild every time)
    After:      15.7 ms avg per query (cache hit)
    Speedup: 11,299x

  Cold build time:
    Before: 186,275 ms
    After:  108,955 ms (41% faster — no datumCopy)

  LDBC IC1 warm (3-hop VLE, single session):
    Before: 219,385 ms
    After:  175,249 ms (20% faster — better cache utilization)

  Hash table memory (SF3):
    Before: ~9 GB
    After:  ~2.1 GB (77% reduction)

New regression tests in age_global_graph.sql verify:
- VLE cache invalidation after CREATE (path extends)
- VLE cache invalidation after DELETE (path shrinks)
- VLE cache invalidation after SET (property updated via lazy fetch)
- VLE edge property fetch via full path return (weight values in path)
- VLE edge property fetch via UNWIND + relationships() (individual weights)

Regression tests: 32/32 pass

Files changed (14):
  src/backend/age.c                        — shmem hook registration (PG <17)
  src/backend/catalog/ag_catalog.c         — TRUNCATE interception
  src/backend/commands/label_commands.c    — conditional trigger auto-install on label creation
  src/backend/executor/cypher_create.c     — increment_graph_version after CREATE
  src/backend/executor/cypher_delete.c     — increment_graph_version after DELETE
  src/backend/executor/cypher_merge.c      — increment_graph_version after MERGE
  src/backend/executor/cypher_set.c        — increment_graph_version after SET
  src/backend/utils/adt/age_global_graph.c — version counter, thin entries, trigger fn, lazy fetch
  src/backend/utils/adt/age_vle.c          — is_an_edge_match fast path, cached edge property fetch
  src/include/utils/age_global_graph.h     — conditional declarations
  sql/age_main.sql                         — trigger function registration for next-version SQL
  regress/sql/age_global_graph.sql         — VLE cache regression tests
  regress/expected/age_global_graph.out    — expected output for new tests
  age--1.7.0--y.y.y.sql                    — upgrade template: trigger function for existing installs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jrgemignani jrgemignani force-pushed the update_vle_cache_and_hash branch from 214cf00 to 98207f9 Compare April 9, 2026 22:10

2 participants