VLE cache: replace snapshot invalidation with per-graph #2376
Open
jrgemignani wants to merge 1 commit into apache:master from
Pull request overview
This PR updates Apache AGE’s VLE (variable-length edge) cache invalidation to be graph-specific, avoiding server-wide false invalidations, and reduces VLE cache memory by switching vertex/edge cache entries to “thin” TID-based storage with lazy property fetch.
Changes:
- Replace snapshot-based invalidation with per-graph monotonic version counters backed by DSM (PG17+), SHMEM hooks (PG<17 + shared_preload_libraries), or snapshot fallback.
- Reduce VLE cache memory by storing tuple TIDs in the cache and fetching properties lazily when constructing results.
- Add a VLE edge-match fast path and new regression tests for invalidation + thin-entry property fetching.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| src/include/utils/age_global_graph.h | Exposes graph version counter APIs and SHMEM init hooks. |
| src/backend/utils/adt/age_vle.c | Adds label-only fast path in edge matching to avoid property access. |
| src/backend/utils/adt/age_global_graph.c | Implements version counters + thin entries + lazy property fetch + trigger function. |
| src/backend/executor/cypher_set.c | Increments per-graph version on SET mutations. |
| src/backend/executor/cypher_merge.c | Increments per-graph version when MERGE creates new paths. |
| src/backend/executor/cypher_delete.c | Increments per-graph version on DELETE mutations. |
| src/backend/executor/cypher_create.c | Increments per-graph version on CREATE mutations. |
| src/backend/commands/label_commands.c | Conditionally installs SQL mutation invalidation triggers on new label tables. |
| src/backend/catalog/ag_catalog.c | Intercepts TRUNCATE to invalidate affected graph caches. |
| src/backend/age.c | Registers SHMEM hooks for PG<17 to enable shared invalidation state. |
| sql/age_main.sql | Registers the trigger function in the extension SQL. |
| regress/sql/age_global_graph.sql | Adds regression coverage for invalidation + thin-entry behavior. |
| regress/expected/age_global_graph.out | Adds expected output for the new regression cases. |
| age--1.7.0--y.y.y.sql | Upgrade template adds the trigger function for existing installs. |
Replace AGE's snapshot-based VLE cache invalidation with per-graph
monotonic version counters in shared memory. The old code compared
PostgreSQL's global xmin/xmax/curcid, causing false cache invalidation
whenever ANY transaction ran on the server — even unrelated ones. This
forced a full hash table rebuild (~138s at SF3) on every VLE query in
any multi-connection environment.
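To illustrate why a per-graph counter avoids the false invalidations described above, here is a minimal sketch in plain C11. It is not the PR's code: `graph_version`, `VleCache`, `bump_graph_version`, `build_cache`, and `cache_is_valid` are hypothetical names, and a process-local array stands in for the shared-memory counters.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_GRAPHS 8 /* hypothetical fixed capacity for this sketch */

/* One monotonic counter per graph. In the PR these live in shared memory;
 * here a static array stands in for that region. */
static _Atomic uint64_t graph_version[MAX_GRAPHS];

/* A backend-local cache remembers the counter value it was built against. */
typedef struct VleCache
{
    int      graph_slot;    /* which counter this cache tracks */
    uint64_t built_version; /* counter value at build time */
    bool     built;         /* false until build_cache() has run */
} VleCache;

/* Called by any mutation of the graph in the given slot. */
static void bump_graph_version(int slot)
{
    assert(slot >= 0 && slot < MAX_GRAPHS);
    atomic_fetch_add(&graph_version[slot], 1);
}

/* Record the version the cache contents correspond to. */
static void build_cache(VleCache *c, int slot)
{
    assert(c != NULL && slot >= 0 && slot < MAX_GRAPHS);
    c->graph_slot = slot;
    c->built_version = atomic_load(&graph_version[slot]);
    c->built = true;
}

/* The cache is reusable only if its own graph has not changed since build. */
static bool cache_is_valid(const VleCache *c)
{
    assert(c != NULL);
    return c->built &&
           c->built_version == atomic_load(&graph_version[c->graph_slot]);
}
```

In this sketch, bumping the counter for slot 1 leaves a cache built for slot 0 valid, while bumping slot 0 invalidates it. A snapshot comparison cannot make that distinction, because xmin/xmax/curcid move with every transaction on the server.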
The fix uses three invalidation paths with automatic detection:
- DSM (PG 17+): GetNamedDSMSegment; works without shared_preload_libraries
- SHMEM (PG <17): shmem_request/startup hooks; needs shared_preload_libraries.
  These functions are conditionally compiled via #if PG_VERSION_NUM < 170000
- SNAPSHOT: fallback to the original behavior when shared memory is unavailable
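The detection order in the list above can be sketched as a small decision function. This is a simplification under stated assumptions: `InvalidationMode`, `choose_invalidation_mode`, and the `preloaded` flag are hypothetical, and the sketch treats "shared memory unavailable" as "PG <17 without shared_preload_libraries". The actual code may also fall back in other situations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three paths named in the description. */
typedef enum InvalidationMode
{
    MODE_DSM,     /* PG 17+: named DSM segment, no preload required */
    MODE_SHMEM,   /* PG <17: shmem hooks, requires shared_preload_libraries */
    MODE_SNAPSHOT /* fallback: original snapshot comparison */
} InvalidationMode;

/*
 * pg_version_num uses the PG_VERSION_NUM encoding (e.g. 170000 for 17.0).
 * preloaded says whether the extension was loaded at server start, which is
 * the only time the shmem hooks can run (an assumption of this sketch).
 */
static InvalidationMode choose_invalidation_mode(int pg_version_num,
                                                 bool preloaded)
{
    assert(pg_version_num > 0);

    if (pg_version_num >= 170000)
        return MODE_DSM;
    if (preloaded)
        return MODE_SHMEM;
    return MODE_SNAPSHOT;
}
```

Under these assumptions, the preload setting only matters on PG <17, where it decides between the SHMEM path and the snapshot fallback.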
Version counter increment points:
- Cypher CREATE/DELETE/SET/MERGE via executor hooks
- SQL INSERT/UPDATE/DELETE via auto-installed per-table triggers
- TRUNCATE via ProcessUtility hook interception
New slot allocation in the version counter array uses pg_write_barrier()
before incrementing num_entries to ensure entry visibility on
weak memory-ordering architectures (e.g., ARM).
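The ordering requirement can be sketched with C11 atomics, using a release store as a stand-in for the pg_write_barrier() followed by the num_entries increment. The types and the `allocate_slot`/`find_slot` helpers are hypothetical. The sketch also assumes that slot allocation is serialized by a lock and that readers pair the writer's barrier with an acquire load, neither of which the description states.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 16 /* hypothetical capacity */

typedef struct VersionEntry
{
    uint32_t         graph_oid; /* identifies the graph */
    _Atomic uint64_t version;   /* monotonic per-graph counter */
} VersionEntry;

typedef struct VersionTable
{
    _Atomic int  num_entries; /* count of fully initialized entries */
    VersionEntry entries[MAX_ENTRIES];
} VersionTable;

/*
 * Writer side. Assumes the caller holds a lock that serializes allocation.
 * The entry is filled in first; the release store then publishes it, so no
 * reader can observe the new count before the entry's contents.
 */
static int allocate_slot(VersionTable *t, uint32_t graph_oid)
{
    int n;

    assert(t != NULL);
    n = atomic_load_explicit(&t->num_entries, memory_order_relaxed);
    if (n >= MAX_ENTRIES)
        return -1; /* table full */

    t->entries[n].graph_oid = graph_oid;
    atomic_store_explicit(&t->entries[n].version, 0, memory_order_relaxed);

    /* Plays the role of pg_write_barrier() + num_entries++ in this sketch. */
    atomic_store_explicit(&t->num_entries, n + 1, memory_order_release);
    return n;
}

/* Reader side: the acquire load pairs with the writer's release store. */
static int find_slot(VersionTable *t, uint32_t graph_oid)
{
    int n;

    assert(t != NULL);
    n = atomic_load_explicit(&t->num_entries, memory_order_acquire);
    for (int i = 0; i < n; i++)
    {
        if (t->entries[i].graph_oid == graph_oid)
            return i;
    }
    return -1; /* not registered yet */
}
```

Without the barrier, a weakly ordered CPU could make the incremented count visible before the entry's fields, and a concurrent reader scanning up to num_entries could match against an uninitialized slot.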
Additional optimizations:
- Thin entries: vertex/edge hash table entries store a 6-byte TID instead of
  a copied property Datum; properties are fetched on demand via heap_fetch
  only during result construction. Reduces hash table memory by ~77%.
- Fast path in is_an_edge_match: skips property access for label-only VLE
  patterns (e.g., [:KNOWS*1..2]). When property constraints are present,
  edge properties are fetched once and cached locally to avoid duplicate
  heap access.
- Defensive elog(ERROR) on a stale TID in the lazy property fetch, to catch
  invalidation logic bugs.
- Trigger install is conditional: it checks whether the trigger function
  exists in the catalog before attempting installation, ensuring backward
  compatibility with older extension SQL versions.
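The thin-entry idea can be sketched outside PostgreSQL as follows. `Tid` mirrors the 6-byte layout of PostgreSQL's ItemPointerData (a block number in two 16-bit halves plus a 16-bit offset). `ThinEdge`, `mock_heap`, and `fetch_properties` are hypothetical stand-ins for the PR's hash entries and its heap_fetch call. Where the PR raises elog(ERROR) on a stale TID, this sketch returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 6-byte tuple identifier: block number (hi/lo halves) + 1-based offset. */
typedef struct Tid
{
    uint16_t block_hi;
    uint16_t block_lo;
    uint16_t offset;
} Tid;

_Static_assert(sizeof(Tid) == 6, "Tid is expected to occupy 6 bytes");

/* Thin entry: an id plus a pointer-like TID, with no copied properties. */
typedef struct ThinEdge
{
    int64_t edge_id;
    Tid     tid;
} ThinEdge;

/* A tiny in-memory "heap" standing in for the label table's pages. */
#define NUM_BLOCKS       2
#define TUPLES_PER_BLOCK 4

static const char *const mock_heap[NUM_BLOCKS][TUPLES_PER_BLOCK] = {
    {"{\"weight\": 1}", "{\"weight\": 2}", "{\"weight\": 3}", "{\"weight\": 4}"},
    {"{\"weight\": 5}", "{\"weight\": 6}", "{\"weight\": 7}", "{\"weight\": 8}"},
};

/*
 * Lazy fetch: resolve the TID only when properties are actually needed.
 * Returns NULL if the TID does not point at a live slot (stale entry).
 */
static const char *fetch_properties(const ThinEdge *e)
{
    uint32_t block;

    assert(e != NULL);
    block = ((uint32_t) e->tid.block_hi << 16) | e->tid.block_lo;

    if (block >= NUM_BLOCKS ||
        e->tid.offset == 0 || e->tid.offset > TUPLES_PER_BLOCK)
        return NULL;

    return mock_heap[block][e->tid.offset - 1];
}
```

The entry size no longer depends on how large a vertex's or edge's properties are, which is where the memory saving comes from. The cost is a heap access per fetched property set, which is why the fast path above skips that access for label-only patterns.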
Test results (LDBC SNB benchmark, SF3: 52.7M edges, 9.3M vertices):
  Production simulation (VLE with concurrent background transactions):
    Before: 177,188 ms avg per query (full rebuild every time)
    After: 15.7 ms avg per query (cache hit)
    Speedup: 11,299x
  Cold build time:
    Before: 186,275 ms
    After: 108,955 ms (41% faster, no datumCopy)
  LDBC IC1 warm (3-hop VLE, single session):
    Before: 219,385 ms
    After: 175,249 ms (20% faster, better cache utilization)
  Hash table memory (SF3):
    Before: ~9 GB
    After: ~2.1 GB (77% reduction)
New regression tests in age_global_graph.sql verify:
- VLE cache invalidation after CREATE (path extends)
- VLE cache invalidation after DELETE (path shrinks)
- VLE cache invalidation after SET (property updated via lazy fetch)
- VLE edge property fetch via full path return (weight values in path)
- VLE edge property fetch via UNWIND + relationships() (individual weights)
Regression tests: 32/32 pass
Files changed (14):
src/backend/age.c — shmem hook registration (PG <17)
src/backend/catalog/ag_catalog.c — TRUNCATE interception
src/backend/commands/label_commands.c — conditional trigger auto-install on label creation
src/backend/executor/cypher_create.c — increment_graph_version after CREATE
src/backend/executor/cypher_delete.c — increment_graph_version after DELETE
src/backend/executor/cypher_merge.c — increment_graph_version after MERGE
src/backend/executor/cypher_set.c — increment_graph_version after SET
src/backend/utils/adt/age_global_graph.c — version counter, thin entries, trigger fn, lazy fetch
src/backend/utils/adt/age_vle.c — is_an_edge_match fast path, cached edge property fetch
src/include/utils/age_global_graph.h — conditional declarations
sql/age_main.sql — trigger function registration for next-version SQL
regress/sql/age_global_graph.sql — VLE cache regression tests
regress/expected/age_global_graph.out — expected output for new tests
age--1.7.0--y.y.y.sql — upgrade template: trigger function for existing installs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>