Fix possible memory and file descriptors leaks #2258

Merged: MuhammadTahaNaveed merged 2 commits into apache:master from ZigzagAK:bugfix/possible-memory-leaks on Dec 9, 2025
Contributor

ZigzagAK commented Dec 3, 2025

Please review this patch.

Contributor

jrgemignani left a comment

See comments. Looks good otherwise.

ZigzagAK force-pushed the bugfix/possible-memory-leaks branch from 20683d4 to f174545 on December 4, 2025 08:36
ZigzagAK requested a review from jrgemignani on December 4, 2025 08:37
Contributor

jrgemignani left a comment

@ZigzagAK There was an update to the master branch that also had changes to age load. Could you rebase this PR off of the latest master, please :) Otherwise, everything looks good.

Contributor Author

ZigzagAK commented Dec 5, 2025

> @ZigzagAK There was an update to the master branch that also had changes to age load. Could you rebase this PR off of the latest master, please :) Otherwise, everything looks good.

Already rebased.

Member

MuhammadTahaNaveed left a comment

looks good to me

MuhammadTahaNaveed merged commit 0ea9464 into apache:master on Dec 9, 2025
7 checks passed
jrgemignani pushed a commit to jrgemignani/age that referenced this pull request Dec 16, 2025
- Used postgres memory allocation functions instead of standard ones.
- Wrapped main loop of csv loader in PG_TRY block for better error handling.
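
The two points in this commit message can be illustrated with a small standalone sketch. In PostgreSQL, `ereport(ERROR, ...)` unwinds the stack with a `longjmp`, so a `malloc` buffer or an open file held across the failing call is leaked unless cleanup code runs first. Memory from `palloc` is instead released together with its memory context, and `PG_TRY`/`PG_CATCH` give a loader a place to close files before the error propagates. The code below is not the patch itself: it only emulates that control flow in plain C with `setjmp`/`longjmp`, and `raise_error`, `process_row`, `load_rows`, and `live_resources` are hypothetical names chosen for illustration.

```c
#include <setjmp.h>
#include <stdlib.h>

/*
 * Hypothetical stand-ins. Real PostgreSQL code would use PG_TRY/PG_CATCH
 * and ereport(ERROR, ...) rather than these helpers.
 */
static jmp_buf error_jump;   /* plays the role of the exception stack */
int live_resources = 0;      /* resources acquired but not yet released */

/* Emulates an error raised deep inside the row-processing code. */
static void raise_error(void)
{
    longjmp(error_jump, 1);
}

/* Pretend to parse one CSV row; fail when row equals bad_row. */
static void process_row(int row, int bad_row)
{
    if (row == bad_row)
        raise_error();
}

/* Returns 0 on success, -1 if a row failed. Never leaks the buffer. */
int load_rows(int nrows, int bad_row)
{
    /* Assigned before setjmp and never modified afterwards, so its
     * value is still valid after a longjmp back into this function. */
    char *buffer = malloc(64);

    if (buffer == NULL)
        return -1;
    live_resources++;

    if (setjmp(error_jump) == 0)
    {
        /* "PG_TRY" section: the main loop that may raise an error. */
        for (int row = 0; row < nrows; row++)
            process_row(row, bad_row);
    }
    else
    {
        /* "PG_CATCH" section: release resources on the error path.
         * Real code would typically re-throw with PG_RE_THROW(). */
        free(buffer);
        live_resources--;
        return -1;
    }

    /* Normal path: release the same resources. */
    free(buffer);
    live_resources--;
    return 0;
}
```

Calling `load_rows(5, 2)` takes the error path and `load_rows(5, -1)` the normal path. In both cases `live_resources` returns to zero, which is the property the wrapped loader loop is meant to guarantee.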
MuhammadTahaNaveed pushed a commit that referenced this pull request Dec 16, 2025
jrgemignani pushed a commit to jrgemignani/age that referenced this pull request Jan 30, 2026
MuhammadTahaNaveed pushed a commit that referenced this pull request Feb 3, 2026
MuhammadTahaNaveed pushed a commit to MuhammadTahaNaveed/age that referenced this pull request Apr 8, 2026
jrgemignani added a commit that referenced this pull request Apr 8, 2026
* Add index on id columns (#2117)

- Whenever a label is created, indices on its id columns are created
  by default. For a vertex label, a unique index on the id column is
  created, which also serves as a unique constraint. For an edge label,
  a non-unique index on the start_id and end_id columns is created.

- This change is expected to improve the performance of queries that
  involve joins. In some performance tests, query performance
  improved a lot.

- The loader was updated to insert tuples into the indices as well.
  This slows the loader down a bit, but it was necessary.

- A bug related to command ids in the cypher_delete executor was also fixed.

* Fix possible memory and file descriptors leaks (#2258)

- Used postgres memory allocation functions instead of standard ones.
- Wrapped main loop of csv loader in PG_TRY block for better error handling.

* Restrict age_load commands (#2274)

This PR applies restrictions to the following age_load commands -

    load_labels_from_file()
    load_edges_from_file()

They are now tied to a specific root directory and are required to have a
specific file extension to eliminate any attempts to force them to access
any other files.

Nothing else has changed with the actual command formats or parameters,
only that they work out of the /tmp/age directory and only access files
with an extension of .csv.

Added regression tests and updated the location of the csv files for
those regression tests.

modified:   regress/expected/age_load.out
modified:   regress/sql/age_load.sql
modified:   src/backend/utils/load/age_load.c

* Fix and improve index.sql regression test coverage (#2300)

NOTE: This PR was created with AI tools and a human.

- Remove unused copy command (leftover from deleted agload_test_graph test)
- Replace broken Section 4 that referenced non-existent graph with
  comprehensive WHERE clause tests covering string, int, bool, and float
  properties with AND/OR/NOT operators
- Add EXPLAIN tests to verify index usage:
  - Section 3: Validate GIN indices (load_city_gin_idx, load_country_gin_idx)
    show Bitmap Index Scan for property matching
  - Section 4: Validate all expression indices (city_country_code_idx,
    city_id_idx, city_west_coast_idx, country_life_exp_idx) show Index Scan
    for WHERE clause filtering

All indices now have EXPLAIN verification confirming they are used as expected.

modified:   regress/expected/index.out
modified:   regress/sql/index.sql

* Fix and improve index.sql addendum (#2301)

NOTE: This PR was created with the help of AI tools and a human.

Added additional requested regression tests -

 * EXPLAIN for pattern with WHERE clause

 * EXPLAIN for pattern with filters on both country and city

modified:   regress/expected/index.out
modified:   regress/sql/index.sql

* Replace libcsv with pg COPY for csv loading (#2310)

- Commit also adds permission checks
- Resolves a critical memory spike issue when loading large files
- Use pg's COPY infrastructure (BeginCopyFrom, NextCopyFromRawFields)
  for 64KB buffered CSV parsing instead of libcsv
- Add byte based flush threshold (64KB) matching COPY behavior for memory safety
- Use heap_multi_insert with BulkInsertState for optimized batch inserts
- Add per batch memory context to prevent memory growth during large loads
- Remove libcsv dependency (libcsv.c, csv.h)
- Improves loading performance by 15-20%
- No previous regression tests were impacted
- Added regression tests for permissions/rls
Assisted-by AI

---------

Co-authored-by: Aleksey Konovkin <alkon2000@mail.ru>
Co-authored-by: John Gemignani <jrgemignani@gmail.com>