Skip to content

Python driver: Add skip_load parameter to skip LOAD 'age' statement#2366

Open
uesleilima wants to merge 4 commits intoapache:masterfrom
uesleilima:master
Open

Python driver: Add skip_load parameter to skip LOAD 'age' statement#2366
uesleilima wants to merge 4 commits intoapache:masterfrom
uesleilima:master

Conversation

@uesleilima
Copy link
Copy Markdown

Fixes #2353

Problem

setUpAge() and Age.connect() unconditionally execute LOAD 'age' (or LOAD '$libdir/plugins/age'). On managed PostgreSQL services like Azure Database for PostgreSQL, the AGE extension is loaded server-side via shared_preload_libraries and the binary is not accessible at the file path expected by LOAD, causing a psycopg.errors.UndefinedFile error. There is no way to use the driver in these environments without bypassing the setup entirely.

Solution

Add a skip_load parameter (default False) to setUpAge(), Age.connect(), and the module-level age.connect(). When True, the LOAD statement is skipped while all other setup is preserved:

  • SET search_path
  • agtype adapter registration
  • graph existence check / creation

This is fully backward-compatible — existing code is unaffected.

Usage

import age

ag = age.connect(
    dsn="host=myserver.postgres.database.azure.com ...",
    graph="my_graph",
    skip_load=True
)

- Add 4 unit tests for setUpAge() skip_load behavior:
  skip_load=True skips LOAD, skip_load=False executes LOAD,
  load_from_plugins integration, and search_path always set.
- Document skip_load in README under new "Managed PostgreSQL Usage"
  section for Azure/AWS RDS/etc. environments.
- Fix syntax in existing load_from_plugins code example.

Made-with: Cursor
Copy link
Copy Markdown

@tcolo tcolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uesleilima thanks a a lot for the PR - please addresse the issue sin the Claude Code Check.


Code Review

Summary

This PR addresses a real and valid problem — managed PostgreSQL environments (Azure, AWS RDS) where LOAD 'age' fails because the extension is pre-loaded via shared_preload_libraries. The implementation is backward-compatible and test coverage is solid.

However, there's a significant design conflict with the already-merged configure_connection() API that should be resolved before merging.


🔴 Critical

skip_load may leak into psycopg.connect() via **kwargs (age/__init__.py)

The current module-level connect() signature is:

def connect(dsn=None, graph=None, connection_factory=None, cursor_factory=ClientCursor, load_from_plugins=False, **kwargs):
    ag.connect(dsn=dsn, ..., load_from_plugins=load_from_plugins, **kwargs)

If skip_load is not explicitly named in this signature, it falls through **kwargsAge.connect(**kwargs)psycopg.connect(**kwargs), which will raise:

ProgrammingError: invalid dsn parameter 'skip_load'

skip_load must be extracted explicitly, the same way load_from_plugins is.


🟡 Suggestions

1. API duplication with configure_connection() (age/age.pysetUpAge())

The already-merged configure_connection(load=False) (see commits 7d64707e, 2a8a7c2d) solves the exact same problem — skipping LOAD 'age' on managed services — with a safer opt-in default. This PR introduces a parallel API for the same use case.

2. Inverted semantics between the two APIs

Function Pattern Safe default
configure_connection() load=False (opt-in) ✅ skips LOAD by default
setUpAge() (this PR) skip_load=False (opt-out) ⚠️ executes LOAD by default

A user reading both APIs won't know which to use for managed services, and the meaning of "safe default" is different for each. If both must coexist, the semantics should be aligned and the distinction documented clearly.

3. Undocumented interaction between skip_load=True and load_from_plugins=True

When both flags are set, LOAD is silently skipped — which is probably correct, but surprising. The docstring should document the precedence, or a ValueError should be raised for the contradictory combination.

4. Call chain not covered by tests

The unit tests mock setUpAge() in isolation, but the path age.connect(skip_load=True)Age.connect()setUpAge() isn't tested end-to-end. A test verifying the parameter is forwarded correctly through the full call chain would catch regressions like the **kwargs leak described above.

5. README should mention configure_connection() as an alternative

The new README section shows age.connect(skip_load=True) as the solution for managed environments. Since configure_connection() also exists and is better suited for pool-based setups, the docs should mention both and clarify the trade-offs.


✅ What Looks Good

  • Fully backward-compatible (skip_load=False default preserves existing behavior)
  • Search path and adapter registration are correctly preserved when skip_load=True
  • Unit tests use mocking cleanly — no live DB required
  • 5 test methods give solid coverage of setUpAge() in isolation
  • The problem (UndefinedFile on Azure PostgreSQL) is well-documented in the PR description

Verdict: Request Changes

The **kwargs leak is a likely runtime error that needs to be addressed. Beyond that, the team should align on whether skip_load on setUpAge() is needed given configure_connection(load=False) already exists, or document a clear distinction between the two. Merging both without resolving this will leave users with two confusingly similar solutions with opposite flag polarities.

- Raise ValueError when skip_load=True and load_from_plugins=True are
  both set (contradictory combination)
- Add end-to-end test verifying skip_load is forwarded through the full
  age.connect() → Age.connect() → setUpAge() call chain
- Replace fragile string assertions with assert_called_with/assert_any_call
- README: mention configure_connection() as the pool-based alternative
  for managed PostgreSQL environments

Made-with: Cursor
@uesleilima uesleilima requested a review from tcolo April 5, 2026 23:01
@MuhammadTahaNaveed MuhammadTahaNaveed requested review from Copilot and removed request for tcolo April 6, 2026 10:30
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an opt-out for executing LOAD 'age' during Python driver connection/setup, enabling use on managed PostgreSQL services (e.g., Azure) where AGE is preloaded via shared_preload_libraries and LOAD fails.

Changes:

  • Add skip_load parameter to setUpAge(), Age.connect(), and module-level age.connect() to bypass LOAD while keeping the rest of setup.
  • Add unit tests validating skip_load behavior and argument forwarding.
  • Update Python driver README with managed-Postgres usage guidance.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
drivers/python/age/age.py Introduces skip_load behavior in setUpAge() and forwards it through Age.connect().
drivers/python/age/init.py Exposes skip_load via module-level connect().
drivers/python/test_age_py.py Adds unit tests for skip_load behavior without requiring a DB connection.
drivers/python/README.md Documents skip_load for managed PostgreSQL environments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +107 to +111
use ```configure_connection()``` instead, which skips ```LOAD 'age'``` by default and is designed for
per-connection setup callbacks:
```python
from age import configure_connection
configure_connection(conn, graph_name='graph_name')
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

README references configure_connection() (and claims it skips LOAD 'age' by default), but there is no configure_connection symbol anywhere under drivers/python. This will mislead users; either implement/export configure_connection or remove/update this section to document the supported approach (e.g., calling setUpAge(..., skip_load=True) on each pooled connection).

Suggested change
use ```configure_connection()``` instead, which skips ```LOAD 'age'``` by default and is designed for
per-connection setup callbacks:
```python
from age import configure_connection
configure_connection(conn, graph_name='graph_name')
run AGE setup on each newly created connection and pass ```skip_load=True``` so the driver skips
```LOAD 'age'``` while still performing the remaining setup:
```python
from age import setUpAge
setUpAge(conn, 'graph_name', skip_load=True)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — removed the configure_connection() reference since it's not part of this branch. The README now shows setUpAge(conn, 'graph_name', skip_load=True) for the connection pool example.

# Connect to PostgreSQL Server and establish session and type extension environment.
def connect(self, graph:str=None, dsn:str=None, connection_factory=None, cursor_factory=ClientCursor,
load_from_plugins:bool=False, **kwargs):
load_from_plugins:bool=False, skip_load:bool=False, **kwargs):
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

connection_factory is accepted and forwarded through the public API, but it is never used: Age.connect() ignores it when calling psycopg.connect(...). Either pass it through (if supported by psycopg) or remove/deprecate it to avoid a misleading no-op parameter.

Suggested change
load_from_plugins:bool=False, skip_load:bool=False, **kwargs):
load_from_plugins:bool=False, skip_load:bool=False, **kwargs):
if connection_factory is not None:
raise TypeError(
"connection_factory is not supported by Age.connect(); "
"use psycopg.connect() options supported by this driver instead."
)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

connection_factory is a pre-existing parameter in the codebase (not introduced by this PR). Addressing it would be scope creep — happy to open a separate issue if needed.

configure_connection() is not part of this PR; use the available
setUpAge() API with skip_load=True for the connection pool example.

Made-with: Cursor
@uesleilima
Copy link
Copy Markdown
Author

All review feedback has been addressed:

tcolo's review:

  • skip_load is already explicitly named in __init__.py:connect() and Age.connect() — no kwargs leak.
  • ✅ Added ValueError for contradictory skip_load=True + load_from_plugins=True.
  • ✅ Added end-to-end test verifying skip_load is forwarded through the full age.connect() → Age.connect() → setUpAge() call chain.
  • ✅ README now mentions setUpAge(skip_load=True) for pool-based setups.
  • ✅ Replaced fragile string assertions with assert_called_with / assert_any_call.

Copilot review:

  • ✅ Fixed README: removed configure_connection() reference (not part of this branch); now uses setUpAge(conn, 'graph_name', skip_load=True) for the pool example.
  • ⚠️ connection_factory being unused is a pre-existing issue, out of scope for this PR.

@uesleilima uesleilima requested a review from tcolo April 6, 2026 20:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants