Gerd Heber edited this page Mar 19, 2026 · 1 revision

Phase 0 should add rank discipline to the existing H5TS substrate, not attempt a fine-grained locking redesign yet.

The Phase 0 rank map

| Rank | Meaning | Typical use | Rule |
|------|---------|-------------|------|
| API_GATE | Current public API gate | Existing api_mutex / api_lock | Outermost only |
| GLOBAL_REGISTRY | Process-wide registries | H5I tables, plugin/VOL/VFD registries, global free-list bookkeeping | No file state under it |
| HELPER_OBJECT | Shared helper objects | H5P/H5S/H5T object instances | Snapshot and release before VOL/native/file work |
| FILE_REGISTRY | File identity / reopen resolution | Same-file detection, shared-open bookkeeping | Separate from per-file lock |
| FILE | Per-file mutable state | H5F_t / H5F_shared, per-open native/VOL state | Outermost file-local lock |
| CACHE_MANAGER | Cache-wide structures | Metadata cache, page buffer, raw chunk-cache manager | No callbacks under it |
| CACHE_ENTRY | One entry / chunk / local node | Individual metadata entry or chunk slot | No second same-rank lock in Phase 0 |
| VFD_IO | Driver-instance / raw-I/O lock | VFD handle state, if a lock is unavoidable | Prefer no cache-entry lock when entering |
| LEAF_SPIN | Tiny hot locks | Spinlocks, atomics fallback, runtime micro-locks | No blocking, no callbacks, no further HDF5 locks |

Mapping current HDF5 concepts to those ranks

API_GATE is the current top-level HDF5 API lock.

GLOBAL_REGISTRY is where process-wide tables and registries live: H5I global tables, plugin/VOL/VFD registration state, and other library-global registries. Keep this separate from file state so that later work can reason cleanly about “global lookup” versus “specific file mutation.” H5VLprivate.h already distinguishes VOL-managed objects from H5I-managed non-VOL objects, which is a good clue that this boundary should stay explicit.

HELPER_OBJECT is where shared H5P/H5S/H5T object locks would live, but only with a strict snapshot-only rule: lock, validate/copy the needed state into an operation-local context, then unlock before entering VOL/native/file paths. That rule matters because H5VLprivate.h explicitly says property lists and dataspaces are handled through H5I rather than VOL, and non-default DXPL/LAPL objects are used directly rather than copied. If we allow helper-object locks to stay live while descending into file/native code, we are asking for cycles.

FILE_REGISTRY deserves its own rank because multiple opens of the same physical file may not be detectable on some file systems, and if they are not detected, coherent access may not be maintained. That means “file identity / reopen bookkeeping” is a real architectural concern, not a cosmetic one. Keep that bookkeeping outside the per-file lock.

FILE is the rank for H5F_t / H5F_shared mutable state. H5Fprivate.h exposes a lot of file-shared fields through F->shared, including raw-data-cache configuration, page-end metadata thresholds, file-locking state, and the file’s vol_obj. That is exactly the kind of state that wants one clear per-file outer lock class.

CACHE_MANAGER and CACHE_ENTRY should be separate even in Phase 0. H5ACprivate.h enumerates many different metadata cache entry kinds, which is a good signal that “cache-wide structures” and “one specific cached thing” are different lock domains. We should not collapse them into one rank.

VFD_IO is intentionally deeper than file/cache ranks, but the real rule is stronger: prefer not to hold cache-entry locks across raw I/O at all. The rank exists as a safety net, not as permission to build a deep, blocking call tree.

LEAF_SPIN is for tiny runtime locks only. H5TSprivate.h already has spinlock support and other runtime threading primitives, but these should stay terminal: once held, they should not call back into the library, block on condition variables, or acquire higher-level locks.

Two important non-ranks

Leave H5E and H5CX out of the rank graph. The error-stack state is per-thread, and each thread has its own thread-local API-context stack. That is exactly the kind of state we do not want to drag into a lock hierarchy unless forced to.

Do not create a separate “lifecycle” rank in Phase 0. Currently, applications must join all HDF5-using threads before H5close or process exit. That means shutdown is still outside the normal concurrency contract, so Phase 0 should keep H5close on a serialized path under the API gate rather than pretending lifecycle is solved by one more rank.

Rules

Acquire locks only in increasing rank order. Release in reverse order.

Do not allow two different locks of the same rank to nest in Phase 0. If that ever becomes necessary later, add a dedicated multi-lock helper that sorts by a deterministic secondary key such as file address or object ID. Do not normalize arbitrary same-rank nesting.

Treat shared vs exclusive mode as the same rank. Lock mode changes semantics, not order.

Do not hold HELPER_OBJECT locks across VOL/native/file work. Snapshot and release.

Do not hold CACHE_MANAGER, CACHE_ENTRY, or VFD_IO across user code, plugin code, filters, or connector callbacks unless the call path uses a designated prepare/restore shim and has been explicitly reviewed.

Keep the API gate in the rank map even though it is transitional. Rows in the concurrency matrix that are still serialized will just use API_GATE; rows promoted later can stop taking it and rely on lower ranks without rewriting the whole annotation scheme.

VOL

Give VOL its own rank in Phase 0?

That sounds attractive, but it is false precision. H5VLprivate.h explicitly separates VOL-managed objects from H5I-managed non-VOL objects like property lists and dataspaces, and H5Fprivate.h exposes the file’s vol_obj directly as part of file state. In Phase 0, connector-class registration belongs in GLOBAL_REGISTRY, and per-open connector/native state should ride under FILE. A standalone VOL rank is more likely to create accidental cycles than to clarify the design at this stage.

Discipline

So the short version of the actual Phase 0 annotations is:

- add REQUIRES / EXCLUDES macros,
- define one small rank enum,
- wrap real locks with rank metadata,
- keep a per-thread held-rank stack,
- forbid same-rank nesting,
- keep helper-object locks snapshot-only,
- keep callbacks outside the core lock graph,
- and keep shutdown outside the concurrency contract for now.

That is enough to make later lock-boundary moves more disciplined rather than improvised.

Details

Annotation macros

```c
#define H5TS_REQUIRES(...)        H5_ATTR_THREAD_ANNOT(requires_capability(__VA_ARGS__))
#define H5TS_REQUIRES_SHARED(...) H5_ATTR_THREAD_ANNOT(requires_shared_capability(__VA_ARGS__))
#define H5TS_EXCLUDES(...)        H5_ATTR_THREAD_ANNOT(locks_excluded(__VA_ARGS__))
```

Ranking

```c
typedef enum H5TS_lock_rank_t {
    H5TS_RANK_NONE            = 0,

    /* Transitional outer gate */
    H5TS_RANK_API_GATE        = 100,

    /* Process-wide shared state */
    H5TS_RANK_GLOBAL_REGISTRY = 200,

    /* Shared helper objects: snapshot only */
    H5TS_RANK_HELPER_OBJECT   = 300,

    /* File identity / reopen bookkeeping */
    H5TS_RANK_FILE_REGISTRY   = 400,

    /* Per-file mutable state */
    H5TS_RANK_FILE            = 500,

    /* Cache-wide manager state */
    H5TS_RANK_CACHE_MANAGER   = 600,

    /* One metadata/chunk/object-local entry */
    H5TS_RANK_CACHE_ENTRY     = 700,

    /* Driver-instance/raw-I/O serialization, if unavoidable */
    H5TS_RANK_VFD_IO          = 800,

    /* Tiny leaf locks only */
    H5TS_RANK_LEAF_SPIN       = 900
} H5TS_lock_rank_t;
```

Per-thread held-lock stack

```c
typedef struct H5TS_ranked_rwlock_t {
    H5TS_rwlock_t impl;
#ifdef H5TS_DEBUG
    H5TS_lock_rank_t rank;
    const char *name;
#endif
} H5TS_ranked_rwlock_t;
```

At acquire time, the wrapper asserts that the new lock’s rank is greater than the top of the current thread’s stack; at release time, it asserts reverse-order release. In optimized builds, the rank fields can compile away.

In actual code...

```c
typedef struct H5F_shared_t {
    H5TS_ranked_rwlock_t file_lock;      /* rank = FILE */
    H5TS_ranked_rwlock_t mdc_lock;       /* rank = CACHE_MANAGER */
    ...
} H5F_shared_t;

static herr_t
H5D__preflight_read(H5F_shared_t *fsh, hid_t dxpl_id)
    H5TS_REQUIRES_SHARED(fsh->file_lock)
{
    ...
}
```

Helper objects

```c
static herr_t
H5P__snapshot_dxpl(hid_t dxpl_id, H5D_dxpl_snapshot_t *snap)
{
    /* acquire HELPER_OBJECT lock on the plist */
    /* copy needed fields into snap */
    /* release before any VOL/native/file work */
}
```

Callback boundary

```c
H5TS_ASSERT_NO_CORE_LOCKS();
H5TS_user_cb_prepare();
ret = (*user_cb)(...);
H5TS_user_cb_restore();
```

H5TS already has user-callback prepare/restore hooks in the concurrency build, and the developer support API already exposes explicit release/reacquire of the top-level HDF5 lock.
