Lock Ranking
Phase 0 should add rank discipline to the existing H5TS substrate, not attempt a fine-grained locking redesign yet.
| Rank | Meaning | Typical use | Rule |
|---|---|---|---|
| API_GATE | Current public API gate | Existing api_mutex / api_lock | Outermost only |
| GLOBAL_REGISTRY | Process-wide registries | H5I tables, plugin/VOL/VFD registries, global free-list bookkeeping | No file state under it |
| HELPER_OBJECT | Shared helper objects | H5P/H5S/H5T object instances | Snapshot and release before VOL/native/file work |
| FILE_REGISTRY | File identity / reopen resolution | Same-file detection, shared-open bookkeeping | Separate from per-file lock |
| FILE | Per-file mutable state | H5F_t / H5F_shared, per-open native/VOL state | Outermost file-local lock |
| CACHE_MANAGER | Cache-wide structures | Metadata cache, page buffer, raw chunk-cache manager | No callbacks under it |
| CACHE_ENTRY | One entry / chunk / local node | Individual metadata entry or chunk slot | No second same-rank lock in Phase 0 |
| VFD_IO | Driver-instance/raw-I/O lock | VFD handle state if a lock is unavoidable | Prefer no cache-entry lock when entering |
| LEAF_SPIN | Tiny hot locks | Spinlocks, atomics fallback, runtime micro-locks | No blocking, no callbacks, no further HDF5 locks |
API_GATE is the current top-level HDF5 API lock.
GLOBAL_REGISTRY is where process-wide tables and registries live: H5I global tables, plugin/VOL/VFD registration state, and other library-global registries. Keep this separate from file state so that later work can reason cleanly about “global lookup” versus “specific file mutation.” H5VLprivate.h already distinguishes VOL-managed objects from H5I-managed non-VOL objects, which is a good clue that this boundary should stay explicit.
HELPER_OBJECT is where shared H5P/H5S/H5T object locks would live, but only with a strict snapshot-only rule: lock, validate/copy the needed state into an operation-local context, then unlock before entering VOL/native/file paths. That rule matters because H5VLprivate.h explicitly says property lists and dataspaces are handled through H5I rather than VOL, and non-default DXPL/LAPL objects are used directly rather than copied. If we allow helper-object locks to stay live while descending into file/native code, we are asking for cycles.
FILE_REGISTRY deserves its own rank because multiple opens of the same physical file may not be detectable on some file systems, and if they are not detected, coherent access may not be maintained. That means “file identity / reopen bookkeeping” is a real architectural concern, not a cosmetic one. Keep that bookkeeping outside the per-file lock.
FILE is the rank for H5F_t / H5F_shared mutable state. H5Fprivate.h exposes a lot of file-shared fields through F->shared, including raw-data-cache configuration, page-end metadata thresholds, file-locking state, and the file’s vol_obj. That is exactly the kind of state that wants one clear per-file outer lock class.
CACHE_MANAGER and CACHE_ENTRY should be separate even in Phase 0. H5ACprivate.h enumerates many different metadata cache entry kinds, which is a good signal that “cache-wide structures” and “one specific cached thing” are different lock domains. We should not collapse them into one rank.
VFD_IO is intentionally deeper than file/cache ranks, but the real rule is stronger: prefer not to hold cache-entry locks across raw I/O at all. The rank exists as a safety net, not as permission to build a deep, blocking call tree.
LEAF_SPIN is for tiny runtime locks only. H5TSprivate.h already has spinlock support and other runtime threading primitives, but these should stay terminal: once held, they should not call back into the library, block on condition variables, or acquire any other HDF5 lock.
Leave H5E and H5CX out of the rank graph. The error-stack state is per-thread, and each thread has its own thread-local API-context stack. That is exactly the kind of state we do not want to drag into a lock hierarchy unless forced to.
Do not create a separate “lifecycle” rank in Phase 0. Currently, applications must join all HDF5-using threads before H5close or process exit. That means shutdown is still outside the normal concurrency contract, so Phase 0 should keep H5close on a serialized path under the API gate rather than pretending lifecycle is solved by one more rank.
Acquire locks only in increasing rank order. Release in reverse order.
Do not allow two different locks of the same rank to nest in Phase 0. If that ever becomes necessary later, add a dedicated multi-lock helper that sorts by a deterministic secondary key such as file address or object ID. Do not normalize arbitrary same-rank nesting.
Treat shared vs exclusive mode as the same rank. Lock mode changes semantics, not order.
Do not hold HELPER_OBJECT locks across VOL/native/file work. Snapshot and release.
Do not hold CACHE_MANAGER, CACHE_ENTRY, or VFD_IO across user code, plugin code, filters, or connector callbacks unless the call path uses a designated prepare/restore shim and has been explicitly reviewed.
Keep the API gate in the rank map even though it is transitional. Rows in the concurrency matrix that are still serialized will just use API_GATE; rows promoted later can stop taking it and rely on lower ranks without rewriting the whole annotation scheme.
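For the same-rank rule above, this is roughly what a later multi-lock helper could look like if Phase 0's ban on same-rank nesting is ever relaxed. It is a sketch under assumptions, not part of the Phase 0 proposal: `demo_lock_pair` and `demo_unlock_pair` are hypothetical names, plain pthread mutexes stand in for ranked locks, and the lock's address is used as the deterministic secondary key.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical helper: acquire two same-rank locks in address order so
 * that every thread takes any given pair in the same sequence.
 * Returns 0 on success, -1 on failure (including a == b). */
static int
demo_lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_t *first  = a;
    pthread_mutex_t *second = b;

    if (a == b)
        return -1; /* same lock twice would self-deadlock */

    /* Compare as integers; relational comparison of unrelated pointers
     * is not well defined in C. */
    if ((uintptr_t)a > (uintptr_t)b) {
        first  = b;
        second = a;
    }

    if (pthread_mutex_lock(first) != 0)
        return -1;
    if (pthread_mutex_lock(second) != 0) {
        pthread_mutex_unlock(first);
        return -1;
    }
    return 0;
}

/* Release in the reverse of the acquisition order. */
static void
demo_unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_unlock(a);
        pthread_mutex_unlock(b);
    } else {
        pthread_mutex_unlock(b);
        pthread_mutex_unlock(a);
    }
}
```

The caller's argument order does not matter: `demo_lock_pair(x, y)` and `demo_lock_pair(y, x)` acquire in the same sequence, which is what rules out the classic two-thread cycle.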
Give VOL its own rank in Phase 0?
That sounds attractive, but it is fake precision. H5VLprivate.h explicitly separates VOL-managed objects from H5I-managed non-VOL objects like property lists and dataspaces, and H5Fprivate.h exposes the file’s vol_obj directly as part of file state. In Phase 0, connector-class registration belongs in GLOBAL_REGISTRY, and per-open connector/native state should ride under FILE. A standalone VOL rank is more likely to create accidental cycles than to clarify the design at this stage.
So the short version of the actual Phase 0 annotations is:
- add REQUIRES/EXCLUDES macros,
- define one small rank enum,
- wrap real locks with rank metadata,
- keep a per-thread held-rank stack,
- forbid same-rank nesting,
- keep helper-object locks snapshot-only,
- keep callbacks outside the core lock graph,
- and keep shutdown outside the concurrency contract for now.
That is enough to make later lock-boundary moves more disciplined rather than improvised.
```c
#define H5TS_REQUIRES(...)        H5_ATTR_THREAD_ANNOT(requires_capability(__VA_ARGS__))
#define H5TS_REQUIRES_SHARED(...) H5_ATTR_THREAD_ANNOT(requires_shared_capability(__VA_ARGS__))
#define H5TS_EXCLUDES(...)        H5_ATTR_THREAD_ANNOT(locks_excluded(__VA_ARGS__))
```

```c
typedef enum H5TS_lock_rank_t {
    H5TS_RANK_NONE = 0,
    /* Transitional outer gate */
    H5TS_RANK_API_GATE = 100,
    /* Process-wide shared state */
    H5TS_RANK_GLOBAL_REGISTRY = 200,
    /* Shared helper objects: snapshot only */
    H5TS_RANK_HELPER_OBJECT = 300,
    /* File identity / reopen bookkeeping */
    H5TS_RANK_FILE_REGISTRY = 400,
    /* Per-file mutable state */
    H5TS_RANK_FILE = 500,
    /* Cache-wide manager state */
    H5TS_RANK_CACHE_MANAGER = 600,
    /* One metadata/chunk/object-local entry */
    H5TS_RANK_CACHE_ENTRY = 700,
    /* Driver-instance/raw-I/O serialization, if unavoidable */
    H5TS_RANK_VFD_IO = 800,
    /* Tiny leaf locks only */
    H5TS_RANK_LEAF_SPIN = 900
} H5TS_lock_rank_t;
```

```c
typedef struct H5TS_ranked_rwlock_t {
    H5TS_rwlock_t impl;
#ifdef H5TS_DEBUG
    H5TS_lock_rank_t rank;
    const char      *name;
#endif
} H5TS_ranked_rwlock_t;
```

At acquire time, the wrapper asserts that the new lock’s rank is greater than the top of the current thread’s stack; at release time, it asserts reverse-order release. In optimized builds, the rank fields can compile away.
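A minimal sketch of the per-thread held-rank stack behind those acquire/release checks follows. It tracks ranks only and takes no real lock; in an actual wrapper the push would sit next to the underlying acquire and the pop next to the release. The names `demo_rank_push`, `demo_rank_pop`, and `DEMO_MAX_HELD` are hypothetical, and the functions return an error code instead of asserting so the behavior is easy to exercise.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_HELD 16 /* assumed bound on nesting depth */

/* Per-thread record of the ranks currently held, innermost last. */
static _Thread_local int    demo_held[DEMO_MAX_HELD];
static _Thread_local size_t demo_nheld = 0;

/* Record an acquire. Returns 0 if the new rank is strictly greater than
 * the innermost held rank, -1 otherwise (or if the stack is full).
 * Strictly greater also rejects same-rank nesting. */
static int
demo_rank_push(int rank)
{
    if (demo_nheld == DEMO_MAX_HELD)
        return -1;
    if (demo_nheld > 0 && rank <= demo_held[demo_nheld - 1])
        return -1;
    demo_held[demo_nheld++] = rank;
    return 0;
}

/* Record a release. Returns 0 only if the released rank is the innermost
 * one held, which enforces reverse-order release. */
static int
demo_rank_pop(int rank)
{
    if (demo_nheld == 0 || demo_held[demo_nheld - 1] != rank)
        return -1;
    demo_nheld--;
    return 0;
}
```

With the rank values from the enum, taking FILE (500) and then CACHE_MANAGER (600) is accepted, while a second 600, or a 200 taken while 600 is held, is rejected. Shared and exclusive acquires would push the same rank, since mode does not affect order.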
```c
typedef struct H5F_shared_t {
    H5TS_ranked_rwlock_t file_lock; /* rank = FILE */
    H5TS_ranked_rwlock_t mdc_lock;  /* rank = CACHE_MANAGER */
    ...
} H5F_shared_t;

static herr_t
H5D__preflight_read(H5F_shared_t *fsh, hid_t dxpl_id)
    H5TS_REQUIRES_SHARED(fsh->file_lock)
{
    ...
}
```

```c
static herr_t
H5P__snapshot_dxpl(hid_t dxpl_id, H5D_dxpl_snapshot_t *snap)
{
    /* acquire HELPER_OBJECT lock on the plist */
    /* copy needed fields into snap */
    /* release before any VOL/native/file work */
}
```

```c
H5TS_ASSERT_NO_CORE_LOCKS();
H5TS_user_cb_prepare();
ret = (*user_cb)(...);
H5TS_user_cb_restore();
```

H5TS already has user-callback prepare/restore hooks in the concurrency build, and the developer support API already exposes explicit release/reacquire of the top-level HDF5 lock.
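The callback pattern above can be reduced to a small guard that refuses to enter user code while a core lock is held. This is a sketch under assumptions: `demo_core_locks_held` is a hypothetical per-thread counter that a ranked-lock wrapper would maintain, `demo_call_user_cb` is a hypothetical shim, and the `-2` return value is an arbitrary marker for "guard tripped". The prepare/restore hooks are shown only as comments because their exact signatures are not given here.

```c
#include <assert.h>

/* Hypothetical per-thread count of core locks currently held. */
static _Thread_local unsigned demo_core_locks_held = 0;

typedef int (*demo_user_cb_t)(void *udata);

/* Invoke a user callback only when no core lock is held.
 * Returns -2 without calling cb if the guard trips, else cb's result.
 * A debug build would more likely assert than return a code. */
static int
demo_call_user_cb(demo_user_cb_t cb, void *udata)
{
    int ret;

    if (demo_core_locks_held != 0)
        return -2;

    /* the prepare hook would run here */
    ret = cb(udata);
    /* the restore hook would run here */

    return ret;
}

/* Trivial callback used to observe whether the shim invoked it. */
static int
demo_count_cb(void *udata)
{
    ++*(int *)udata;
    return 0;
}
```

Routing every user, plugin, filter, and connector callback through one shim like this gives a single place to check the "no core locks across callbacks" rule, rather than relying on each call site to remember it.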