Optimize locks to avoid full iteration of lock map #2701
mainnet deploy 5/27/2026
🛡️ AI Review — Skeptic (security review)
VERDICT: VULNERABLE
BASELINE scrutiny: author has repo write permission, matching author/committer for PR commits, substantial prior subtensor history, and no trusted Gittensor allowlist match found; branch feat/reverse-lock-map -> devnet-ready. The diff does not modify
Findings
Prior-comment reconciliation
Conclusion
This appears legitimate, but the runtime-upgrade migrations can still exceed block or memory limits if
📜 Previous run (superseded)
🔍 AI Review — Auditor (domain review)
VERDICT: 👎
Gittensor association: LIKELY; author has write permission and substantial prior subtensor contribution history, so review is calibrated as established-contributor correctness review. PR body is substantive and matches the implementation direction. No better duplicate candidate was identifiable from the prefetched open/overlapping PR metadata. Spec-version auto-fix was not applied: this runtime-affecting pallet change leaves local
Findings
Conclusion
Blocking on migration correctness and bounded-execution issues. The new reverse index is directionally useful, but the one-shot backfill must not leave aggregate conviction state inconsistent or risk an overweight runtime upgrade.
🔄 AI review updated — Skeptic: VULNERABLE
🔄 AI review updated — Skeptic: SAFE Auditor: 👎
🔄 AI review updated — Skeptic: VULNERABLE
let mut scanned_count = 0u64;
let mut indexed_count = 0u64;
let mut removed_count = 0u64;
let lock_keys = Lock::<T>::iter_keys().collect::<sp_std::vec::Vec<_>>();
[HIGH] Unbounded Lock backfill can overweight the runtime upgrade
This still materializes every Lock key and then processes every entry in a single runtime-upgrade call. Lock is user-growable, and returning a larger Weight after the scan does not protect the upgrade from exceeding block execution or memory limits. Bound this migration, prove the current key count is safely below the runtime-upgrade budget, or populate the reverse index lazily/over multiple bounded steps before relying on it.
This is acceptable, especially at this stage, when the Lock map is barely populated.
🔄 AI review updated — Skeptic: VULNERABLE
hotfix 6/1/2026
hotfix 6/2/2026
let mut scanned_count = 0u64;
let mut indexed_count = 0u64;
let mut removed_count = 0u64;
let lock_keys = Lock::<T>::iter_keys().collect::<sp_std::vec::Vec<_>>();
[HIGH] Unbounded Lock backfill can overweight the runtime upgrade
This migration still collects every Lock key and then rolls/saves every entry in a single on_runtime_upgrade path. Lock is user-state, so a larger-than-expected map can make the upgrade block exceed weight or memory limits; the comment about current cardinality is not an enforceable bound. Use a bounded/versioned migration strategy or a hard pre-checked maximum that fails before collecting/processing the full map.
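One way to read the "hard pre-checked maximum" suggestion is sketched below. It is a minimal model, not the pallet's code: a `BTreeMap` stands in for the `Lock` storage map, and `MAX_LOCKS_TO_MIGRATE`, `BackfillError`, and `collect_bounded` are hypothetical names introduced only for illustration. The check reads at most one key past the cap, so it fails before the full map is materialized.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins for the pallet's key types.
type Coldkey = u32;
type NetUid = u16;
type Hotkey = u32;
type LockKey = (Coldkey, NetUid, Hotkey);

// Hypothetical cap; a real value would have to come from benchmarking.
const MAX_LOCKS_TO_MIGRATE: usize = 3;

#[derive(Debug, PartialEq)]
enum BackfillError {
    // The map holds more entries than the audited maximum.
    TooManyLocks { seen_at_least: usize },
}

// Collects lock keys only if the map fits within the cap.
// Reads at most MAX_LOCKS_TO_MIGRATE + 1 keys before giving up.
fn collect_bounded(locks: &BTreeMap<LockKey, u64>) -> Result<Vec<LockKey>, BackfillError> {
    let keys: Vec<LockKey> = locks
        .keys()
        .copied()
        .take(MAX_LOCKS_TO_MIGRATE + 1)
        .collect();
    if keys.len() > MAX_LOCKS_TO_MIGRATE {
        return Err(BackfillError::TooManyLocks { seen_at_least: keys.len() });
    }
    Ok(keys)
}
```

With this shape the worst-case cost of the migration is fixed by the cap rather than by the current size of user state. What the migration should do on `Err` (skip, defer, or abort) is a separate design decision not covered here.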
🔄 AI review updated — Skeptic: VULNERABLE
Matching SDK PR that fixes e2e tests to handle breaking changes: latent-to/bittensor#3362
Fix hotkey swap conviction locks
let mut scanned_count = 0u64;
let mut indexed_count = 0u64;
let mut removed_count = 0u64;
let lock_keys = Lock::<T>::iter_keys().collect::<sp_std::vec::Vec<_>>();
[HIGH] Unbounded Lock backfill can overweight the runtime upgrade
This migration collects every Lock key into memory and then rolls/saves each entry in one on_runtime_upgrade call. Even if the current chain is expected to have few lock records, this is a runtime upgrade path over state, so an unexpectedly large Lock map can exceed the upgrade block weight or memory budget and halt the upgrade. Bound this migration with a measured hard cap and fail-safe, or move the backfill to a chunked cursor/on-idle migration that accounts reads/writes per batch.
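The chunked cursor alternative mentioned above could look roughly like the following. This is a sketch under stated assumptions: `BTreeMap`s model the `Lock` map and the reverse index, and `State`, `backfill_step`, and the `(reads, writes)` return value are hypothetical illustrations rather than the pallet's actual types or weight API. Each call processes at most `batch` entries and resumes from a stored cursor.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

// (coldkey, netuid, hotkey); hypothetical stand-in for the real key type.
type LockKey = (u32, u16, u32);

#[derive(Default)]
struct State {
    locks: BTreeMap<LockKey, u64>,
    // Reverse index: (netuid, hotkey) -> coldkeys.
    locking_coldkeys: BTreeMap<(u16, u32), Vec<u32>>,
    // Last key processed; None means the backfill has not started.
    cursor: Option<LockKey>,
    done: bool,
}

// Processes at most `batch` lock entries and returns the (reads, writes)
// performed in this step, so the caller can account for them per batch.
fn backfill_step(state: &mut State, batch: usize) -> (u64, u64) {
    if state.done {
        return (0, 0);
    }
    let lower = match state.cursor {
        Some(last) => Bound::Excluded(last),
        None => Bound::Unbounded,
    };
    let bounds: (Bound<LockKey>, Bound<LockKey>) = (lower, Bound::Unbounded);
    let keys: Vec<LockKey> = state
        .locks
        .range(bounds)
        .map(|(key, _)| *key)
        .take(batch)
        .collect();
    for &(coldkey, netuid, hotkey) in &keys {
        let coldkeys = state.locking_coldkeys.entry((netuid, hotkey)).or_default();
        if !coldkeys.contains(&coldkey) {
            coldkeys.push(coldkey);
        }
    }
    if let Some(&last) = keys.last() {
        state.cursor = Some(last);
    }
    // A short batch means the end of the map was reached.
    if keys.len() < batch {
        state.done = true;
    }
    let processed = keys.len() as u64;
    (processed, processed)
}
```

Until `done` is set, code that relies on the reverse index would have to fall back to the old path or wait, which is the trade-off of a lazy backfill.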
let mut scanned_count = 0u64;
let mut indexed_count = 0u64;
let mut removed_count = 0u64;
let lock_keys = Lock::<T>::iter_keys().collect::<sp_std::vec::Vec<_>>();
[HIGH] Unbounded Lock backfill can overweight the runtime upgrade
This still collects every Lock key and processes the full map in one on_runtime_upgrade path. The comment says the map is expected to be small, but the code does not enforce a hard bound or fail closed if state grows; a larger-than-expected Lock map can exceed block weight or memory during the upgrade and halt runtime execution. Make this migration bounded/chunked, or gate it on a precomputed audited maximum with weight that accounts for the worst case.
let locks: Vec<(T::AccountId, LockState)> = Lock::<T>::iter()
    .filter_map(|((coldkey, lock_netuid, hotkey), lock)| {
        (lock_netuid == netuid && hotkey == old_hotkey).then_some((coldkey, lock))
    })
    .collect();
[HIGH] Swap-fix migration also scans the full Lock map without a bound
For the coldkey: None fix entry this path iterates the entire Lock map, filters in memory, and collects matches before continuing. The reported weight later uses only locks_to_fix_count, so non-matching scanned entries are not represented even though they are read. A large Lock map can make the runtime upgrade overweight or memory-heavy. Avoid the global scan by enumerating explicit coldkeys, using the new reverse index after it is populated, or otherwise bounding and accurately weighting the scan.
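To make the weighting concern concrete, the sketch below contrasts a global scan with a reverse-index lookup and counts the reads each one performs. It assumes a simplified in-memory model: `Store`, `locks_by_scan`, and `locks_by_index` are hypothetical names, and a plain `u64` stands in for `LockState`. The scan reads every entry regardless of how many match, while the index path reads one index entry plus one lock per listed coldkey.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins for the pallet's types.
type Coldkey = u32;
type NetUid = u16;
type Hotkey = u32;

struct Store {
    locks: BTreeMap<(Coldkey, NetUid, Hotkey), u64>,
    // Reverse index: (netuid, hotkey) -> coldkeys.
    locking_coldkeys: BTreeMap<(NetUid, Hotkey), Vec<Coldkey>>,
}

// Global scan: the read count covers every entry, not only the matches.
fn locks_by_scan(store: &Store, netuid: NetUid, hotkey: Hotkey) -> (Vec<(Coldkey, u64)>, u64) {
    let mut reads = 0u64;
    let mut found = Vec::new();
    for (&(coldkey, n, h), &lock) in &store.locks {
        reads += 1;
        if n == netuid && h == hotkey {
            found.push((coldkey, lock));
        }
    }
    (found, reads)
}

// Index lookup: one read for the index entry plus one per listed coldkey.
fn locks_by_index(store: &Store, netuid: NetUid, hotkey: Hotkey) -> (Vec<(Coldkey, u64)>, u64) {
    let mut reads = 1u64;
    let mut found = Vec::new();
    if let Some(coldkeys) = store.locking_coldkeys.get(&(netuid, hotkey)) {
        for &coldkey in coldkeys {
            reads += 1;
            if let Some(&lock) = store.locks.get(&(coldkey, netuid, hotkey)) {
                found.push((coldkey, lock));
            }
        }
    }
    (found, reads)
}
```

A reported weight derived only from the number of matches would understate the scan path by the count of non-matching entries, which is the gap this comment points at.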
🔄 AI review updated — Skeptic: VULNERABLE
Description
This branch adds a dust threshold for lock state, so rolled-forward locks whose mass and conviction fall below the threshold are normalized to zero across lock operations, not only during migration. It also adds and wires LockingColdkeys as a netuid-first index for active lock records, using it to avoid expensive full-map scans in hotkey swap and subnet lock cleanup paths. The LockingColdkeys part is a non-functional, optimization-only change: lock behavior is preserved except for the intentional dust normalization, while subnet-scoped cleanup and transfers now use the dedicated (netuid, hotkey) -> Vec<coldkey> index instead of scanning the global Lock map.
Type of Change
Checklist
./scripts/fix_rust.sh to ensure my code is formatted and linted correctly
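For readers of the Description, the dust normalization it mentions might look roughly like the sketch below. The field names `locked_mass` and `conviction`, the `DUST_THRESHOLD` value, and the `normalize_dust` helper are all assumptions made for illustration; the actual `LockState` layout and threshold are not shown in this conversation. The sketch treats a lock as dust only when both values fall below the threshold, following the wording of the Description.

```rust
// Hypothetical shape of a lock record; the real LockState may differ.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct LockState {
    locked_mass: u64,
    conviction: u64,
}

// Hypothetical threshold, chosen only for the example.
const DUST_THRESHOLD: u64 = 100;

// Returns a zeroed lock when both mass and conviction are below the
// threshold; otherwise returns the lock unchanged.
fn normalize_dust(lock: LockState) -> LockState {
    if lock.locked_mass < DUST_THRESHOLD && lock.conviction < DUST_THRESHOLD {
        LockState::default()
    } else {
        lock
    }
}
```

Applying the same helper in every lock operation, rather than only in the migration, is what keeps dust handling consistent across code paths.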