Devnet5#106
Conversation
| @@ -1,4 +1,4 @@ | |||
| [toolchain] | |||
| channel = '1.92' | |||
| channel = 'nightly' | |||
There was a problem hiding this comment.
why we switch to the nightly toolchain?
There was a problem hiding this comment.
Plonky3 forces it their p3-util crate uses the unstable maybe_uninit_slice feature, so stable Rust fails to build with E0658 and nightly is the only way through
There was a problem hiding this comment.
The tracking issue for maybe_uninit_slice was closed: rust-lang/rust#63569.
This probably means that it was stabilized.
There was a problem hiding this comment.
UPD: look likse p3-util uses assume_init_ref, that was stabilized in 1.93.0: https://releases.rs/docs/1.93.0/. Updating rust to version >= 1.93.0 will fix the issue
| rec_aggregation = { git = "https://github.com/leanEthereum/leanVM.git", rev = "e2592df4e30fdddbbf8ae26a333116c68cec7026" } | ||
| leansig = { git = "https://github.com/leanEthereum/leanSig", branch = "devnet4" } | ||
| leansig_wrapper = { git = "https://github.com/leanEthereum/leanVM.git", rev = "e2592df4e30fdddbbf8ae26a333116c68cec7026" } | ||
| libp2p = { git = "https://github.com/lambdaclass/rust-libp2p.git", rev = "2f14d0ec9665a01cfb6a02326c90628c4bba521c", default-features = false, features = [ |
There was a problem hiding this comment.
why did you switch to the lambdaclass libp2p fork? It looks like this fork only adds send_request_with_protocol functionality, which is currently not used anywhere?
| [profile.release] | ||
| debug = true | ||
| split-debuginfo = "packed" | ||
| lto = "thin" | ||
| codegen-units = 1 |
There was a problem hiding this comment.
we probably don't need that?
There was a problem hiding this comment.
So leanVM upstream itself pins target-cpu=native in its cargo/config.toml, and our x86-64-v3 + LTO + codegen-units=1 is the portable Hetzner-friendly version of the same pattern, so the prover gets the AVX2 vectorization and cross-crate inlining it was tuned for
There was a problem hiding this comment.
how the debug = true/split-debuginfo = 'packed' is related to this? codegen-units=1 is arguable too, but we can keep it for now. If performance matters, then probably lto should be set to true, not to "thin"?
There was a problem hiding this comment.
Dropped debug = true + split-debuginfo = "packed" and flipped lto = "thin" → lto = true pushed in the latest commit
| #[derive(Debug)] | ||
| #[debug("[REDACTED]")] | ||
| pub struct SecretKey(Vec<u8>); | ||
| pub struct SecretKey(XmssSecretKey); |
There was a problem hiding this comment.
why this one changed? previously this was a Vec specifically for zeroization purposes - why change it back to XmssSecretKey?
There was a problem hiding this comment.
This old Vec wrapper wasn't actually zeroizing anything because upstream GeneralizedXMSSSecretKey doesn't derive Zeroize at all leansig/src/signature/generalized_xmss.rs:212, and every sign() was re-parsing the whole OTS state tree off those bytes so I switched to holding the parsed struct directly
There was a problem hiding this comment.
what do you mean it wasn't zeroizing anything? it was #[derive(Zeroize)]?
There was a problem hiding this comment.
My framing was off here sign() was re-parsing the entire OTS state tree off Vec on every call 118× slower than holding the parsed XmssSecretKey directly, and since upstream GeneralizedXMSSSecretKey doesn't derive Zeroize I couldn't propagate it through the wrapper, so I left a TODO(zeroize) at xmss/src/secret_key.rs:14-17 to either upstream the derive or implement Drop manually.
|
|
||
| #[derive(Clone, Debug, Default, Ssz)] | ||
| pub struct MultiMessageAggregate { | ||
| pub proof: ByteList<MultiMessageAggregateSizeLimit>, |
There was a problem hiding this comment.
probably not a good idea to make this public - the proof should be encapsulated
There was a problem hiding this comment.
{ proof: ByteList512KiB } is the single-field SSZ container leanSpec declares containers/aggregation.py:176-184, SSZ derive needs field access to serialize it, and hiding it behind a getter would just add another layer without changing what goes on the wire.
There was a problem hiding this comment.
- SSZ derive will have the access, no matter if this field is
pubor not - I would argue that you don't need a getter at all -- why would someone need to access raw proof bytes?
- hiding behind getter makes a lot of sense - this encapsulates logic, so no one can manipulate proof bytes by hand, only generate them via aggregate
| /// Pool used for non-aggregation SNARK verify work (block-verify, | ||
| /// reaggregate, gossip-verify). Kept for symmetry with the main executor | ||
| /// split — not used by block-signing, which lives on `cpu_snark_executor`. | ||
| cpu_verify_executor: Arc<DedicatedExecutor>, | ||
| /// Pool used for the SNARK-heavy block-signing work | ||
| /// (Type-1 wrap + Type-2 merge inside `sign_block_with_data`). Same pool as | ||
| /// aggregation; the propose/aggregation timelines never overlap (interval 0 | ||
| /// vs interval 2), so co-tenancy here doesn't queue. | ||
| cpu_snark_executor: Arc<DedicatedExecutor>, |
There was a problem hiding this comment.
why do we need separate executors? Using single cpu_normal_executor should be enough
There was a problem hiding this comment.
gossip-verify here can fire several times per slot and would silently queue behind a multi-second aggregation prove if we shared a single executor, so each SNARK-heavy path gets its own serial admission queue while cpu_normal_executor keeps doing what it does well parallel attestation-signing fan-out am open to suggestions as this was the only way I could do this that I thought of.
evnet5 port: containers/SignedBlock shape MultiMessageAggregate, late-publish fix, streaming snappy decode, aggregation tightening K=1 age-out, 750ms deadline, target-slot-desc sort, producer cap split 4/8, post-state STF cache, SNARK pool sizing snark=1, verify=1, and release-build perf flags LTO/codegen-units=1/x86-64-v3.