Devnet5 #106

Open
bomanaps wants to merge 7 commits into grandinetech:devnet-5 from bomanaps:devnet5
@bomanaps bomanaps commented Jun 11, 2026

Contributor

Devnet5 port: containers/SignedBlock shape (MultiMessageAggregate), late-publish fix, streaming snappy decode, aggregation tightening (K=1 age-out, 750ms deadline, target-slot-desc sort, producer cap split 4/8), post-state STF cache, SNARK pool sizing (snark=1, verify=1), and release-build perf flags (LTO, codegen-units=1, x86-64-v3).

@ArtiomTr ArtiomTr changed the base branch from devnet-4 to devnet-5 June 17, 2026 08:17
Comment thread rust-toolchain.toml Outdated
@@ -1,4 +1,4 @@
[toolchain]
-channel = '1.92'
+channel = 'nightly'

Collaborator

Why are we switching to the nightly toolchain?

Contributor Author

Plonky3 forces it: their p3-util crate uses the unstable maybe_uninit_slice feature, so stable Rust fails to build with E0658, and nightly is the only way through.

Collaborator

The tracking issue for maybe_uninit_slice was closed: rust-lang/rust#63569.
This probably means that it was stabilized.

Collaborator

UPD: it looks like p3-util uses assume_init_ref, which was stabilized in 1.93.0: https://releases.rs/docs/1.93.0/. Updating Rust to a version >= 1.93.0 will fix the issue.
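For readers following the toolchain thread, here is a minimal sketch of what a slice-level `assume_init_ref` does, written with a raw pointer cast so that it also compiles on older stable toolchains. The helper name `slice_assume_init_ref` and the `squares` function are hypothetical and are not taken from p3-util; on Rust >= 1.93.0 the standard library method could be called directly instead.

```rust
use std::mem::MaybeUninit;

/// Hypothetical helper (not p3-util's code): view a fully initialized
/// `[MaybeUninit<T>]` as `[T]`.
///
/// # Safety
/// Every element of `slice` must have been initialized.
pub unsafe fn slice_assume_init_ref<T>(slice: &[MaybeUninit<T>]) -> &[T] {
    // `MaybeUninit<T>` is guaranteed to have the same size and alignment
    // as `T`, so reinterpreting the slice pointer is sound once all
    // elements are initialized.
    unsafe { &*(slice as *const [MaybeUninit<T>] as *const [T]) }
}

/// Fills an uninitialized buffer with squares and reads it back.
pub fn squares() -> Vec<u32> {
    let mut buf = [MaybeUninit::<u32>::uninit(); 4];
    for (i, slot) in buf.iter_mut().enumerate() {
        slot.write((i * i) as u32);
    }
    // SAFETY: the loop above wrote every element of `buf`.
    let init: &[u32] = unsafe { slice_assume_init_ref(&buf) };
    init.to_vec()
}
```

Calling `squares()` yields the first four squares. The point is only to illustrate the API surface that gates the build, not to suggest patching the dependency.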

Comment thread lean_client/Cargo.toml Outdated
rec_aggregation = { git = "https://github.com/leanEthereum/leanVM.git", rev = "e2592df4e30fdddbbf8ae26a333116c68cec7026" }
leansig = { git = "https://github.com/leanEthereum/leanSig", branch = "devnet4" }
leansig_wrapper = { git = "https://github.com/leanEthereum/leanVM.git", rev = "e2592df4e30fdddbbf8ae26a333116c68cec7026" }
libp2p = { git = "https://github.com/lambdaclass/rust-libp2p.git", rev = "2f14d0ec9665a01cfb6a02326c90628c4bba521c", default-features = false, features = [

Collaborator

Why did you switch to the lambdaclass libp2p fork? It looks like this fork only adds the send_request_with_protocol functionality, which is currently not used anywhere.

@bomanaps bomanaps requested a review from ArtiomTr June 17, 2026 14:14
Comment thread lean_client/Cargo.toml
Comment on lines +341 to +345
[profile.release]
debug = true
split-debuginfo = "packed"
lto = "thin"
codegen-units = 1

Collaborator

we probably don't need that?

Contributor Author

leanVM upstream itself pins target-cpu=native in its .cargo/config.toml. Our x86-64-v3 + LTO + codegen-units=1 is the portable, Hetzner-friendly version of the same pattern, so the prover gets the AVX2 vectorization and cross-crate inlining it was tuned for.

Collaborator

How are debug = true and split-debuginfo = "packed" related to this? codegen-units=1 is debatable too, but we can keep it for now. If performance matters, then lto should probably be set to true, not "thin".

Contributor Author

Dropped debug = true and split-debuginfo = "packed", and flipped lto = "thin" → lto = true. Pushed in the latest commit.
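One way to sanity-check the x86-64-v3 argument from this thread is to ask the binary whether AVX2 was enabled at compile time and whether the host CPU reports it at run time. The sketch below uses only `cfg!` and `std::is_x86_feature_detected!`; the function names are hypothetical and nothing here is taken from the PR.

```rust
/// True when the compiler was allowed to emit AVX2 instructions, for
/// example via `-C target-cpu=x86-64-v3` or `-C target-feature=+avx2`.
pub fn avx2_compiled_in() -> bool {
    cfg!(target_feature = "avx2")
}

/// True when the CPU running this binary reports AVX2 support.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn avx2_detected_at_runtime() -> bool {
    std::is_x86_feature_detected!("avx2")
}

/// Non-x86 targets never have AVX2.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn avx2_detected_at_runtime() -> bool {
    false
}

/// Human-readable summary, suitable for a startup log line.
pub fn avx2_report() -> String {
    format!(
        "avx2: compiled_in={} runtime={}",
        avx2_compiled_in(),
        avx2_detected_at_runtime()
    )
}
```

A release build made with the x86-64-v3 target CPU should report `compiled_in=true`; a default build on the same machine would typically report `compiled_in=false` even when the CPU supports AVX2.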

#[derive(Debug)]
#[debug("[REDACTED]")]
-pub struct SecretKey(Vec<u8>);
+pub struct SecretKey(XmssSecretKey);

Collaborator

Why was this one changed? Previously this was a Vec<u8> specifically for zeroization purposes. Why change it back to XmssSecretKey?

Contributor Author

The old Vec<u8> wrapper wasn't actually zeroizing anything, because upstream GeneralizedXMSSSecretKey doesn't derive Zeroize at all (leansig/src/signature/generalized_xmss.rs:212), and every sign() was re-parsing the whole OTS state tree off those bytes, so I switched to holding the parsed struct directly.

Collaborator

What do you mean it wasn't zeroizing anything? It was #[derive(Zeroize)]?

Contributor Author

My framing was off here. sign() was re-parsing the entire OTS state tree off the Vec<u8> on every call, 118× slower than holding the parsed XmssSecretKey directly. Since upstream GeneralizedXMSSSecretKey doesn't derive Zeroize, I couldn't propagate it through the wrapper, so I left a TODO(zeroize) at xmss/src/secret_key.rs:14-17 to either upstream the derive or implement Drop manually.
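The "implement Drop manually" option mentioned in the TODO could look roughly like the sketch below. It assumes a stand-in `ParsedKey` type with a single byte buffer, since the real upstream key layout is not shown in this thread. The `wipe` and `is_wiped` methods are hypothetical names, and the `zeroize` crate is deliberately not used so the example depends only on `std`.

```rust
use std::fmt;
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical stand-in for a parsed upstream key that does not
/// implement `Zeroize`; the real type has a different layout.
pub struct ParsedKey {
    state: Vec<u8>,
}

/// Wrapper that holds the parsed key and wipes it when dropped.
pub struct SecretKey(ParsedKey);

impl SecretKey {
    pub fn from_bytes(bytes: &[u8]) -> Self {
        SecretKey(ParsedKey { state: bytes.to_vec() })
    }

    /// Overwrites the key material with zeros.
    pub fn wipe(&mut self) {
        for byte in self.0.state.iter_mut() {
            // Volatile writes keep the compiler from eliding the stores.
            unsafe { std::ptr::write_volatile(byte, 0) };
        }
        // Prevents the compiler from reordering later accesses before the wipe.
        compiler_fence(Ordering::SeqCst);
    }

    /// Reports whether every byte of key material is zero.
    pub fn is_wiped(&self) -> bool {
        self.0.state.iter().all(|b| *b == 0)
    }
}

impl Drop for SecretKey {
    fn drop(&mut self) {
        self.wipe();
    }
}

// Manual `Debug` so the key never ends up in logs.
impl fmt::Debug for SecretKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("[REDACTED]")
    }
}
```

This only clears the buffer the wrapper owns. Copies made elsewhere (for example during parsing or by reallocation) are out of its reach, which is one reason an upstream `Zeroize` derive would be the more complete fix.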

Comment thread lean_client/xmss/src/multi_message.rs Outdated

#[derive(Clone, Debug, Default, Ssz)]
pub struct MultiMessageAggregate {
pub proof: ByteList<MultiMessageAggregateSizeLimit>,

Collaborator

probably not a good idea to make this public - the proof should be encapsulated

Contributor Author

{ proof: ByteList512KiB } is the single-field SSZ container that leanSpec declares (containers/aggregation.py:176-184). The SSZ derive needs field access to serialize it, and hiding it behind a getter would just add another layer without changing what goes on the wire.

Collaborator

  1. The SSZ derive will have access whether or not this field is pub.
  2. I would argue that you don't need a getter at all: why would someone need to access the raw proof bytes?
  3. Hiding it behind a getter makes a lot of sense: this encapsulates the logic, so no one can manipulate the proof bytes by hand, only generate them via aggregate.
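The encapsulation argued for in the list above can be sketched as follows. The sketch assumes a plain `Vec<u8>` in place of the SSZ `ByteList`, a placeholder `aggregate` body (the real one would run the prover), and a hypothetical `proof_len` accessor. Derive macros expand inside the module that defines the struct, so they can still reach a private field; the standard derives below stand in for the SSZ derive to show that.

```rust
mod multi_message {
    /// Sketch only: `Vec<u8>` stands in for the SSZ `ByteList` in the PR.
    /// The derives compile even though `proof` is private, because the
    /// generated impls live in this module.
    #[derive(Clone, Debug, Default, PartialEq, Eq)]
    pub struct MultiMessageAggregate {
        // Private: code outside this module cannot read or overwrite it.
        proof: Vec<u8>,
    }

    impl MultiMessageAggregate {
        /// Placeholder for the real aggregation step; it only concatenates
        /// the inputs so the example stays self-contained.
        pub fn aggregate(parts: &[&[u8]]) -> Self {
            Self { proof: parts.concat() }
        }

        /// Hypothetical read-only accessor that exposes size, not bytes.
        pub fn proof_len(&self) -> usize {
            self.proof.len()
        }
    }
}
```

Outside `multi_message`, an expression such as `agg.proof = vec![]` would be rejected by the compiler, so the only way to obtain a value is through `aggregate` (or `Default`).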

Comment on lines +244 to +252
/// Pool used for non-aggregation SNARK verify work (block-verify,
/// reaggregate, gossip-verify). Kept for symmetry with the main executor
/// split — not used by block-signing, which lives on `cpu_snark_executor`.
cpu_verify_executor: Arc<DedicatedExecutor>,
/// Pool used for the SNARK-heavy block-signing work
/// (Type-1 wrap + Type-2 merge inside `sign_block_with_data`). Same pool as
/// aggregation; the propose/aggregation timelines never overlap (interval 0
/// vs interval 2), so co-tenancy here doesn't queue.
cpu_snark_executor: Arc<DedicatedExecutor>,

Collaborator

Why do we need separate executors? Using a single cpu_normal_executor should be enough.

Contributor Author

gossip-verify here can fire several times per slot and would silently queue behind a multi-second aggregation prove if we shared a single executor, so each SNARK-heavy path gets its own serial admission queue, while cpu_normal_executor keeps doing what it does well: parallel attestation-signing fan-out. I'm open to suggestions, as this was the only way I could think of to do this.
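The head-of-line blocking concern can be illustrated with a small sketch. `SerialExecutor` below is a hypothetical single-worker queue built on `std` threads and channels; it is not the repository's `DedicatedExecutor`, and the "prove" job is simulated by blocking on a channel rather than doing real work.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical single-worker executor: jobs run one at a time, FIFO.
pub struct SerialExecutor {
    queue: Option<Sender<Job>>,
    worker: Option<JoinHandle<()>>,
}

impl SerialExecutor {
    pub fn new(name: &str) -> Self {
        let (tx, rx) = mpsc::channel::<Job>();
        let worker = thread::Builder::new()
            .name(name.to_string())
            .spawn(move || {
                // Runs until every sender has been dropped.
                for job in rx {
                    job();
                }
            })
            .expect("failed to spawn worker thread");
        SerialExecutor { queue: Some(tx), worker: Some(worker) }
    }

    pub fn submit(&self, job: impl FnOnce() + Send + 'static) {
        let queue = self.queue.as_ref().expect("executor already shut down");
        queue.send(Box::new(job)).expect("worker thread exited");
    }
}

impl Drop for SerialExecutor {
    fn drop(&mut self) {
        // Closing the channel ends the worker loop; then wait for it.
        drop(self.queue.take());
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}

/// Submits a long "prove" and a short "verify" to separate executors and
/// returns the order in which they complete.
pub fn completion_order() -> Vec<&'static str> {
    let snark = SerialExecutor::new("cpu-snark");
    let verify = SerialExecutor::new("cpu-verify");
    let (done_tx, done_rx) = mpsc::channel();
    let (release_tx, release_rx) = mpsc::channel::<()>();

    let prove_done = done_tx.clone();
    snark.submit(move || {
        // Stand-in for a multi-second prove: block until released.
        release_rx.recv().ok();
        prove_done.send("prove").ok();
    });
    verify.submit(move || {
        done_tx.send("verify").ok();
    });

    let first = done_rx.recv().expect("no job finished");
    release_tx.send(()).expect("prove job is gone");
    let second = done_rx.recv().expect("prove never finished");
    vec![first, second]
}
```

With two executors the verify job finishes while the prove is still in flight. If both jobs were submitted to the same `SerialExecutor` in this order, the verify would sit in the queue until the prove returned, which is the queuing behaviour described in the comment above.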

@bomanaps bomanaps requested a review from ArtiomTr June 19, 2026 08:29

2 participants