Bazel flow: source-built OpenROAD via pinned bazel-orfs submodule #71

Merged
mguthaus merged 8 commits into main from bazel-upgrade on Apr 22, 2026

@mguthaus
Contributor

Summary

Pivots HighTide's Bazel flow from the docker-extracted OpenROAD pattern to a full source build, tracking bazel-orfs as a pinned submodule at ./bazel-orfs and layering HighTide-specific config on top.

  • ./bazel-orfs submodule at 232b2ea4 — HighTide inherits bazel-orfs's Bazel settings via local_path_override + .bazelrc try-import. Patches live as symlinks into bazel-orfs/patches/ (bazel requires patch labels from the main repo).
  • Root MODULE.bazel re-declares the git_override / single_version_override entries that bazel-orfs pins (required at the root), mirroring its current commits (orfs@2f4de31a, openroad@5f1bd87f, qt-bazel@88610497). Drops the docker_orfs_yosys extraction.
  • single_version_override(soplex, 7.1.4.bcr.3) — resolves a transitive boost-compat-level conflict between or-tools@9.15 → soplex → boost.multiprecision@1.87 (compat 108700) and openroad → boost.multiprecision@1.89 (compat 0).
  • yosys-slang built from source and merged into yosys_share via merge_yosys_share in root BUILD.bazel, wired through orfs.default(yosys_share=...).
  • LLVM 20.1.8 + Python 3.13 toolchains registered at the root (bazel-orfs wraps them as dev_dependency=True, which doesn't propagate downstream).
  • .bazelrc.user try-import so individual users can opt into --remote_upload_local_results=true without touching committed config.
  • k8s template overhaul — clones only the bazel-orfs submodule (was pulling all design-src + OpenROAD-flow-scripts + bsg_fakeram, ~GB), drops qt-bazel xcb-cursor-from-source patch (NRP egress can't reach xorg.freedesktop.org; we don't open the GUI), installs runtime libs needed by OpenROAD's GUI build (libxml2 for ld.lld, Qt xcb libs from ldd on the cached binary).
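
The per-user opt-in from the list above can be sketched as follows. This is a minimal sketch, not the PR's actual tooling: the helper name `write_bazelrc_user` is hypothetical, and the `try-import` line quoted in the comment is an assumption about how the committed `.bazelrc` spells the import.

```shell
#!/usr/bin/env bash
# Assumption: the committed .bazelrc contains a line such as
#   try-import %workspace%/.bazelrc.user
# try-import skips the file silently when it does not exist, so
# users without a .bazelrc.user are unaffected.

# write_bazelrc_user (hypothetical helper): create a gitignored
# .bazelrc.user in the given workspace directory that opts this
# user into uploading locally built results to the remote cache.
write_bazelrc_user() {
  local workspace="${1:-.}"
  printf '%s\n' 'build --remote_upload_local_results=true' \
    > "${workspace}/.bazelrc.user"
}
```

Calling `write_bazelrc_user .` from the repository root would write the file next to the committed `.bazelrc`; deleting it restores the default behavior.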

Test plan

  • bazel build //designs/asap7/lfsr:lfsr_final — clean, 16 min cold
  • bazel build //designs/asap7/liteeth/liteeth_mac_axi_mii:liteeth_mac_axi_mii_final — 3,252 local-cache hits, 8 min
  • bazel build //designs/asap7/minimax:minimax_final — clean, uploaded to remote cache
  • bazel build //designs/nangate45/lfsr:lfsr_final — clean
  • K8s: ./k8s/run.sh --branch bazel-upgrade asap7 lfsr — completed successfully, 12,156 remote-cache hits, 7 sandbox exec actions (lfsr flow stages only)
  • K8s: ./k8s/run.sh --branch bazel-upgrade nangate45 lfsr — completed successfully, 12,156 remote-cache hits, 7 sandbox exec
  • Local bazel build //designs/nangate45/lfsr:lfsr_final after k8s populated cache — hit remote cache, 32 s wall (7 remote-cache, 20 internal, 8 sandbox)

Known / intentional

  • hightide_design produces no GDS (and never did via Bazel — bazel-orfs keeps orfs_gds as a separate rule with a mock klayout default). GDS is out of scope for this PR; CLAUDE.md's "Key outputs" section overstates what Bazel produces and should be corrected separately.

bazel-orfs moves from the docker-extracted OpenROAD pattern to a
full source build.  Track bazel-orfs as a submodule at ./bazel-orfs
so most Bazel settings (dep versions, patches, ORFS/OpenROAD/Qt
commits) come from its MODULE.bazel; the root MODULE.bazel
re-declares the git_override / single_version_override entries
(required at the root) and references patch files from the
submodule to avoid duplication.

- Add bazel-orfs submodule pinned at 232b2ea4; .gitignore exception
- Root MODULE.bazel mirrors bazel-orfs's pins for orfs/openroad/
  qt-bazel/yosys; patch labels are //patches:*.patch, each a
  symlink into bazel-orfs/patches/
- single_version_override(soplex, 7.1.4.bcr.3) resolves the
  transitive boost.multiprecision 1.89 compat-level conflict
  (soplex 7.1.4.bcr.1 → boost@1.87 vs openroad → boost@1.89)
- .bazelrc try-imports bazel-orfs/.bazelrc (C++20, isolated
  extensions, OpenROAD GUI build flag, etc.)
- LLVM 20.1.8 + Python 3.13 toolchains registered at the root
  (bazel-orfs wraps them as dev_dependency=True which doesn't
  propagate)
- yosys-slang plugin built from source and merged into yosys_share
  via merge_yosys_share; wired through orfs.default(yosys_share=…)
- Drop docker_yosys.BUILD.bazel and the docker-extract patch
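
The symlinked-patch layout described in the list above could be reproduced roughly like this. It is a sketch under assumptions: `link_patches` is a hypothetical helper, and the relative link target assumes `patches/` sits directly beside `bazel-orfs/` at the repository root.

```shell
#!/usr/bin/env bash
# link_patches (hypothetical helper): for every patch shipped in the
# submodule, create patches/<name>.patch in the root repo as a relative
# symlink, so the root MODULE.bazel can name it //patches:<name>.patch.
link_patches() {
  local root="${1:-.}"
  local patch name
  mkdir -p "${root}/patches"
  for patch in "${root}"/bazel-orfs/patches/*.patch; do
    [ -e "${patch}" ] || continue   # the glob matched nothing
    name="$(basename "${patch}")"
    # The target is resolved relative to the patches/ directory.
    ln -sfn "../bazel-orfs/patches/${name}" "${root}/patches/${name}"
  done
}
```

For `//patches:...` labels to resolve, `patches/` must also be a Bazel package, meaning it needs a BUILD file; the sketch does not create one.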

Verified: lfsr + liteeth_mac_axi_mii flows complete through 6_final;
liteeth hits 3252 action-cache entries from the lfsr build.

The bazel-orfs submodule at ./bazel-orfs is required for the build
(local_path_override in MODULE.bazel; patch labels are symlinks into
bazel-orfs/patches/).  Without --recurse-submodules the pod lands
without the submodule and bazel resolution fails immediately.
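
Since a missing submodule only shows up as a Bazel resolution failure, a pre-flight check in the pod script could report the cause more directly. This is a sketch only: `require_bazel_orfs` is a hypothetical helper and is not part of the k8s template described here.

```shell
#!/usr/bin/env bash
# require_bazel_orfs (hypothetical helper): return non-zero with a
# readable message when the ./bazel-orfs submodule is not checked out.
# It looks for bazel-orfs/MODULE.bazel, which local_path_override needs.
require_bazel_orfs() {
  local root="${1:-.}"
  if [ ! -f "${root}/bazel-orfs/MODULE.bazel" ]; then
    echo "bazel-orfs submodule missing; run:" \
         "git submodule update --init bazel-orfs" >&2
    return 1
  fi
}
```

A script might call `require_bazel_orfs . || exit 1` before invoking `bazel build`.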

--shallow-submodules keeps the cold clone fast.

The .gitignore already anticipates a gitignored .bazelrc.user for
per-user overrides, but nothing imports it.  Add the try-import so
individual users can opt into behaviors (e.g. uploading to the
remote cache) without touching the committed config.

The Bazel flow only needs ./bazel-orfs.  --recurse-submodules was
pulling every design-src submodule (gemmini, vortex, coralnpu, etc.),
OpenROAD-flow-scripts with its own OpenROAD/yosys submodules, and
bsg_fakeram — gigabytes of data used only by the Make flow,
update-rtl, or generate-sram paths.

Replace the args-only invocation with a shell script that clones
the repo shallowly and inits just the bazel-orfs submodule.
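
A shallow clone that initializes only the bazel-orfs submodule might look like the following. This is a sketch, not the script from the k8s template: `clone_for_bazel` and the `DRY_RUN` switch are hypothetical, and the repository URL is left as a caller-supplied parameter because it does not appear in this PR.

```shell
#!/usr/bin/env bash
# clone_for_bazel (hypothetical helper): shallow-clone one branch and
# initialize only the bazel-orfs submodule, leaving design-src,
# OpenROAD-flow-scripts, and bsg_fakeram uninitialized.
# With DRY_RUN set to a non-empty value, the git commands are printed
# instead of being executed.
clone_for_bazel() {
  local repo_url="$1" branch="$2" dest="$3"
  ${DRY_RUN:+echo} git clone --depth 1 --branch "${branch}" \
      "${repo_url}" "${dest}" &&
    ${DRY_RUN:+echo} git -C "${dest}" submodule update --init \
      --depth 1 -- bazel-orfs
}
```

For example, `DRY_RUN=1 clone_for_bazel "$REPO_URL" bazel-upgrade HighTide` prints the two git commands without touching the network, where `REPO_URL` is a placeholder for the real remote.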
The qt-bazel xcb-cursor-from-source patch rebuilds xcb-util-cursor
from xorg.freedesktop.org to avoid needing libxcb-cursor0 on the
host, but (1) NRP cluster egress can't reach xorg.freedesktop.org
(fetch timed out), and (2) HighTide's Bazel builds never open the
Qt GUI: OpenROAD runs with -exit and save_images.tcl uses the
offscreen plugin, so the xcb cursor library is never loaded at
runtime.

Matches bazel-orfs/test/downstream/MODULE.bazel.

ld.lld from the LLVM 20.1.8 toolchain release dlopens libxml2.so.2
at link time (only actually used for COFF manifest merging on
Windows, but the dependency is unconditional).  bazel-orfs
neutralizes it in its MODULE.bazel with an abort-on-call stub,
but that setup lives behind dev_dependency=True and is skipped
when bazel-orfs is a dependency rather than the root module, as
it is here.  Installing the runtime
libxml2 package is the simpler fix for headless cluster builds.

bazel-orfs's .bazelrc sets --@openroad//:platform=gui so the
OpenROAD binary DT_NEEDEDs libX11-xcb.so.1 and the Qt xcb platform
plugin chain.  Even when run headlessly (-exit, no GUI), dynamic
loading fails at exec time without these libs.

Add libx11-xcb1 and the minimal xcb/xkbcommon set matching Qt xcb
platform plugin requirements.

The package list is derived from ldd on the cached openroad binary.
The first pass missed
libsm6/libice6 (X11 session-management libs the binary DT_NEEDEDs)
and a couple of xcb subpackages (libxcb-render0, libxcb-shm0,
libxkbcommon0).
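
The ldd-based derivation can be approximated with a small filter. This is a sketch: `missing_libs` is a hypothetical helper, and it assumes the usual glibc `ldd` output format in which unresolved entries read `<soname> => not found`.

```shell
#!/usr/bin/env bash
# missing_libs (hypothetical helper): read `ldd` output on stdin and
# print the sonames the dynamic loader could not resolve, one per
# line, sorted and de-duplicated.
missing_libs() {
  awk '/=> not found/ { print $1 }' | sort -u
}
```

Running `ldd path/to/openroad | missing_libs` inside the pod image lists the sonames still lacking a package; mapping each soname to a distribution package is a separate, manual step.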
@mguthaus merged commit 7840285 into main on Apr 22, 2026
@mguthaus deleted the bazel-upgrade branch on April 22, 2026 20:18
mguthaus added a commit that referenced this pull request Apr 24, 2026
…epair

OpenROAD (new pin from #71, tested at 5f1bd87f) crashes deterministically
in cts.tcl's post-CTS repair_timing on this netlist:

  [ERROR ODB-1200] InsertBufferBeforeLoads: Load pin '_34829_/SN' is
  not connected to net '_03825_'.  Error: cts.tcl, 82 ODB-1200

The new RSZ-0100 move sequence puts UnbufferMove first; an unbuffer
step orphans a load pin before a subsequent InsertBufferBeforeLoads
pass tries to reference it, tripping an ODB consistency check at
iter ~141 of a 347-endpoint hold-violation sweep. The worst-slack
endpoint (the reset synchronizer's D pin) is only a spurious display
artifact — the churn is on the datapath hold-violations, not reset.

Workarounds attempted and ruled out:
- set_dont_touch on reset net pattern: doesn't shrink the STA
  endpoint set, and Yosys flat-synth renames the net anyway. No-op.
- set_false_path -from sys_reset + -through reset net candidates:
  RSZ-0046 dropped 347 -> 346 (one endpoint removed), confirming the
  346 others aren't on the reset cone. Same crash.

Only SKIP_CTS_REPAIR_TIMING=1 bypasses the bug. Downstream GRT and
route still run their own hold-repair passes, so final QoR remains
reasonable (41% util, 14886 um^2 on asap7, clean 6_final).

Also keep a principled set_false_path -from sys_reset in the SDC
(sys_reset is an async input synchronized internally by
udpcore_int_rst; the port-to-synchronizer path shouldn't be timed).

Tracking re-enable: #75