
Refactor steering code #213

Merged
YuuuXie merged 8 commits into main from yuxie1/steering-refactor on May 4, 2026

@YuuuXie YuuuXie commented Apr 27, 2026

Refactor: extract steering into modular package with SMC denoiser

Summary

Splits the monolithic steering.py into a well-organized steering/ package, keeping only the SMC (Sequential Monte Carlo) denoiser. This is the first of two PRs to split #203 — FKC-specific code will follow in a separate PR.

Motivation

The original steering.py (358 lines) mixed collective variables, potentials, and utility functions in a single file. To make the steering code general enough to support both SMC and FKC, this refactor:

  • Separates concerns into focused modules
  • Makes the SMC denoiser available independently of FKC
  • Cleans up denoiser.py by removing FKC-only constructs (GuidedScore)
  • Removes unused config files

Changes

Bumped version to 1.4.0.

Package structure — src/bioemu/steering.py → src/bioemu/steering/:

  ┌───────────────────────────┬──────────────────────────────────────────────┐  
  │ Module                    │ Contents                                     │  
  ├───────────────────────────┼──────────────────────────────────────────────┤  
  │ __init__.py               │ Public API re-exports                        │
  ├───────────────────────────┼──────────────────────────────────────────────┤  
  │ collective_variables.py   │ CollectiveVariable, CaCaDistance,            │  
  │                           │ PairwiseClash                                │  
  ├───────────────────────────┼──────────────────────────────────────────────┤  
  │ potentials.py             │ Potential, UmbrellaPotential                 │  
  ├───────────────────────────┼──────────────────────────────────────────────┤  
  │ utils.py                  │ Resampling, ESS, compute_reward_and_grad,    │
  │                           │ x0/R0 prediction                             │  
  ├───────────────────────────┼──────────────────────────────────────────────┤  
  │ dpm_smc.py                │ dpm_solver_smc, dpm_solver_sde_smc_step      │  
  └───────────────────────────┴──────────────────────────────────────────────┘  
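To make the split between collective variables and potentials concrete, here is a minimal pure-Python sketch. It does not use the bioemu API: the function names, the choice of consecutive-residue distances, the harmonic form of the umbrella potential, and the 3.8 center value are all assumptions made for illustration.

```python
import math


def ca_ca_distances(coords):
    """Distances between consecutive C-alpha positions.

    coords is a list of (x, y, z) tuples; consecutive pairing is an assumption.
    """
    return [math.dist(a, b) for a, b in zip(coords, coords[1:])]


def umbrella_energy(value, center, force_constant):
    """Harmonic restraint 0.5 * k * (value - center)**2 (assumed functional form)."""
    return 0.5 * force_constant * (value - center) ** 2


def total_energy(coords, center=3.8, force_constant=1.0):
    """Apply the potential to every value of the collective variable and sum."""
    return sum(
        umbrella_energy(d, center, force_constant) for d in ca_ca_distances(coords)
    )


# Three hypothetical C-alpha positions: one bond at the center value, one stretched.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 4.8, 0.0)]
distances = ca_ca_distances(chain)
energy = total_energy(chain)
```

The point of the layering is that the collective variable only maps coordinates to numbers and the potential only maps numbers to energies, so either can be swapped without touching the other.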

denoiser.py cleanup:

  • Removed GuidedScore dataclass (FKC-only)
  • Extracted shared DPM-Solver helpers (_get_dpm_coefficients,
    _predict_midpoint, second_order_step_dpmsolver_plusplus) used by both
    dpm_solver and dpm_solver_smc
  • Added second_order_step_dpmsolver (midpoint-only ODE step) and _so3_step
    as clean abstractions
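For readers unfamiliar with midpoint-style second-order steps, the following is a generic explicit midpoint integrator for a scalar ODE. It is only an illustration of the idea of evaluating the update at a midpoint; it does not reproduce the DPM-Solver coefficients or the signature of second_order_step_dpmsolver, and all names in it are hypothetical.

```python
import math


def midpoint_step(f, t, y, h):
    """One explicit midpoint (second-order) step for dy/dt = f(t, y)."""
    y_mid = y + 0.5 * h * f(t, y)         # half Euler step to reach the midpoint
    return y + h * f(t + 0.5 * h, y_mid)  # full step using the slope at the midpoint


def integrate(f, t0, y0, t1, n_steps):
    """Integrate from t0 to t1 with n_steps equal midpoint steps."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y = midpoint_step(f, t, y, h)
        t += h
    return y


def decay(t, y):
    """Toy right-hand side: dy/dt = -y, whose exact solution is exp(-t)."""
    return -y


approx = integrate(decay, 0.0, 1.0, 1.0, 100)
error = abs(approx - math.exp(-1.0))
```

Halving the step size should cut the error by roughly a factor of four, which is what "second order" means here.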

sample.py cleanup:

  • Removed _prepare_steering() helper and FKC-specific steering setup
  • Simplified generate_batch() to accept a generic denoiser callable
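The "generic denoiser callable" idea can be sketched with plain functions and functools.partial. Everything below is hypothetical: generate_batch_sketch, the toy denoisers, and their parameters are illustrative stand-ins, not the real generate_batch() signature.

```python
from functools import partial
from typing import Callable, List

# A denoiser is anything that maps a batch to a new batch.
Denoiser = Callable[[List[float]], List[float]]


def plain_denoiser(batch, scale=0.5):
    """Toy unsteered denoiser: shrink every value."""
    return [x * scale for x in batch]


def steered_denoiser(batch, scale=0.5, target=1.0, strength=0.1):
    """Toy steered denoiser: shrink, then pull each value toward a target."""
    return [x * scale + strength * (target - x * scale) for x in batch]


def generate_batch_sketch(batch, denoiser: Denoiser):
    """The sampler only needs a callable; it does not know whether it is steered."""
    return denoiser(batch)


unsteered = generate_batch_sketch([2.0, 4.0], plain_denoiser)
# Steering options are bound up front, so the sampler never sees them.
steered = generate_batch_sketch([2.0, 4.0], partial(steered_denoiser, strength=1.0))
```

Binding the steering options before the call is what lets one config describe both the solver and its steering, with no separate steering argument threaded through the sampler.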

Config cleanup:

  • Removed unused config/bioemu.yaml (never referenced by any Python code)
  • Removed config/denoiser/stochastic_dpm.yaml (superseded by dpm.yaml)

What's NOT in this PR

  • FKC denoiser (dpm_fkc.py) — will be in PR2
  • GuidedScore usage — removed here, will be re-added for FKC in PR2
  • Any changes to the FKC steering pipeline

YuuuXie and others added 5 commits April 26, 2026 12:49
…noiser

Split the monolithic steering.py into a steering/ package with:
- collective_variables.py: CollectiveVariable base, CaCaDistance, PairwiseClash
- potentials.py: Potential base, UmbrellaPotential
- utils.py: shared utilities (resampling, x0 prediction, reward computation)
- dpm_smc.py: SMC (Sequential Monte Carlo) steered denoiser

Also refactors denoiser.py with DPM-Solver++ helper functions and updates
sample.py to use the Hydra denoiser pattern.

This is the first half of the steering work from PR #203. FKC denoiser,
structure-reference CVs (RMSD, FractionNativeContacts), LinearPotential,
and example notebooks will follow in a separate PR.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Neither file is referenced in code. bioemu.yaml was a Hydra root config
but no @hydra.main or hydra.compose ever uses it. stochastic_dpm.yaml
was only referenced by bioemu.yaml.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Parametrize TestUmbrellaPotentialLossFn (5 methods → 1 parametrized + 2)
- Remove redundant test_two_equal_groups (duplicate of test_uniform_weights)
- Tighten loose thresholds: ESS tolerance 0.01→1e-5, 0.05→0.005,
  energy tolerance 1e-4→1e-6, stratified resample N-5→N-2
- Strengthen smoke tests: add gradient magnitude checks for CaCaDistance
  and antisymmetric projection, use clashing positions for PairwiseClash
- Verify unsteered log_weights are exactly zero (assert_close vs allclose)
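As background for the stratified-resampling threshold above, here is a minimal sketch of stratified resampling in pure Python. It is not the code in steering/utils.py; the function name and interface are assumptions.

```python
import bisect
import itertools
import random


def stratified_resample(weights, rng):
    """Return len(weights) particle indices drawn by stratified resampling.

    weights must be non-negative and sum to 1.
    """
    n = len(weights)
    cumulative = list(itertools.accumulate(weights))
    indices = []
    for i in range(n):
        # One uniform draw inside each stratum [i/n, (i+1)/n).
        u = (i + rng.random()) / n
        j = bisect.bisect_right(cumulative, u)
        indices.append(min(j, n - 1))  # guard against round-off at the top end
    return indices


rng = random.Random(0)
uniform_pick = stratified_resample([0.125] * 8, rng)
degenerate_pick = stratified_resample([0.0, 0.0, 1.0, 0.0], rng)
```

With exactly uniform weights every stratum maps to its own particle, so no particle is lost. That is why a test on near-uniform weights can require almost all N indices to survive.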

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Move all imports to top of each file
- Consolidate TestCaCaDistance into single test (shape + values + grad)
- Merge test_no_clash + test_offset_respects_separation into one test
- Move ESS and rotation gradient tests from test_denoisers.py to test_utils.py
  (they test utils functions, not denoiser); delete test_denoisers.py
- Remove TestPhysicalSteeringConfig (trivial YAML-loading tests)
- Move TestPotentialForwardBackward to test_potentials.py (where it belongs)
- Extract module-level fixtures (sdes, batch, score_model) in test_integration
- Add TestSmcDpmEquivalence: unsteered SMC vs DPM solver comparison,
  and steered SMC with ess_threshold=0

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Move all inline imports to module top in test_integration.py
- TestSmcDpmEquivalence: use noise=0 for deterministic comparison
  (atol=1e-4 vs previous rel_diff<0.5); add test that steered with
  ess_threshold=0 produces same particle set as unsteered
- TestResampleBasedOnLogWeights: consolidate 4 tests into parametrized
  fixture-based test
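The ess_threshold=0 test relies on the effective sample size (ESS) never dropping below 1. A minimal sketch of the ESS and a resampling gate follows. The gate condition (ESS below ess_threshold times the number of particles) is an assumption about how the threshold is interpreted, and the function names are hypothetical.

```python
import math


def normalized_weights(log_weights):
    """Softmax of the log-weights, shifted by the maximum for numerical stability."""
    m = max(log_weights)
    unnorm = [math.exp(lw - m) for lw in log_weights]
    total = sum(unnorm)
    return [w / total for w in unnorm]


def effective_sample_size(log_weights):
    """ESS = 1 / sum(w_i^2); ranges from 1 (one dominant particle) to N (uniform)."""
    w = normalized_weights(log_weights)
    return 1.0 / sum(x * x for x in w)


def should_resample(log_weights, ess_threshold):
    """Assumed gate: resample when ESS falls below a fraction of the particle count."""
    return effective_sample_size(log_weights) < ess_threshold * len(log_weights)


ess_uniform = effective_sample_size([0.0, 0.0, 0.0, 0.0])
ess_collapsed = effective_sample_size([0.0, -1000.0, -1000.0])
```

Because the ESS is at least 1, a threshold of 0 can never trigger resampling under this gate, so the particle set matches the unsteered run even when weights are being accumulated.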

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors BioEmu’s steering functionality into a dedicated bioemu.steering/ package, keeping the SMC (Sequential Monte Carlo) denoiser and related components modular and reusable, while updating sampling/config/docs/tests to use a single Hydra denoiser config for steered runs.

Changes:

  • Extract steering primitives into src/bioemu/steering/ (CVs, potentials, resampling/reward utilities, SMC DPM-Solver).
  • Simplify sampling and configs so steering is provided via a single denoiser_config YAML (no separate steering_config CLI/API argument).
  • Replace the old monolithic steering test with a split test suite (unit + lightweight integration + chignolin e2e).

Reviewed changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `tests/test_steering.py` | Removes legacy monolithic steering test file. |
| `tests/steering/__init__.py` | Adds steering test package marker. |
| `tests/steering/test_utils.py` | Adds unit tests for resampling/ESS/x0 helpers. |
| `tests/steering/test_potentials.py` | Adds unit tests for UmbrellaPotential and replacements for break/clash behavior. |
| `tests/steering/test_collective_variables.py` | Adds unit tests for CaCaDistance and PairwiseClash. |
| `tests/steering/test_integration.py` | Adds lightweight integration tests for SMC wiring and generate_batch pipeline via mocks. |
| `tests/steering/test_chignolin_e2e.py` | Adds end-to-end steering tests that run full sample() pipeline. |
| `src/bioemu/steering/__init__.py` | Defines steering public API re-exports (CVs, potentials, utils, SMC). |
| `src/bioemu/steering/collective_variables.py` | Introduces CV base class + CaCaDistance and PairwiseClash. |
| `src/bioemu/steering/potentials.py` | Introduces potential base class + UmbrellaPotential. |
| `src/bioemu/steering/utils.py` | Adds steering utilities: config validation, resampling, ESS, reward/grad plumbing, x0/R0 prediction. |
| `src/bioemu/steering/dpm_smc.py` | Adds SMC DPM-Solver denoiser loop and SMC step implementation. |
| `src/bioemu/steering.py` | Deletes old monolithic steering module. |
| `src/bioemu/sample.py` | Removes separate steering config plumbing; denoiser_config now fully configures steering via Hydra. |
| `src/bioemu/denoiser.py` | Extracts DPM-Solver helper primitives used by both unsteered DPM and SMC solver. |
| `src/bioemu/config/steering/physical_steering.yaml` | Converts steering config into a self-contained Hydra denoiser config targeting dpm_solver_smc. |
| `src/bioemu/config/denoiser/stochastic_dpm.yaml` | Removes unused/superseded stochastic DPM config. |
| `src/bioemu/config/bioemu.yaml` | Removes unused top-level config file. |
| `README.md` | Updates steering docs/CLI examples to use a single steering YAML passed as --denoiser_config. |
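The config validation mentioned for steering/utils.py, together with the range/type checks on num_particles, ess_threshold, and start/end added in a later commit, could look roughly like the sketch below. The accepted ranges (positive integer particle count, everything else in [0, 1]) and the function name are assumptions, not the checks actually implemented in validate_steering_config.

```python
from numbers import Real


def validate_steering_config_sketch(config):
    """Raise ValueError if a steering option has the wrong type or range.

    Assumed rules: num_particles is a positive int; ess_threshold, start
    and end are numbers in [0, 1]. Missing keys raise KeyError.
    """
    n = config["num_particles"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(n, bool) or not isinstance(n, int) or n < 1:
        raise ValueError(f"num_particles must be a positive integer, got {n!r}")

    for key in ("ess_threshold", "start", "end"):
        value = config[key]
        if (
            isinstance(value, bool)
            or not isinstance(value, Real)
            or not 0.0 <= value <= 1.0
        ):
            raise ValueError(f"{key} must be a number in [0, 1], got {value!r}")


# A hypothetical config that passes the assumed checks.
validate_steering_config_sketch(
    {"num_particles": 4, "ess_threshold": 0.5, "start": 0.9, "end": 0.1}
)
```

Failing early with a message that names the offending key is cheaper than discovering a bad value partway through a sampling run.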

Comment thread src/bioemu/steering/utils.py
Comment thread src/bioemu/steering/utils.py Outdated
Comment thread src/bioemu/steering/utils.py Outdated
Comment thread src/bioemu/steering/utils.py Outdated
Comment thread tests/steering/test_chignolin_e2e.py
Comment thread src/bioemu/steering/dpm_smc.py
ludwigwinkler previously approved these changes Apr 27, 2026
Comment thread src/bioemu/steering/collective_variables.py
Comment thread src/bioemu/steering/collective_variables.py
Comment thread src/bioemu/steering/utils.py
Comment thread src/bioemu/steering/utils.py
…nsolidation

- Add value range/type checks in validate_steering_config (num_particles,
  ess_threshold, start/end)
- Fix type hints: ChemGraph -> BatchType on resample_based_on_log_weights
- Convert resampling indices to CPU before list indexing (GPU perf fix)
- Use .reshape() instead of .view() for non-contiguous tensor safety
- Fix mutable default set() in dpm_solver_smc signature
- Remove unused ChemGraph import (fixes ruff F401)
- Consolidate e2e tests with pytest.mark.parametrize
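The mutable-default fix is a general Python pitfall, shown here with a self-contained toy. The function names are hypothetical and unrelated to the real dpm_solver_smc signature; whether the original default set was ever mutated is not stated in this PR.

```python
def record_bad(item, seen=set()):
    """Buggy: the default set is created once and shared by every call."""
    seen.add(item)
    return seen


def record_good(item, seen=None):
    """Fixed: use None as the sentinel and build a fresh set per call."""
    if seen is None:
        seen = set()
    seen.add(item)
    return seen


bad_first = record_bad("a")
bad_second = record_bad("b")     # also contains "a": state leaked between calls

good_first = record_good("a")
good_second = record_good("b")   # independent of the first call
```

Default values are evaluated once at function definition time, so any mutable default persists across calls unless it is replaced by a None sentinel.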

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
YuuuXie and others added 2 commits May 4, 2026 08:16
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
github-actions Bot commented May 4, 2026

Summary
Generated on: 05/04/2026 - 15:45:47
Parser: Cobertura
Assemblies: 5
Classes: 31
Files: 31
Line coverage: 88.2% (2036 of 2306)
Covered lines: 2036
Uncovered lines: 270
Coverable lines: 2306
Total lines: 7400
Covered branches: 0
Total branches: 0

Coverage

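src.bioemu - 88.9%

| Name | Line | Branch |
| --- | --- | --- |
| src.bioemu | 88.9% | |
| `__init__.py` | 100% | |
| `chemgraph.py` | 100% | |
| `convert_chemgraph.py` | 97.6% | |
| `denoiser.py` | 98.2% | |
| `get_embeds.py` | 92.1% | |
| `md_utils.py` | 85.8% | |
| `model_utils.py` | 78% | |
| `models.py` | 94.1% | |
| `run_hpacker.py` | 0% | |
| `sample.py` | 92.2% | |
| `sde_lib.py` | 86.6% | |
| `seq_io.py` | 100% | |
| `shortcuts.py` | 100% | |
| `sidechain_relax.py` | 75% | |
| `so3_sde.py` | 90.3% | |
| `structure_module.py` | 84.3% | |
| `utils.py` | 65.6% | |

src.bioemu.colabfold_inline - 79.1%

| Name | Line | Branch |
| --- | --- | --- |
| src.bioemu.colabfold_inline | 79.1% | |
| `__init__.py` | | |
| `features.py` | 100% | |
| `input_parsing.py` | 100% | |
| `model_runner.py` | 99% | |
| `msa_client.py` | 60.8% | |

src.bioemu.hpacker_setup - 69.2%

| Name | Line | Branch |
| --- | --- | --- |
| src.bioemu.hpacker_setup | 69.2% | |
| `__init__.py` | | |
| `setup_hpacker.py` | 69.2% | |

src.bioemu.steering - 89.8%

| Name | Line | Branch |
| --- | --- | --- |
| src.bioemu.steering | 89.8% | |
| `__init__.py` | 100% | |
| `collective_variables.py` | 89.6% | |
| `dpm_smc.py` | 100% | |
| `potentials.py` | 95.1% | |
| `utils.py` | 81.3% | |

src.bioemu.training - 100%

| Name | Line | Branch |
| --- | --- | --- |
| src.bioemu.training | 100% | |
| `foldedness.py` | 100% | |
| `loss.py` | 100% | |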

@YuuuXie YuuuXie merged commit 39958b4 into main May 4, 2026
7 checks passed
@YuuuXie YuuuXie deleted the yuxie1/steering-refactor branch May 4, 2026 16:12