Debug only Matrix resizing bug. Tc threaded solver revamp #362
Merged
- Change accumulators from 5 to 3
- Add shared-walk-and-accumulation-thread: true
- Update README to reflect 3 accumulators
…=true
The multi-thread-per-rank configuration (3+ walkers/accumulators with shared-walk-and-accumulation-thread) triggers a code bug causing expansion order explosion and chemical potential divergence. shared=false is being removed from the codebase. The only viable working configuration is:
- walkers: 1
- accumulators: 1
- shared-walk-and-accumulation-thread: true
- MPI parallelism across ranks (mpiexec -n 4) for workstation runs
…_spin_index
Tests cover:
- Empty configuration returns 0
- All-annihilatable config returns size() sentinel
- Manually marking first entry non-annihilatable returns 0
- Inserting non-interacting vertex is found correctly
Uses the existing G0Setup test fixture with bilayer lattice and StubRng.
- Fix critical memory-corruption bug in the Matrix::resize debug path: the column-major indexing was using a row-major stride (i * new_size.second + j instead of i + j * leadingDimension()), causing zeros to be written to the wrong locations and corrupting existing matrix data. Also expanded the zeroing to cover all newly exposed elements, not just the bottom-right corner. In CT-AUX, this corruption produced incorrect determinant ratios, which caused the expansion order (number of vertices k) to spike to nonphysical values, eventually grinding the solver to a halt.
- Update tutorial input templates (input_sp.json.in, input_tp.json.in) to use max-submatrix-size=256 instead of 16 for reasonable performance.
- Regenerate all preconfigured tutorial inputs from the updated templates.
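The indexing difference described in the commit message above can be sketched in isolation. This is a minimal illustration, not the DCA++ Matrix class: `ColMajorMatrix`, `growAndZero`, and `rowMajorOffset` are hypothetical names, and the sketch assumes the matrix grows inside an existing allocation whose leading dimension is larger than the logical row count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column-major matrix with a leading dimension.
// It is not the DCA++ Matrix class; names and layout are assumptions.
struct ColMajorMatrix {
  int rows, cols;            // logical size currently in use
  int ld, capacity_cols;     // allocated rows (leading dimension) and columns
  std::vector<double> data;  // ld * capacity_cols elements, column after column

  // Every allocated element starts at 1.0 so stray zeros are easy to spot.
  ColMajorMatrix(int r, int c, int ld_, int cap_cols)
      : rows(r), cols(c), ld(ld_), capacity_cols(cap_cols),
        data(static_cast<std::size_t>(ld_) * cap_cols, 1.0) {}

  // Correct column-major offset of element (i, j).
  std::size_t offset(int i, int j) const {
    return static_cast<std::size_t>(i) + static_cast<std::size_t>(j) * ld;
  }

  // Grow within the existing allocation and zero every newly exposed element:
  // the new rows of old columns and all rows of new columns.
  void growAndZero(int new_rows, int new_cols) {
    assert(new_rows >= rows && new_cols >= cols);
    assert(new_rows <= ld && new_cols <= capacity_cols);
    for (int j = 0; j < new_cols; ++j) {
      const int first_new_row = (j < cols) ? rows : 0;
      for (int i = first_new_row; i < new_rows; ++i)
        data[offset(i, j)] = 0.0;
    }
    rows = new_rows;
    cols = new_cols;
  }
};

// The row-major style offset named in the commit message as the bug,
// kept here only for comparison with ColMajorMatrix::offset.
inline std::size_t rowMajorOffset(int i, int j, int new_cols) {
  return static_cast<std::size_t>(i) * new_cols + j;
}
```

With a leading dimension of 4, growing a 2x2 matrix to 3x3 exposes element (1, 2). The row-major formula gives offset 1 * 3 + 2 = 5 for it, which in column-major storage is the existing element (1, 1), so a zero written there would overwrite live data.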
Update both input templates (input_sp.json.in, input_tp.json.in) to use 4 walkers and 4 accumulators with shared-walk-and-accumulation-thread: true. Regenerate all preconfigured tutorial inputs from the updated templates.
maierta
approved these changes
Jun 3, 2026
Fixed a rather devious Debug-build-only bug in the Matrix::resize method. The incorrect zeroing of matrix elements in a corner case of matrix growth resulted in unstable k expansions because of how this affected vertex additions in CT-AUX. The rather ominous comment on resizing in general still seems to hold; it is long-standing and should become an issue.
In the course of debugging I found no unit tests for dca::phys::solver::ctaux::CT_AUX_HS_configuration and some odd behavior with respect to indexing (see get_first_non_interacting_spin_index()). I just generated unit tests to capture the current behavior. Right now there is only one caller, n_tools.hpp, and it does the right thing, but the size check seems to serve more than one purpose in that code.
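A generic sketch of a "first index or size() sentinel" lookup may make the size check easier to follow. It is not the CT_AUX_HS_configuration code: `Entry`, `firstFlaggedIndex`, and `isValidIndex` are hypothetical names, and the behavior is assumed only from the test cases listed in the commit messages (an empty container yields 0, and a container with no match yields size()).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one entry of a configuration; not the DCA++ type.
struct Entry {
  bool flagged;  // the property being searched for (an assumption)
};

// Returns the index of the first flagged entry, or entries.size() if none.
// For an empty container the result is 0, which equals size().
inline std::size_t firstFlaggedIndex(const std::vector<Entry>& entries) {
  for (std::size_t i = 0; i < entries.size(); ++i)
    if (entries[i].flagged) return i;
  return entries.size();
}

// Caller-side check: the result is usable only if it is a valid index.
inline bool isValidIndex(const std::vector<Entry>& entries, std::size_t index) {
  return index < entries.size();
}
```

Under these assumptions a returned 0 is ambiguous on its own: it can mean "first entry matches" or "container is empty". A caller that compares the result against size() handles both the not-found case and the empty case with one check, which may be why the check appears to serve more than one purpose.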
This PR is part of a series to return the tc tutorial to a working and easy to run state.