Symmetrization M0: add characterization harness, capture multi-band bug#366
Symmetrization M0: add characterization harness, capture multi-band bug#366tylersax wants to merge 3 commits into
Conversation
Add a per-(model, point-group) characterization harness for single-particle cluster symmetrization, generalizing the existing identity-only symmetrize_test onto real point groups. It is the regression net for upcoming Hamiltonian- derived symmetrization work. Oracle: G0 = [iw + mu - H0]^-1 is deterministic and exactly invariant under every symmetry of H0, so symmetrizing G0 must be a no-op (asserted to 1e-12). Note: this harness flagged a failure the first time it ran, and we are capturing that failure on purpose. Production Symmetrize::execute is not a no-op on the three-band Hubbard model with D4 symmetry. Given the exact, deterministic oracle, the failure is treated as an existing defect in the imposition. Rather than commit a red test, the observed failures are encoded as the expected result. This keeps CI green and makes the test fixture both the record of the bug and the gate for its fix: when the imposition is corrected the assertion flips (forcing the test to be cleaned up). Structure: four tests per case, one binary per case selected via CMake TEST_DEFINES. square/D4 and singleband-chain are clean green baselines; three-band/D4 carries the pinned failure above.
| // Domain singletons are process-global and must be initialized exactly once, even | ||
| // though gtest builds a fresh fixture per test. This static guard lets multiple | ||
| // TYPED_TESTs share one init in a single binary -- tp_accumulator_gpu_test.cpp hits | ||
| // the same singleton constraint but works around it by only ever running one case | ||
| // per binary. reset() is inside the guard because only the first fixture's members | ||
| // were constructed before update_domains() ran (so only they are mis-sized); the H0/ | ||
| // H_symmetry fills are outside because every fixture gets fresh, zero-valued members. | ||
| static bool domain_initialized = false; | ||
| if (!domain_initialized) { | ||
| parameters_.update_domains(); | ||
| H0_.reset(); | ||
| H_symmetry_.reset(); | ||
| domain_initialized = true; | ||
| } |
There was a problem hiding this comment.
@PDoakORNL - want to get your eyes on this approach. I think it works within your test conventions, but please have a look and make sure this passes the sniff test.
There was a problem hiding this comment.
In case you didn't know GoogleTest also calls Setup() and TearDown() methods on each test fixture object. That said I think constructor works fine here and would probably work for some places we use a Setup method for a test fixture class in the codebase.
| // the deterministic G0, suggesting buggy behavior in how multi-band symmetry | ||
| // is imposed. These flip to green when the next milestone in the symmetrization | ||
| // project rewrites the imposition. | ||
| static std::vector<std::string> expectedFailingReps() { return {"k_iw", "r_iw", "r_tau"}; } |
There was a problem hiding this comment.
@PDoakORNL these are the operations that fail the test as written. I believe this is a genuine defect (not the least because it points to a \\FIXME comment in prod), but in order to keep this PR test-only, I'm encoding the failures as expected, and will pull them out once I swing back to the imposition code. Let me know what you think.
There was a problem hiding this comment.
I think this is fine, it would be nicer if we could just expect fail them at a high level so they "counted" as a Failing test but didn't cause a CI failure. Our testing framework doesn't have that feature though.
PDoakORNL
left a comment
There was a problem hiding this comment.
I think this is actually a nice example of an integration test so I'd move it to test/integration/phys/symmetrization
Which will be a new category of integration test for us.
So the only thing I see right now is move this.
Generally I'd say for unit tests you need a class or function(in which case I'd treat the filename as the class for purposes of naming the test) that you are trying to test as standalone as possible (within reason).
| # DCA domain + cluster-symmetry singletons are keyed on dimension/role, not on the | ||
| # lattice, so cases of the same dimension cannot coexist in one process. Each case | ||
| # is selected from the shared source via THIS_TEST_DEFINES (CMake TEST_DEFINES). | ||
| dca_add_gtest(symmetrize_characterization_square_D4_test |
There was a problem hiding this comment.
Of course it would be nice if these could be one typed test executable but I think you are correct that it is problematic with the code's singleton use. I'd also like tests to have more minimal linking but I think at the integration level with the general state of the targets in the code just using ${DCA_LIBS} here with the tight timeline is ok.
| // the deterministic G0, suggesting buggy behavior in how multi-band symmetry | ||
| // is imposed. These flip to green when the next milestone in the symmetrization | ||
| // project rewrites the imposition. | ||
| static std::vector<std::string> expectedFailingReps() { return {"k_iw", "r_iw", "r_tau"}; } |
There was a problem hiding this comment.
I think this is fine, it would be nicer if we could just expect fail them at a high level so they "counted" as a Failing test but didn't cause a CI failure. Our testing framework doesn't have that feature though.
| // Domain singletons are process-global and must be initialized exactly once, even | ||
| // though gtest builds a fresh fixture per test. This static guard lets multiple | ||
| // TYPED_TESTs share one init in a single binary -- tp_accumulator_gpu_test.cpp hits | ||
| // the same singleton constraint but works around it by only ever running one case | ||
| // per binary. reset() is inside the guard because only the first fixture's members | ||
| // were constructed before update_domains() ran (so only they are mis-sized); the H0/ | ||
| // H_symmetry fills are outside because every fixture gets fresh, zero-valued members. | ||
| static bool domain_initialized = false; | ||
| if (!domain_initialized) { | ||
| parameters_.update_domains(); | ||
| H0_.reset(); | ||
| H_symmetry_.reset(); | ||
| domain_initialized = true; | ||
| } |
There was a problem hiding this comment.
In case you didn't know GoogleTest also calls Setup() and TearDown() methods on each test fixture object. That said I think constructor works fine here and would probably work for some places we use a Setup method for a test fixture class in the codebase.
|
Good call, this makes a lot more sense as integration. Updated (with local re-test passing). |
Add a per-(model, point-group) characterization harness for single-particle cluster symmetrization, generalizing the existing identity-only symmetrize_test onto real point groups. It is the regression net for upcoming Hamiltonian- derived symmetrization work.
Oracle: G0 = [iw + mu - H0]^-1 is deterministic and exactly invariant under every symmetry of H0, so symmetrizing G0 must be a no-op (asserted to 1e-12).
Note: this harness flagged a failure, and it captures that failure on purpose. Production
Symmetrize::executeis not a no-op on the three-band Hubbard model with D4 symmetry. Given the exact, deterministic oracle, the failure is treated as an existing defect in the imposition. Rather than commit a red test, the observed failures are encoded as the expected result. So, read this test fixture as both the record of the bug and the gate for its fix: when the imposition is corrected, the assertion flips (forcing the test to be cleaned up).Structure: Four tests per case, one binary per case selected via CMake TEST_DEFINES.
Building & running
cmake -S . -B build -DDCA_WITH_TESTS_FAST=ON cmake --build build -j --target \ symmetrize_characterization_square_D4_test \ symmetrize_characterization_threeband_D4_test \ symmetrize_characterization_singleband_chain_test ctest --test-dir build -R symmetrize_characterization --output-on-failureExpected result
All three binaries PASS, including the three-band case.
The
threeband_D4binary prints per-representation diagnostics that include lines like:These
FAILlines are the captured-baseline residuals.