Race condition with a threaded test with matrix on? #3360

@ekluzek

Brief summary of bug

In ctsm5.3.065, the test

ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.clm-ciso--clm-matrixcnOn_ignore_warnings

failed at runtime, but it passed after I resubmitted it.

This issue is just to raise awareness that this is a potential problem and that a simple resubmit resolves it.

General bug information

CTSM version you are using: ctsm5.3.064-72-g4037e5a8d

Does this bug cause significantly incorrect results in the model's science? no

Configurations affected: threading with soil matrix on

Details of bug

The run dies in a SHR_ASSERT statement that checks that the two sparse matrices being multiplied together are the same size.
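
To illustrate the kind of check that is failing, here is a minimal Python sketch of a size-consistency assertion guarding a sparse matrix multiply. It is not the code in SparseMatrixMultiplyMod.F90: the `SparseMatrix` class, the `assert_same_size` and `multiply` functions, and the dictionary storage are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SparseMatrix:
    """Hypothetical stand-in: a square sparse matrix with a declared size."""
    size: int      # declared dimension of the (square) matrix
    entries: dict  # maps (row, col) -> value for the nonzero entries


def assert_same_size(a, b):
    """Abort if the two operands disagree on size (loosely analogous to a SHR_ASSERT)."""
    if a.size != b.size:
        raise AssertionError(f"matrix size mismatch: {a.size} != {b.size}")


def multiply(a, b):
    """Multiply two sparse matrices after checking that their sizes agree."""
    assert_same_size(a, b)
    out = {}
    for (i, k), va in a.entries.items():
        for (k2, j), vb in b.entries.items():
            if k == k2:
                # Accumulate the contribution a[i,k] * b[k,j] into out[i,j]
                out[(i, j)] = out.get((i, j), 0.0) + va * vb
    return SparseMatrix(a.size, out)
```

If the stored size of one operand were changed by another thread between setup and the check, an assertion like this could fail intermittently, which would be consistent with a failure that goes away on resubmit. That mechanism is an assumption; the log only shows that the assertion fired.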

Important output or errors that show the problem

cesm.log:

dec0933.hsn.de.hpc.ucar.edu 63:  mosart decomp info proc =        63 begr =    255151 endr =    259200 numr =      4050
dec0933.hsn.de.hpc.ucar.edu 21:  ERROR in SparseMatrixMultiplyMod.F90 at line 973
dec0933.hsn.de.hpc.ucar.edu 21: Image              PC                Routine            Line        Source
dec0933.hsn.de.hpc.ucar.edu 21: cesm.exe           0000000004842AF1  shr_abort_mod_mp_         110  shr_abort_mod.F90
dec0933.hsn.de.hpc.ucar.edu 21: cesm.exe           0000000004842A83  shr_abort_mod_mp_          65  shr_abort_mod.F90
dec0933.hsn.de.hpc.ucar.edu 21: cesm.exe           0000000004843A06  shr_assert_mod_mp          95  shr_assert_mod.F90.in
dec0933.hsn.de.hpc.ucar.edu 21: cesm.exe           0000000004843E0A  shr_assert_mod_mp         112  shr_assert_mod.F90.in
dec0933.hsn.de.hpc.ucar.edu 21: cesm.exe           0000000003544EA7  sparsematrixmulti         973  SparseMatrixMultiplyMod.F90
dec0933.hsn.de.hpc.ucar.edu 21: cesm.exe           0000000001629F29  cnsoilmatrixmod_m         614  CNSoilMatrixMod.F90
dec0933.hsn.de.hpc.ucar.edu 21: cesm.exe           0000000003F2B6D6  cndrivermod_mp_cn        1101  CNDriverMod.F90
dec0933.hsn.de.hpc.ucar.edu 21: cesm.exe           0000000001A77D09  cnvegetationfacad        1125  CNVegetationFacade.F90
dec0933.hsn.de.hpc.ucar.edu 21: cesm.exe           0000000000AA2FA1  clm_driver_mp_clm        1147  clm_driver.F90

Labels

bug (something is working incorrectly)
closed: wontfix (We won't fix this issue, because it would be too difficult and/or isn't important enough to fix)
priority: low (Background task that doesn't need to be done right away.)
