Small updates to output variables, addition of FATES_FRACTION variable#854

Merged: glemieux merged 10 commits into NGEET:master from adrifoster:history_interface_ilamb on Jul 15, 2022

@adrifoster

Contributor

This updates a few output variable names and metadata based on some typos that were found, plus adds an additional variable FATES_FRACTION which is the fraction of the HLM gridcell occupied by FATES.

Description:

We create an hio_fates_fraction_si variable, which is set to 1.0; on non-FATES columns it will be zero (see below). The gridcell average will then be the total gridcell FATES fraction. (see here)
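The averaging idea can be illustrated with a small sketch. This is not the FATES or HLM code: the function name, the column weights, and the three-column gridcell are all hypothetical, and the sketch assumes the host model averages column values weighted by their area fraction of the gridcell.

```python
def gridcell_average(values, weights):
    """Weighted mean of per-column values; weights are column area fractions."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical gridcell with three columns: two run FATES, one does not.
column_weights = [0.5, 0.25, 0.25]

# 1.0 on FATES columns, 0.0 on the non-FATES column (the flushed value).
fates_fraction_si = [1.0, 1.0, 0.0]

fates_fraction = gridcell_average(fates_fraction_si, column_weights)
print(fates_fraction)  # 0.75
```

With these made-up weights, the average equals the share of the gridcell covered by FATES columns, which is the quantity FATES_FRACTION is meant to report.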

Because of our previous history interface update which flushes all FATES variables to the hlm_hio_ignore_val, we needed to flush this specific variable to zero for this method to work. I added an optional flush_to_zero argument to the set_history_var subroutine (here) which will prompt the subroutine to flush that variable to zero.
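As a rough illustration of the optional argument, the sketch below mimics the behavior described above in Python rather than Fortran. The function name, the list-based buffer, and the numeric ignore value are assumptions made for this example; only the `flush_to_zero` name and the idea of a default ignore value come from the description.

```python
# Hypothetical stand-in for hlm_hio_ignore_val; the real value is set by the host model.
HLM_HIO_IGNORE_VAL = -1.0e36


def flush_history_var(n_sites, flush_to_zero=False, ignore_val=HLM_HIO_IGNORE_VAL):
    """Return a freshly flushed buffer for one history variable.

    By default every entry is set to the ignore value; a variable registered
    with flush_to_zero=True starts from 0.0 instead.
    """
    flush_value = 0.0 if flush_to_zero else ignore_val
    return [flush_value] * n_sites


default_buffer = flush_history_var(3)
fraction_buffer = flush_history_var(3, flush_to_zero=True)
print(default_buffer)   # three copies of the ignore value
print(fraction_buffer)  # [0.0, 0.0, 0.0]
```

Keeping the argument optional means existing variable registrations keep their current flush behavior, and only the fraction variable opts in to zero.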

Collaborators:

@ckoven

Expectation of Answer Changes:

None; the only change should be an additional variable and some small changes to variable names and metadata.

Checklist:

  • My change requires a change to the documentation.
  • I have updated the in-code documentation .AND. (the technical note .OR. the wiki) accordingly.
  • I have read the CONTRIBUTING document.
  • FATES PASS/FAIL regression tests were run
  • If answers were expected to change, evaluation was performed and provided

Test Results:

CTSM (or) E3SM (specify which) test hash-tag:

CTSM (or) E3SM (specify which) baseline hash-tag:

FATES baseline hash-tag:

Test Output:

Comment thread main/FatesHistoryInterfaceMod.F90
@glemieux

glemieux commented Jun 1, 2022

Contributor

I'm planning on coordinating this PR with ESCOMP/CTSM#1515 to update the history variable names in the ctsm test mods.

@adrifoster adrifoster left a comment

Contributor Author

Good catch!!! Thank you!!

@glemieux

glemieux commented Jun 1, 2022

Contributor

Aside from the expected NLCOMP and FIELDLIST DIFFs, all expected fates tests pass b4b.

  • cheyenne: /glade/u/home/glemieux/scratch/ctsm-tests/tests_pr854_ctsm1515-fates
  • izumi: /scratch/cluster/glemieux/ctsm-tests/tests_pr854_ctsm1515-fates

UPDATE: ERS_D_Ld5.1x1_brazil.I2000Clm50FatesCruRsGs.izumi_nag.clm-FatesColdDefHydro was incorrectly marked as expected failure. See ESCOMP/CTSM#1525 for further details. I should make certain that this PR wasn't the cause of the "new" COMPARE_base_rest failure mode, however unlikely.

Test results for aux_clm will be reported on ESCOMP/CTSM#1515

@glemieux

glemieux commented Jun 2, 2022

Contributor

The failure of the izumi version of ERS_D_Ld5.1x1_brazil.I2000Clm50FatesCruRsGs.izumi_nag.clm-FatesColdDefHydro is indeed due to this PR. Reviewing the DIFF shows that it is failing with an error similar to #701 (note that the SCPF suffix was recently changed to SZPF):

 FATES_ERRH2O_SZPF   (lndgrid,fates_levscpf,time)  t_index =      6     6
          2      156  (     0,    41,     1) (     0,    54,     1) (     0,    41,     1) (     0,    28,     1)
                 156   3.258187497579002E-09  -3.006245396560824E-16 3.3E-09  3.258187497579002E-09 1.3E-02  5.553617565823288E-10
                 156   2.259984064920486E-15  -3.006245396560824E-16          2.259984064920486E-15          3.389469836376067E-17
                 156  (     0,    41,     1) (     0,    54,     1)
          avg abs field values:    2.444583598049110E-11    rms diff: 2.6E-10   avg rel diff(npos):  1.3E-02
                                   2.229240666610201E-17                        avg decimal digits(ndif):  0.0 worst:  0.0
 RMS FATES_ERRH2O_SZPF                2.6463E-10            NORMALIZED  2.1650E+01
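For readers unfamiliar with this kind of report, here is a minimal sketch of the two checks it summarizes: whether two fields are bit-for-bit identical, and how large the RMS difference is when they are not. It assumes the comparison is between a base run and a restart run of the same case, and it does not reproduce the comparison tool's masking or its normalization; the function names and sample values are hypothetical.

```python
import math
import struct


def is_bit_for_bit(base, rest):
    """True when both fields hold identical IEEE 754 double bit patterns."""
    return len(base) == len(rest) and all(
        struct.pack("<d", b) == struct.pack("<d", r) for b, r in zip(base, rest)
    )


def rms_diff(base, rest):
    """Root-mean-square difference over all elements (no masking applied)."""
    return math.sqrt(sum((b - r) ** 2 for b, r in zip(base, rest)) / len(base))


# Made-up values standing in for one history field from each run.
base_field = [1.0, 2.0, 3.0, 4.0]
rest_field = [1.0, 2.0, 3.0, 6.0]

print(is_bit_for_bit(base_field, base_field))  # True
print(is_bit_for_bit(base_field, rest_field))  # False
print(rms_diff(base_field, rest_field))        # 1.0
```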

Interestingly, this isn't an issue with the intel version on Cheyenne (which is where the original issue was discovered). I will run this test on the intel compiler on izumi to rule out machine differences.

@glemieux

glemieux commented Jun 2, 2022

Contributor

The intel version of this test was a bust. It's failing out very early due to some ESMF io errors. Output is here:
/scratch/cluster/glemieux/ctsm-tests/tests_pr854_ctsm1515-comparebasecheck-intel.
@ekluzek is this something specific with izumi and esmf?

This set of changes allows the harvesting module to pass products back to the host. API modifications are still required in CLM, but this feature should work fully with ELM.
@glemieux

glemieux commented Jul 6, 2022

Contributor

Aside from the expected NLCOMP and FIELDLIST differences, nearly all tests are passing b4b:

  • Izumi: /scratch/cluster/glemieux/ctsm-tests/tests_pr854-fates
  • Cheyenne: /glade/u/home/glemieux/scratch/ctsm-tests/tests_pr854_fates

The one test not b4b is failing to run on Izumi:
ERS_D_Ld5.1x1_brazil.I2000Clm50FatesCruRsGs.izumi_nag.clm-FatesColdDefHydro.GC.pr854-fates_nag
The error report is:

[0] Runtime Error: *** Arithmetic exception: Floating overflow - aborting
[0] /home/glemieux/ctsm/src/main/ncdio_pio.F90.in, line 2039: Error occurred in NCDIO_PIO:NCD_IO_2D_DOUBLE
[0] /home/glemieux/ctsm/src/main/histFileMod.F90, line 3581: Called by HISTFILEMOD:HFIELDS_WRITE
[0] /home/glemieux/ctsm/src/main/histFileMod.F90, line 4099: Called by HISTFILEMOD:HIST_HTAPES_WRAPUP
[0] /home/glemieux/ctsm/src/main/clm_driver.F90, line 1440: Called by CLM_DRIVER:CLM_DRV
[0] /home/glemieux/ctsm/src/cpl/nuopc/lnd_comp_nuopc.F90, line 893: Called by LND_COMP_NUOPC:MODELADVANCE
[0] /home/glemieux/ctsm/components/cmeps/cime_config/../cesm/driver/esmApp.F90, line 141: Called by ESMAPP
[0] [i041.cgd.ucar.edu:mpi_rank_0][error_sighandler] Caught error: Aborted (signal 6)

UPDATE: After writing out the variable name during the above routine calls, it looks like the issue is again with FATES_ERRH2O_SZPF.

@glemieux

glemieux commented Jul 7, 2022

Contributor

@rgknox the issue appears to be with ccohort_hydr%errh2o. For certain iscpf indices this variable is in the E+180 range and higher, which I think is causing the overflow. Writing out the intermediate variables that go into calculating the error, I don't see anything approaching those values. My guess is that, since errh2o isn't initialized to any particular value, some cohort is not having this value calculated and is then using a random garbage value. Thoughts?
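A diagnostic of the kind described here could look like the following sketch, which scans a flattened size-class by PFT array for entries that are non-finite or implausibly large. The function name and sample values are hypothetical, and using the single-precision maximum as the limit is an assumption about where the overflow might occur, not something established in this thread.

```python
import math

# Largest finite IEEE 754 single-precision value. Choosing this as the limit
# is an assumption for illustration only.
FLOAT32_MAX = 3.4028234663852886e38


def find_suspect_bins(values, limit=FLOAT32_MAX):
    """Return (index, value) pairs that are non-finite or exceed the limit."""
    return [
        (i, v)
        for i, v in enumerate(values)
        if not math.isfinite(v) or abs(v) > limit
    ]


# Made-up bins: small residual errors plus one garbage-sized entry.
errh2o_bins = [3.2e-09, -3.0e-16, 1.7e180, 2.2e-15]

suspects = find_suspect_bins(errh2o_bins)
print(suspects)  # [(2, 1.7e+180)]
```

Reporting the index alongside the value makes it easier to trace a suspect bin back to the cohort that contributed it.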

@glemieux glemieux linked an issue Jul 8, 2022 that may be closed by this pull request
@glemieux

glemieux commented Jul 9, 2022

Contributor

After retesting with 9d9c192 applied, the ERS hydro test now passes on Izumi with the nag compiler, and the change also fixes #701. The other Izumi hydro test is no longer b4b against the latest baseline due to this update. All Cheyenne tests have the same results as noted above in #854 (comment).

File locations:

  • Izumi: /scratch/cluster/glemieux/ctsm-tests/tests_pr854-fates2
  • Cheyenne: /glade/u/home/glemieux/scratch/ctsm-tests/tests_pr854-fates-errh20fix

@glemieux glemieux merged commit def6b3e into NGEET:master Jul 15, 2022
@adrifoster adrifoster deleted the history_interface_ilamb branch May 10, 2023 19:49

Development

Successfully merging this pull request may close these issues.

FATES_ERRH2O_SCPF fails COMPARE_base_rest

4 participants