Skip to content

Merge ctsm1.0.dev105 tag into fates_main_api #1137

Merged
rgknox merged 701 commits into
ESCOMP:fates_main_apifrom
glemieux:fma-merge-ctsm1.0.dev105
Oct 8, 2020
Merged

Merge ctsm1.0.dev105 tag into fates_main_api #1137
rgknox merged 701 commits into
ESCOMP:fates_main_apifrom
glemieux:fma-merge-ctsm1.0.dev105

Conversation

@glemieux

@glemieux glemieux commented Sep 1, 2020

Copy link
Copy Markdown
Contributor

Description of changes

This PR brings the fates_main_api branch up to date with ctsm1.0.dev105 tag to incorporate fixes to address #1125.

Specific notes

Contributors other than yourself, if any:

CTSM Issues Fixed (include github issue #): #1125

Are answers expected to change (and if so in what way)? Only fates outputs expected to change relative to ctsm1.0dev.105 tag

Any User Interface Changes (namelist or namelist defaults changes)? No

Testing performed, if any:
aux_clm suite outputs: /glade/u/home/glemieux/scratch/ctsm-tests/tests_ctsmdev105-aux_clm
fates suite outputs: /glade/u/home/glemieux/scratch/ctsm-tests/tests_ctsmdev105-merge

NOTE: Be sure to check your coding style against the standard
(https://github.com/ESCOMP/ctsm/wiki/CTSM-coding-guidelines) and review
the list of common problems to watch out for
(https://github.com/ESCOMP/CTSM/wiki/List-of-common-problems).

billsacks and others added 30 commits May 20, 2020 14:17
I think it's really ESCOMP#203 that's causing this failure, not
ESCOMP#158.
Update cime and cmeps externals; rework initialization of CNFire object

Main change is to update cime and cmeps externals. cime is now
essentially at the version used in cesm2_2_beta05 (but with one change
backed out); cmeps is at latest master.

Also, reworks initialization of CNFire object to avoid occasional
segmentation faults.
Generated from this branch by running:

make -f Makefile.data crop-smallville

after first building mksurfdata_map using an environment loaded from a
recent cheyenne_intel case.
It was identical to coldStart
This was the intent when I first created the test: The point of this
test is to test crop restart shortly after cold start. It later
accidentally turned into a test that used spun-up initial conditions
(when we changed just about everything to use spun-up initial
conditions); this commit restores it to being cold start.
I particularly want isotopes to be on for this test
Although gindex_ocn could be intent(out), intel18.0.3 generates a
runtime segmentation fault in runs that don't have this argument present
when this is declared intent(out). (It works fine on intel 19.0.2 when
declared as intent(out).)

Resolves ESCOMP#930
BFB according to the following tests:
ERS_Ly5_P144x1.f10_f10_musgs.IHistClm50BgcCrop.cheyenne_intel.clm-cropMonthOutput
ERP_D_Ld5.f10_f10_musgs.IHistClm50BgcCrop.cheyenne_intel.clm-allActive
ERS_D_Ld6.f10_f10_musgs.I1850Clm45BgcCrop.cheyenne_intel.clm-clm50CMIP6frc
clm_short test suite

Parameters moved:
prh30 (was hardcoded as 0.7 in CNFireLi2016Mod)
ignition_efficiency (was hardcoded as 0.22 in CNFireLi2016Mod)
tpu25ratio (was hardcoded as 0.167 in PhotosynthesisMod)
kp25ratio (was hardcoded as 20000 in PhotosynthesisMod)
bsw_adjustfactor (set to 1.0, did not previously exist in SoilStateInitTimeConstMod)
hksat_adjustfactor (set to 1.0, did not previously exist in SoilStateInitTimeConstMod)
sucsat_adjustfactor (set to 1.0, did not previously exist in SoilStateInitTimeConstMod)
watsat_adjustfactor (set to 1.0, did not previously exist in SoilStateInitTimeConstMod)
Gridcell-level balance checks for carbon and nitrogen

Bracket the model's time-step loop to calculate balance checks at the
gridcell level because there are terms in the carbon and nitrogen cycles
not accounted for at the column level. Balance checks at the column
level remain unchanged for now.

Resolves ESCOMP#314
Resolved Conflicts:
	src/biogeochem/CNProductsMod.F90
Resolves ESCOMP#1021 (at least, I think it will: I'm about to test
it)
Add two bioenergy crops (switchgrass and miscanthus)

Main change is from Yanyan Cheng: adding parameters and code for two
bioenergy crops, switchgrass and miscanthus. Along with this, there is a
new potential flux from crop leaves and stems to the crop product pool
at harvest, representing biofuel products; a new pft-specific parameter
controls this flux (biofuel_harvfrac). Currently, the out-of-the-box
surface datasets do not specify any area for these crops, but the new
parameter file will allow them to be present if specified on the surface
dataset or landuse_timeseries file. Note that this is only an option for
CLM5.0, NOT for CLM4.5. (See ESCOMP#884 for
details.)

Also, some minor fixes from Bill Sacks:
- Resolves ESCOMP#203 - Fixes creation of harvest-related variables
  on surface datasets created with the all_veg option - smallville,
  PTCLM, etc. (documented in ESCOMP#1019)
- Resolves ESCOMP#930 - Makes gindex_ocn intent(inout) rather than
  intent(out)
- Resolves ESCOMP#1021 - Changes SSP test to only do symlink if
  needed
Gridcell-level balance checks for carbon and nitrogen

Bracket the model's time-step loop to calculate balance checks at the
gridcell level because there are terms in the carbon and nitrogen cycles
not accounted for at the column level. Balance checks at the column
level remain unchanged for now.
At this point, it parses arguments and fills out the machine template
files, then stops.
…l for C24 and C96, also just use a generic decStart test and remove the special C96 test for running a Sp case fo rC96
This will be helpful for the lilac build script.

I have not yet tested this thoroughly!
Add two bioenergy crops (switchgrass and miscanthus)

Main change is from Yanyan Cheng: adding parameters and code for two
bioenergy crops, switchgrass and miscanthus. Along with this, there is a
new potential flux from crop leaves and stems to the crop product pool
at harvest, representing biofuel products; a new pft-specific parameter
controls this flux (biofuel_harvfrac). Currently, the out-of-the-box
surface datasets do not specify any area for these crops, but the new
parameter file will allow them to be present if specified on the surface
dataset or landuse_timeseries file. Note that this is only an option for
CLM5.0, NOT for CLM4.5. (See ESCOMP#884 for
details.)

Also, some minor fixes from Bill Sacks:
- Resolves ESCOMP#203 - Fixes creation of harvest-related variables
  on surface datasets created with the all_veg option - smallville,
  PTCLM, etc. (documented in ESCOMP#1019)
- Resolves ESCOMP#930 - Makes gindex_ocn intent(inout) rather than
  intent(out)
- Resolves ESCOMP#1021 - Changes SSP test to only do symlink if
  needed
This isn't yet fully working, but it's close
Resolved Conflicts:
	src/main/clm_initializeMod.F90
This was giving an error because ESMFMKFILE is no longer defined in the
environment on bishorn. Rather than fixing it, I'm stopping writing this
for now. My plan is to rework how this is written, writing it from the
top-level build_ctsm script instead of from within buildlib.
@glemieux

glemieux commented Sep 1, 2020

Copy link
Copy Markdown
Contributor Author

For the aux_clm suite, currently only one test is still PENDing in the Cheyenne queue. All other tests pass B4B with the exception of two FAILs:

  • LILACSMOKE_Vnuopc_D_Ld2.f10_f10_musgs.I2000Ctsm50NwpSpAsRsGs.cheyenne_intel.clm-lilac fails during MODEL_BUILD
  • SMS.ne0ARCTICGRISne30x8_ne0ARCTICGRISne30x8_mt12.ISSP585Clm50BgcCrop.cheyenne_intel.clm-clm50cam6LndTuningMode fails during RUN. I think this may be due to hitting the wallclock limit?

For the fates suite, all tests passed, although not B4B against the previous baseline generated off of the ctsm1.0.dev095 merge. This is not unexpected given that there were answer changing updates made during ctsm1.0.dev102 on master.

@glemieux glemieux linked an issue Sep 10, 2020 that may be closed by this pull request
@glemieux

Copy link
Copy Markdown
Contributor Author

Looking in the LILAC build failure, it looks like the project number isn't being passed from the run_sys_test wrapper for some reason. The build_ctsm.bldlog shows:

ERROR while running:
/glade/u/home/glemieux/ctsm/cime/scripts/create_newcase --output-root /glade/scratch/glemieux/ctsm-tests/tests_ctsmdev105-aux_clm/LILACSMOKE_Vnuopc_D_Ld2.f10_f10_musgs.I2000Ctsm50NwpSpAsRsGs.cheyenne_intel.clm-lilac.C.ctsmdev105-aux_clm_int/lilac_build --case /glade/scratch/glemieux/ctsm-tests/tests_ctsmdev105-aux_clm/LILACSMOKE_Vnuopc_D_Ld2.f10_f10_musgs.I2000Ctsm50NwpSpAsRsGs.cheyenne_intel.clm-lilac.C.ctsmdev105-aux_clm_int/lilac_build/case --compset I2000Ctsm50NwpSpAsRsGs --res f10_f10_musgs --compiler intel --driver nuopc --run-unsupported --machine cheyenne
...
No project info available
ERROR: PROJECT_REQUIRED is true but no project found

@billsacks

Copy link
Copy Markdown
Member

The arcticGris test hit the walltime limit. I tell this by looking at the TestStatus file, which shows RUN time=1222, and comparing that with:

$ ./xmlquery JOB_WALLCLOCK_TIME

Results in group case.st_archive
	JOB_WALLCLOCK_TIME: 00:20:00

I wouldn't worry about that test for the sake of testing your integration.

For the LILAC test, I think you uncovered a real issue that I will fix ASAP. (Thank you!) Again, you don't need to worry about that for the sake of the FATES testing.

@glemieux

Copy link
Copy Markdown
Contributor Author

Thanks @billsacks. I see that the ARTICGRIS wallclock limit is expected for ctsm1.0.dev105 from #1096. It looks like this was fixed #1079 with ccba4b2, which was included in ctsm1.0.dev106, so I figure we'll catch up to this later and can call this an expected failure for this PR.

@ekluzek ekluzek left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hard to go through these long list of changes. But, it does look like the update to ctsm1.0.dev105. A lot of the changes that come in aren't ones that FATES would care about (new SE, and FV3 grids, and LILAC). But, there are also some important bug fixes that come in ctsm1.0.dev102. The CNFire change was good to bring in as it required some work on the FATES interface side. It was good to get that into FATES.

@billsacks

Copy link
Copy Markdown
Member

@glemieux the small diffs in #1146 should fix the LILAC test issue. If you have time and want to redo that one test with those diffs applied, I'd appreciate knowing if it works for you - but no worries if you don't have time to retry that right now.

@glemieux

glemieux commented Sep 11, 2020

Copy link
Copy Markdown
Contributor Author

@billsacks I manually added in the diff that you pointed to and the MODEL_BUILD get started, but it's failing elsewhere in the build now. Here's a partial copy from the atm_driver.bldlog:

error #7012: The module file cannot be read.  Its format requires a more recent F90 compiler.   [CTSM_LILACCOUPLINGFIELDINDICES]
  use ctsm_LilacCouplingFieldIndices
------^

I currently have intel/17.0.1 loaded in my modules. I'm going to try again with a more recent complier.

The test case was run via run_sys_test using -t option to specify the LILAC test. The log is here:

/glade/scratch/glemieux/ctsm-tests/tests_ctsmdev105-lilac/LILACSMOKE_Vnuopc_D_Ld2.f10_f10_musgs.I2000Ctsm50NwpSpAsRsGs.cheyenne_intel.clm-lilac.C.ctsmdev105-lilac/lilac_atm_driver/bld/atm_driver.bldlog

UPDATE: loading to intel/18.0.5 resolved the issue. Passes B4B.

@billsacks

Copy link
Copy Markdown
Member

@glemieux sorry for the continued issues. I'm glad you were able to work around this issue. I think I have also fixed it robustly in #1149 . I'll be interested to know if that fix works for you, too... though I suspect it will, since I was able to reproduce your issue by changing my environment to look like yours, and then #1149 got things working for me with your environment.

billsacks added a commit that referenced this pull request Sep 13, 2020
In LILACSMOKE test: Call case.load_env() before building the atm
driver. This is needed for the atm driver build to use the correct
module environment.

Things were working for me, and apparently for @ekluzek and @negin513 ,
without this change. But this is probably because our default module
environment was similar to cime's cheyenne_intel environment. @glemieux
was running into trouble because he has intel17 loaded in his default
environment (whereas the environment loaded by cime uses intel19). This
should fix that issue.

Are answers expected to change (and if so in what way)? Answers *may*
change for the LILACSMOKE test, if baselines were generated with a
slightly different compiler version than the one loaded by
cime. However, I do not expect answer changes.

Testing performed, if any:

- I first reproduced the error reported in
  #1137 (comment) by
  changing my environment to match @glemieux 's. Specifically, I
  replaced my `.profile` with his, and replaced the contents of my
  `.lmod.d` directory with his. So this used intel17. This led to the
  same issue with the LILACSMOKE test that he reported.

- I then reran the LILACSMOKE test using @glemieux 's environment but
  with the fix in this PR; now it passed.

- I also ran the LILACSMOKE test using my environment (which uses intel
  19.0.5) with this fix; it also passed.
@glemieux

Copy link
Copy Markdown
Contributor Author

@billsacks the addition in #1149 did the trick. I was able to get the LILAC test to pass b4b using my default environment. Thanks again!

@glemieux

Copy link
Copy Markdown
Contributor Author

I added in a fix in commit 75d2118 to the testmod definition for FatesHydro to address #1160.

@glemieux

glemieux commented Sep 25, 2020

Copy link
Copy Markdown
Contributor Author

Quick note that the NLCOMP DIFF is expected due to an update from ctsm1.0.dev.104 that removes the need to define dtime in the user_nl_clm file.

Most diffs against baseline are for cpl output. The following are the only test cases with diffs for clm2 output:

ERP_Ld9.f45_f45_mg37.I2000Clm50FatesCruGs.cheyenne_intel.clm-FatesAllVars.GC.ctsmdev105-merge_int
ERS_D_Lm12.1x1_brazil.I2000Clm50FatesCruGs.cheyenne_intel.clm-Fates_nat_and_anthro_ignitions.GC.ctsmdev105-merge_int
ERS_D_Ld5.f19_g16.I2000Clm50BgcCruGs.cheyenne_intel.clm-default.GC.ctsmdev105-merge_int

@glemieux

glemieux commented Oct 6, 2020

Copy link
Copy Markdown
Contributor Author

@rgknox the f10 test comparing the fates_main_api baseline and used acre to compare as suggested. The TLAI and ED_BIOMASS are bit for bit, which is not unexpected. Scanning the RMS differences in the output above seems to show that most of the variable differences are with respect to urban and snow related variables. The plots (and files used to generate the plots) can be found here in my google drive: https://drive.google.com/drive/folders/1N2SXW9V0KCHs8V6E9ENP7cOKBhoDjAF7?usp=sharing

Note I got the ERS_D_Lm12.1x1_brazil.I2000Clm50FatesCruGs.cheyenne_intel.clm-Fates_nat_and_anthro_ignitions test to run, but this time it failed in a different manner than expected (recall that it had failed by hitting the wallclock limit prior):

/glade/u/home/glemieux/scratch/clmed-tests/ctsmdev105-nutrient_premerge.fates-sci.1.42.0_api.14.0.0-ctsm1.0.dev105-Cb6f63958-F03a17bfe/ERS_D_Lm12.1x1_brazil.I2000Clm50FatesCruGs.cheyenne_intel.clm-Fates_nat_and_anthro_ignitions.C.20200930_224218_6dgabr

Weirdly when I ran the test on it's own, it passes the run but fails the exact restart (which is the expected failure):

/glade/u/home/glemieux/scratch/ctsm-tests/tests_ctsmdev105-nutrients-lightning/ERS_D_Lm12.1x1_brazil.I2000Clm50FatesCruGs.cheyenne_intel.clm-Fates_nat_and_anthro_ignitions.ctsmdev105-nutrients-lightning

That said, I think we probably can go ahead and integrate this. I'll try and address the anthro tests failure modes in a different PR.

@rgknox rgknox self-requested a review October 8, 2020 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

FATES API update Changes to the FATES version that also REQUIRE an API change in CTSM

Projects

None yet

Development

Successfully merging this pull request may close these issues.

FatesHydro testmod not generating clm history output for tests less than a month CLMBuildNameList error for fates_main_api regression testmod

8 participants