| "stac_extensions": [ | ||
| #"https://stac-extensions.github.io/cmip6/v3.0.0/schema.json", | ||
| "https://esgf.github.io/stac-transaction-api/cmip6/v1.0.0/schema.json", | ||
| "https://stac-extensions.github.io/cmip6/v3.0.0/schema.json", |
Have we tested/confirmed if this schema is working with our data?
@sturoscy-personal Not sure it was ever formally confirmed.
What I can say is that this is still the version currently applied in the latest CMIP6 schema we released a few days ago:
https://esgvoc.ipsl.fr/api/v1/apps/jsg/cmip6
This latest schema works correctly and is fully compatible with the test payloads generated using this PR.
I believe the updated schema should also be pushed to the STAC CMIP6 extension repository.
Would you like us to open a PR there as well?
I used the PR to regenerate 88k STAC Items and validated each of them with stac-validator using the new schema at https://esgvoc.ipsl.fr/api/v1/apps/jsg/cmip6. 71% of the Items validated successfully; 29% failed.
Here are the first 10 Items that failed validation:
- esgfng-payloads/CMIP6.AerChemMIP.NOAA-GFDL.GFDL-ESM4.hist-piAer.r1i1p1f1.AERmon.wetnoy.gr1.v20180701_eagle.alcf.anl.gov.json
"error_message": "'Wet Deposition Rate of NOy including Aerosol Nitrate' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "
- esgfng-payloads/CMIP6.C4MIP.MIROC.MIROC-ES2L.1pctCO2Ndep-bgc.r1i1p1f2.Amon.rsds.gn.v20191129_eagle.alcf.anl.gov.json
"error_message": "'Surface Downwelling Shortwave Radiation' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "
- esgfng-payloads/CMIP6.CMIP.CAS.FGOALS-g3.1pctCO2.r2i1p1f1.day.psl.gn.v20191223_eagle.alcf.anl.gov.json
"error_message": "'air_pressure_at_mean_sea_level' is not one of <snap>. Error is in properties -> cmip6:variable_cf_standard_name "
- esgfng-payloads/CMIP6.CFMIP.CNRM-CERFACS.CNRM-CM6-1.amip-4xCO2.r1i1p1f2.AERmon.rsutcsaf.gr.v20190820_eagle.alcf.anl.gov.json
"error_message": "'toa outgoing clear-sky shortwave radiation' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "
- esgfng-payloads/CMIP6.AerChemMIP.NIMS-KMA.UKESM1-0-LL.ssp370-lowNTCF.r1i1p1f2.Amon.rsds.gn.v20201020_eagle.alcf.anl.gov.json
"error_message": "'Surface Downwelling Shortwave Radiation' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "
- esgfng-payloads/CMIP6.C4MIP.NASA-GISS.GISS-E2-1-G.1pctCO2-rad.r101i1p1f1.Emon.cSoilTree.gn.v20190815_eagle.alcf.anl.gov.json
"error_message": "'soil_carbon_content' is not one of <snap>. Error is in properties -> cmip6:variable_cf_standard_name "
- esgfng-payloads/CMIP6.AerChemMIP.NOAA-GFDL.GFDL-ESM4.ssp370pdSST.r1i1p1f1.Emon.vegHeight.gr1.v20180701_eagle.alcf.anl.gov.json
"error_message": "'canopy height' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "
- esgfng-payloads/CMIP6.C4MIP.NASA-GISS.GISS-E2-1-G-CC.ssp585-bgc.r1i1p1f1.Omon.fbddtalk.gn.v20190815_eagle.alcf.anl.gov.json
"error_message": "'ocnBgChem' is not one of ['seaIce', 'ocean', 'aerosol', 'land', 'landIce', 'atmos', 'ocnBgchem', 'atmosChem']. Error is in properties -> cmip6:realm -> 0 "
- esgfng-payloads/CMIP6.AerChemMIP.NOAA-GFDL.GFDL-ESM4.piClim-2xdust.r1i1p1f1.AERmon.pan.gr1.v20180701_eagle.alcf.anl.gov.json
"error_message": "'PAN volume mixing ratio' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "
- esgfng-payloads/CMIP6.CFMIP.IPSL.IPSL-CM6A-LR.amip-p4K-lwoff.r1i1p1f1.Amon.pfull.gr.v20180928_eagle.alcf.anl.gov.json
"error_message": "'Pressure on Model Levels' is not one of <snap>. Error is in properties -> cmip6:variable_long_name "
It appears that three properties defined as enums in the new schema are missing many values. I checked the first 1,000 failing Items, and the errors are distributed across the three properties as follows:
824 "Error is in properties -> cmip6:variable_long_name "
140 "Error is in properties -> cmip6:variable_cf_standard_name "
36 "Error is in properties -> cmip6:realm -> 0 "
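For anyone who wants to reproduce a breakdown like the one above, here is a minimal sketch. It assumes the `error_message` strings have already been collected into a list; `tally_by_property` and `sample` are hypothetical names used only for illustration, and the sample messages are copied from the failures listed earlier.

```python
import re
from collections import Counter

# Captures the property name that follows "Error is in properties -> ".
PROPERTY_RE = re.compile(r"Error is in properties -> (\S+)")


def tally_by_property(error_messages):
    """Count validation errors per offending property (hypothetical helper)."""
    counts = Counter()
    for message in error_messages:
        match = PROPERTY_RE.search(message)
        # Messages that do not follow the expected pattern are grouped together.
        counts[match.group(1) if match else "unknown"] += 1
    return counts


# Three messages taken from the failing Items listed in this thread.
sample = [
    "'canopy height' is not one of <snap>. "
    "Error is in properties -> cmip6:variable_long_name ",
    "'soil_carbon_content' is not one of <snap>. "
    "Error is in properties -> cmip6:variable_cf_standard_name ",
    "'ocnBgChem' is not one of ['seaIce', 'ocean', 'aerosol', 'land', 'landIce', "
    "'atmos', 'ocnBgchem', 'atmosChem']. Error is in properties -> cmip6:realm -> 0 ",
]

for prop, count in tally_by_property(sample).most_common():
    print(count, prop)
```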
My guess is that the schema at https://esgvoc.ipsl.fr/api/v1/apps/jsg/cmip6 was generated from only a small subset of CMIP6 data.
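One thing worth separating out: in the `cmip6:realm` failure above, the rejected value `ocnBgChem` differs from the allowed `ocnBgchem` only by letter case, so some failures may be spelling variants rather than values missing from the enum. A rough sketch of how that could be checked follows; `find_case_variant` is a hypothetical helper, and the `realms` list is copied from the error message in this thread.

```python
def find_case_variant(value, allowed):
    """Return the allowed spelling if `value` differs from it only by case.

    Returns None when no case-insensitive match exists, i.e. the value is
    genuinely absent from the enum. Hypothetical helper for illustration.
    """
    by_folded = {item.casefold(): item for item in allowed}
    return by_folded.get(value.casefold())


# Enum values as reported in the cmip6:realm error message.
realms = ['seaIce', 'ocean', 'aerosol', 'land', 'landIce',
          'atmos', 'ocnBgchem', 'atmosChem']

print(find_case_variant('ocnBgChem', realms))      # ocnBgchem
print(find_case_variant('canopy height', realms))  # None
```

An exact match is also returned unchanged, so the helper is only meaningful when applied to values that have already failed validation.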
Hi @lukaszlacinski, I propose some minor changes to the convert2stac method to align it with the CMIP6 JSON schema.