ywang-bom/sdm

Data Extraction Tool for Statistical Downscaling Model

The data extraction (DXT) tool for the Statistical Downscaling Model (SDM) is a set of Python command-line scripts that generate reconstructed daily climate series for a region of interest (RoI), based on the contents of a given Change-of-Date (CoD) file and the AWAP local observation dataset.

Scope

The DXT tool focuses on raw data extraction, without any post-processing. The following steps describe the logical process for running the DXT tool:

  1. A user specifies the set of parameters required to locate the corresponding CoD file:
    • Model (specify one model from a pre-defined list of 22 CMIP5 models)
    • Scenario (specify one scenario from a pre-defined list of 3 scenarios)
    • Region-Type (specify one region-type from a pre-defined list of 10 climate regions)
    • Season (specify one season from a pre-defined list of 4 seasons)
    • Predictand (specify one predictand from a pre-defined list of 3 predictands)
  2. The path to the CoD file is generated based on inputs from the previous step and the base directory of CoD files.
  3. The reconstructed climate series of a given RoI is generated by picking data from the AWAP dataset according to the contents of the CoD file.
    • The RoI is optional. If missing, it is set to the region-type. Otherwise it must be given in the same format as the region-type mask files (NetCDF files containing arrays of zeros and ones).
  4. The output data is saved as a NetCDF file.
    • The NetCDF file is CF-compliant.
    • For an RoI that is not rectangular, the minimum enclosing rectangle is used, padded with missing values.
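
The padding rule in step 4 can be illustrated with NumPy. This is only a sketch, not the tool's own code: the function name `crop_to_roi` and the use of NaN as the missing value are assumptions made for the example.

```python
import numpy as np


def crop_to_roi(field, mask, fill_value=np.nan):
    """Crop a 2D field to the minimum rectangle enclosing the mask.

    `mask` is a 2D array of zeros and ones with the same shape as `field`.
    Cells inside the rectangle but outside the RoI are set to `fill_value`
    (NaN here, which is an assumption about the missing value).
    """
    mask = np.asarray(mask, dtype=bool)
    rows = np.flatnonzero(mask.any(axis=1))  # rows that contain RoI cells
    cols = np.flatnonzero(mask.any(axis=0))  # columns that contain RoI cells
    if rows.size == 0:
        raise ValueError("mask contains no RoI cells")
    r0, r1 = rows[0], rows[-1] + 1
    c0, c1 = cols[0], cols[-1] + 1
    sub_mask = mask[r0:r1, c0:c1]
    cropped = np.where(sub_mask, field[r0:r1, c0:c1], fill_value)
    return cropped, (r0, r1, c0, c1)


# An L-shaped RoI inside a 4x5 grid
mask = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
])
field = np.arange(20, dtype=float).reshape(4, 5)
cropped, bounds = crop_to_roi(field, mask)
print(cropped)  # 2x3 array; the two cells outside the RoI are NaN
```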

Any data post-processing, e.g. subsetting (by date and region), merging, inflation, tail distribution correction, and visualisation, is out of scope of the DXT tool. In VisTrails terminology, these post-processing steps are separate modules.

Design

For better modularity, the DXT tool itself will provide two modules for the Vistrails system.

The first module takes user inputs of model, scenario, region-type, season and predictand, and outputs the path to the corresponding CoD file (essentially steps 1 and 2 in the list above).
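
A rough sketch of what such a module could look like is given below. The allowed values are taken from the appendix of this README, but the directory layout and the `.dat` file name are hypothetical placeholders, since the real organisation of the CoD base directory is not described here. The function name `build_cod_path` is also an assumption, and the model name is not validated to keep the example short.

```python
import os

# Allowed values, copied from the appendix of this README
SCENARIOS = ("historical", "rcp45", "rcp85")
REGION_TYPES = ("mec", "nmr", "nul", "nwa", "qld",
                "sea", "sec", "smd", "swc", "tas")
SEASONS = ("DJF", "MAM", "JJA", "SON")
PREDICTANDS = ("rain", "tmin", "tmax")


def build_cod_path(base_dir, model, scenario, region_type, season, predictand):
    """Return a CoD file path for the given parameters.

    The layout <base>/<model>/<scenario>/<region>/<season>/<predictand>.dat
    is a made-up placeholder, not the tool's real directory structure.
    """
    checks = (
        ("scenario", scenario, SCENARIOS),
        ("region-type", region_type, REGION_TYPES),
        ("season", season, SEASONS),
        ("predictand", predictand, PREDICTANDS),
    )
    for name, value, allowed in checks:
        if value not in allowed:
            raise ValueError("unknown %s: %r" % (name, value))
    return os.path.join(base_dir, model, scenario, region_type, season,
                        predictand + ".dat")


path = build_cod_path("/path/to/the/cod/files", "ACCESS1.0", "historical",
                      "tas", "MAM", "rain")
print(path)
```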

The second module takes the path to the CoD file as input. It then reads the specified CoD file, reconstructs the climate data series for a given RoI, and outputs the data as a CF-compliant NetCDF file (essentially steps 3 and 4 in the list above).
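
The core of the reconstruction, picking observed fields according to the CoD file, can be sketched as an indexing operation. The example assumes that the CoD file resolves to one observation date per reconstructed day; the actual CoD file format is not described in this README, so `reconstruct_series` and its arguments are purely illustrative, and NetCDF I/O is left out.

```python
import datetime as dt

import numpy as np


def reconstruct_series(analogue_dates, obs_dates, obs_data):
    """Pick one observed daily field for each analogue date.

    analogue_dates: dates taken from a CoD file (format assumed, see above)
    obs_dates:      dates of the observed fields, one per time step
    obs_data:       array of shape [time, lat, lon]
    """
    index_of = {d: i for i, d in enumerate(obs_dates)}
    # A KeyError here means the CoD file refers to a date with no observation
    picks = [index_of[d] for d in analogue_dates]
    return obs_data[picks]


# Three observed days on a 2x2 grid, filled with 0, 1 and 2 respectively
obs_dates = [dt.date(2000, 1, 1) + dt.timedelta(days=i) for i in range(3)]
obs_data = np.stack([np.full((2, 2), float(i)) for i in range(3)])

# Hypothetical CoD content: day 1 maps to 3 Jan, day 2 to 1 Jan, day 3 to 3 Jan
analogue_dates = [dt.date(2000, 1, 3), dt.date(2000, 1, 1), dt.date(2000, 1, 3)]
series = reconstruct_series(analogue_dates, obs_dates, obs_data)
print(series[:, 0, 0])
```

The result has one field per entry in `analogue_dates`, so the same observed day may appear several times in the reconstructed series.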

As the output NetCDF file is CF-compliant, existing NetCDF processing tools (e.g. cdo, nco) can be used for subsetting and aggregation. A few SDM-specific post-processing steps, e.g. inflation, may not be achievable with existing third-party tools and hence require dedicated Python scripts to be developed. They are by design out of scope of the DXT tool. However, if time permits, the inflation post-processing will be looked into further.

Python Version and Modules

  • Python version 2.7+
  • numpy 1.8.0+
  • scipy 0.14.0+ (needed for NetCDF I/O and most data post-processing modules except the simplest ones)

The code is developed on a local machine, and its intended running environment is one of the NCI machines. It currently runs with the Python 2.7.6 installation on raijin.

Possible Issues

There may be memory issues when running the DXT tool, as extracting data for large regions (e.g. nmr, qld) requires a large amount of memory. Data post-processing such as inflation could be even more memory intensive. The tool is only guaranteed to work with smaller regions, because memory limits depend largely on the hardware and operating system and cannot easily be solved in the code itself.

Usage

The functionality of the tool is packaged as a Python module called sdm. A command-line Python script, sdmrun.py, is provided to interface with the module. The general form for using sdmrun.py is as follows:

python sdmrun.py SUB-COMMAND [OPTIONS]

Run python sdmrun.py -h to show the general help message. For a specific sub-command, run python sdmrun.py SUB-COMMAND -h to show its specialized help message.
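
For readers unfamiliar with this command-line style, the sketch below shows how a SUB-COMMAND [OPTIONS] interface with per-sub-command help can be built with argparse. It is not the source of sdmrun.py. The flags mirror the cod-getpath and dxt-gridded examples in the Sub-Commands section, while the `dest` names and the meaning attached to each flag are assumptions.

```python
import argparse


def build_parser():
    """Build a parser with two of the sub-commands described in this README."""
    parser = argparse.ArgumentParser(prog="sdmrun.py")
    subparsers = parser.add_subparsers(dest="sub_command")

    # Flags follow the cod-getpath example; dest names are assumed
    getpath = subparsers.add_parser("cod-getpath")
    getpath.add_argument("-m", dest="model")
    getpath.add_argument("-c", dest="scenario")
    getpath.add_argument("-r", dest="region_type")
    getpath.add_argument("-s", dest="season")
    getpath.add_argument("-p", dest="predictand")

    # Positional arguments follow the dxt-gridded example
    gridded = subparsers.add_parser("dxt-gridded")
    gridded.add_argument("cod_file")
    gridded.add_argument("output")
    return parser


args = build_parser().parse_args(
    ["dxt-gridded", "/path/to/a/CoD/File", "out.nc"])
print(args.sub_command, args.cod_file, args.output)
```

With this structure, argparse generates both the top-level -h output and a separate -h output for each sub-command.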

Configurations

Some system-wide information, e.g. the base directory of the CoD files, is required before the tool can work properly. This information is provided in a configuration file. A sample configuration file is as follows:

[dxt]
cod_base_dir=/path/to/the/cod/files
mask_base_dir=/path/to/the/mask/netcdf/files
gridded_base_dir=/path/to/the/awap/daily/dataset

The configuration file can be specified on the command line via the -c flag. If it is missing, the tool searches for a file called .sdm.cfg in the user's home directory.
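
The lookup described above can be sketched with the standard library's INI parser. The function name `load_dxt_config` is hypothetical and the tool's own loader may differ; the section and key names come from the sample file above.

```python
import os
import tempfile

try:
    from configparser import ConfigParser  # Python 3
except ImportError:
    from ConfigParser import SafeConfigParser as ConfigParser  # Python 2.7


def load_dxt_config(path=None):
    """Read the [dxt] section from `path`, or from ~/.sdm.cfg if path is None."""
    if path is None:
        path = os.path.join(os.path.expanduser("~"), ".sdm.cfg")
    parser = ConfigParser()
    if not parser.read(path):  # read() returns the list of files it parsed
        raise IOError("configuration file not found: %s" % path)
    return dict(parser.items("dxt"))


# Write the sample configuration to a temporary file and load it back
sample = (
    "[dxt]\n"
    "cod_base_dir=/path/to/the/cod/files\n"
    "mask_base_dir=/path/to/the/mask/netcdf/files\n"
    "gridded_base_dir=/path/to/the/awap/daily/dataset\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".cfg", delete=False) as handle:
    handle.write(sample)
config = load_dxt_config(handle.name)
os.remove(handle.name)
print(config["cod_base_dir"])
```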

Sub-Commands

There are currently four sub-commands, described as follows:

  • cod-getpath Returns the path to the CoD file for the given model, scenario, region-type, season and predictand, e.g.:

    python sdmrun.py cod-getpath -m ACCESS1.0 -c historical -r tas -s 2 -p rain
  • dxt-gridded Generates the reconstructed climate series using the given CoD filename. The output NetCDF file must be specified in order to save the data, e.g.:

    python sdmrun.py dxt-gridded /path/to/a/CoD/File out.nc
  • dxt-gridded2 Similar to the sub-command above, but takes the parameters used to generate the path to the CoD file instead of the CoD filename itself, e.g.:

    python sdmrun.py dxt-gridded2 -m ACCESS1.0 -c historical -r tas -s 2 -p rain out.nc
  • to-3d Converts 2D data from a downscaling output NetCDF file to 3D and saves it in a new NetCDF file. The 2D data has the format [dates, points] and the 3D data has the format [time, lat, lon]. The new NetCDF file is CF-compliant, e.g.:

    python sdmrun.py to-3d /path/to/a/downscaling/output/netcdf/file region_mask_name out.nc
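
The reshaping performed by to-3d can be illustrated with a region mask and NumPy boolean indexing. This sketch assumes that the points are ordered like the mask cells equal to one, scanned row by row, and that NaN is the missing value; neither detail is confirmed by this README, and NetCDF I/O is left out.

```python
import numpy as np


def to_3d(data_2d, mask, fill_value=np.nan):
    """Scatter [dates, points] data onto a [time, lat, lon] grid.

    Assumes the points follow the row-by-row order of the cells where
    `mask` is one (an assumption; the real ordering may differ).
    """
    mask = np.asarray(mask, dtype=bool)
    data_2d = np.asarray(data_2d, dtype=float)
    if data_2d.shape[1] != mask.sum():
        raise ValueError("number of points does not match the mask")
    out = np.full((data_2d.shape[0],) + mask.shape, fill_value)
    out[:, mask] = data_2d  # fill only the cells that belong to the region
    return out


mask = np.array([[1, 0],
                 [1, 1]])
data_2d = np.array([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0]])  # 2 dates, 3 points
data_3d = to_3d(data_2d, mask)
print(data_3d.shape)
```

Cells outside the region keep the fill value, which matches the padding with missing values described in the Scope section.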

Appendix

List of Pre-defined Variables

Models (22)

  • ACCESS1.0
  • ACCESS1.3
  • BNU-ESM
  • CCSM4
  • CMCC-CMS
  • CNRM-CM5
  • CSIRO-Mk3.6.0
  • CanESM2
  • GFDL-ESM2G
  • GFDL-ESM2M
  • HadGEM2-CC
  • IPSL-CM5A-LR
  • IPSL-CM5A-MR
  • IPSL-CM5B-LR
  • MIROC-ESM-CHEM
  • MIROC-ESM
  • MIROC5
  • MPI-ESM-LR
  • MPI-ESM-MR
  • MRI-CGCM3
  • NorESM1-M
  • bcc-csm1-1-m

Scenarios (3)

  • historical
  • rcp45
  • rcp85

Region-Types (10)

  • mec
  • nmr
  • nul
  • nwa
  • qld
  • sea
  • sec
  • smd
  • swc
  • tas

Seasons (4)

  • DJF
  • MAM
  • JJA
  • SON

Predictands (3)

  • rain
  • tmin
  • tmax
