
OLMT_notes.md


Running OLMT

Table of contents

  1. Helpful weblinks
  2. Submitting jobs using OLMT
  3. Testing single simulation with no Uncertainty Quantification
  4. Generating parameter sample file to run ensemble
  5. Performing post-processing only
  6. OLMT job with dynamic pft
  7. Notes about OLMT

1. Helpful weblinks

2. Submitting jobs using OLMT

2.1. Ensemble run

python site_fullrun.py \
--site US-Bo1 \
--crop \
--caseidprefix 20200428_soybean \
--nyears_ad_spinup 200 \
--nyears_final_spinup 200 \
--tstep 1 \
--cpl_bypass \
--machine anvil \
--model_root ~/E3SM \
--nopftdyn \
--gswp3 \
--ng 400 \
--parm_list parm_list_cropUQ \
--sitegroup CropUQ \
--ensemble_file mcsamples_cropUQ_soybean_2000x18.txt \
--postproc_file postproc_vars_crop

Above line but with comments explaining each option:

python site_fullrun.py 
--site US-Bo1                     # 6-character FLUXNET site name
--crop                            # Run a crop model simulation (off unless this flag is given)
--caseidprefix 20200428_soybean   # Unique identifier to include as a prefix to the case name
--nyears_ad_spinup 200            # number of years to run ad_spinup
--nyears_final_spinup 200         # base no. of years for final spinup
--tstep 1                         # CLM timestep (hours)
--cpl_bypass                      # Bypass the coupler (off unless this flag is given)
--machine anvil                   # machine to use
--model_root ~/E3SM               # base E3SM directory
--nopftdyn                        # Do not use dynamic PFT file
--gswp3                           # Use GSWP3 meteorology
--ng 400                          # number of groups to run in ensemble mode
--parm_list parm_list_cropUQ      # File containing list of parameters to vary
--sitegroup CropUQ                # site group to use
--ensemble_file mcsamples_cropUQ_soybean_2000x18.txt  # Parameter sample file to generate ensemble
--postproc_file postproc_vars_crop                    # File for ensemble post processing

2.2. Testing single simulation with no Uncertainty Quantification:

python site_fullrun.py \
--site US-Bo1 \
--crop \
--caseidprefix 20200428_soybean \
--nyears_ad_spinup 200 \
--nyears_final_spinup 200 \
--tstep 1 \
--cpl_bypass \
--machine anvil \
--model_root ~/E3SM \
--nopftdyn \
--gswp3 \
--sitegroup CropUQ

2.3. Generating parameter sample file to run ensemble

If the parameter sample file does not exist, it can be generated with the --mc_ensemble option, which takes the number of samples. Example: --mc_ensemble 2000
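To make the idea of a parameter sample file concrete, here is a minimal sketch of drawing an N x d Monte Carlo sample from parameter ranges. It is not OLMT's own code: the `name pft min max` line layout, the uniform sampling, the parameter names `param_a` and `param_b`, and the helper names `read_parm_list` and `draw_samples` are all assumptions made for illustration.

```python
import numpy as np


def read_parm_list(lines):
    """Parse lines of the ASSUMED form 'name pft min max' into tuples."""
    params = []
    for line in lines:
        fields = line.split()
        # Skip blank lines, short lines, and comment lines.
        if len(fields) < 4 or line.lstrip().startswith("#"):
            continue
        name, pft = fields[0], int(fields[1])
        pmin, pmax = float(fields[2]), float(fields[3])
        params.append((name, pft, pmin, pmax))
    return params


def draw_samples(params, n_samples, seed=0):
    """Return an (n_samples, d) array, uniform within each [min, max] (assumed)."""
    rng = np.random.default_rng(seed)
    lo = np.array([p[2] for p in params])
    hi = np.array([p[3] for p in params])
    return lo + (hi - lo) * rng.random((n_samples, len(params)))


# Hypothetical entries: the same parameter name may appear for several PFTs.
example_lines = [
    "param_a 0 0.5 1.5",
    "param_b 17 10.0 20.0",
    "param_b 18 10.0 20.0",
]
params = read_parm_list(example_lines)
samples = draw_samples(params, n_samples=2000)
print(samples.shape)  # (2000, 3)
```

Each row is one ensemble member and each column is one parameter/PFT pair, which matches the 2000x18 naming of the sample file used above (2000 samples, 18 parameters).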

2.4. OLMT job with dynamic PFT

Remove the --nopftdyn option from the call to site_fullrun.py.

2.5. Performing a single simulation using OLMT

Remove the following options from the call to site_fullrun.py:

  • --mc_ensemble
  • --ng
  • --parm_list

2.6. Using a custom parameter file instead of the default one

Specify the path to your custom parameter file with the --mod_parm_file <filename> option.

2.7. Performing post-processing only

To rerun post-processing with modified options, first launch an interactive session (srun -A condo -p acme-small -N 1 -t 30 --pty bash) and then run the following command:

srun -n 36   python manage_ensemble.py \
--postproc_only True \
--case 20210730_corn_soybean_US-UiC_ICBELMCNCROP_trans \
--runroot /lcrc/group/acme/ac.eva.sinha/ \
--n_ensemble 100 \
--ens_file mcsamples_20210730_corn_soybean_100.txt \
--exeroot /lcrc/group/acme/ac.eva.sinha/20210730_corn_soybean_US-UiC_ICBELMCNCROP_trans/bld \
--parm_list parm_list_corn_soybean \
--cnp True \
--site US-UiB \
--postproc_file postproc_vars_crop \
--model_name elm

The extra spaces between 36 and python are not a mistake (I was getting an error message without them).

7. Notes about OLMT:

  • The parm_list file lists the input parameters and their ranges (e.g. parm_list_cropUQ).
  • The parm_list file is used to create an N x d sample file, where d is the number of parameters and N is the number of samples to run.
  • In the parm_list file:
    • If the parameter is not PFT-specific, or if you want to use the same value for all PFTs, enter 0 for the PFT number.
    • The same parameter name can appear multiple times in this file for different PFTs.
    • Names must match the netCDF parameter file exactly.
  • Python scripts are used to create site-specific surface and domain data, and to create, configure, build and submit the relevant cases.
    • Script for creating surface and domain data - makepointdata.py.
    • The surface data and domain data files are created in the temp/ folder.
  • The coupler bypass crop compset (ICBELMCNCROP) does not exist in the default ELM directory. Copy config_compsets.xml and config_machines.xml to the respective directories in the ELM folder as follows:
    • Compy:
     cp /qfs/people/ricc364/models/ELM_crop/E3SM/components/clm/cime_config/config_compsets.xml ./components/elm/cime_config/
     cp /qfs/people/ricc364/models/clean/E3SM/cime/config/e3sm/machines/config_machines.xml ./cime_config/machines/
    
    • Anvil:
     cp /home/ac.ricciuto/models/ELM_crop/E3SM/components/clm/cime_config/config_compsets.xml ./components/elm/cime_config/
     cp /home/ac.ricciuto/models/ELM_crop/E3SM/cime/config/e3sm/machines/config_machines.xml ./cime_config/machines/
    

Post-processing for ensemble runs:

  • Post-processing of the output is performed automatically based on the information in the file ~/OLMT/postproc_vars_crop
  • This file lists the variables for which the post processing is performed, the period over which the variables should be averaged, and units for conversion.
  • The post processing is only applied to the transient case in the full run.
  • The post-processed output is located in: ~/OLMT/UQ_output/<caseid>
  • The post-processed output file (20200428_soybean_US-Bo1_ICBCLM45CNCROP_trans_postprocessed.txt) contains an n x m matrix of outputs, where n is the number of post-processing variables listed in the postproc_vars file and m is the number of samples in the ensemble.
  • ytrain.dat holds 80% of the post-processed output data and yval.dat holds the remaining 20%.
  • ptrain.dat holds 80% of the input parameter data and pval.dat holds the remaining 20%.
  • foreden.csv combines the input parameters (ptrain.dat + pval.dat) and the outputs (ytrain.dat + yval.dat) side by side (horizontally).
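The relationship between the train/validation files and the combined file can be sketched as follows. This is an illustration only, not OLMT's implementation: the helper name `split_train_val`, the sequential (unshuffled) split, and the samples-by-variables orientation of the output array (the transpose of the n x m layout described above) are assumptions.

```python
import numpy as np


def split_train_val(params, outputs, train_frac=0.8):
    """Split by sample (rows). params is (m, d); outputs is (m, n).

    ASSUMPTION: the first 80% of samples go to training, the rest to validation.
    """
    n_train = int(round(train_frac * params.shape[0]))
    return (params[:n_train], params[n_train:],
            outputs[:n_train], outputs[n_train:])


# Toy ensemble: m = 10 samples, d = 3 parameters, n = 2 output variables.
m, d, n = 10, 3, 2
params = np.arange(m * d, dtype=float).reshape(m, d)
outputs = np.arange(m * n, dtype=float).reshape(m, n)

ptrain, pval, ytrain, yval = split_train_val(params, outputs)

# Horizontal combination of parameters and outputs, one row per sample.
combined = np.hstack([np.vstack([ptrain, pval]), np.vstack([ytrain, yval])])
print(ptrain.shape, pval.shape, combined.shape)  # (8, 3) (2, 3) (10, 5)
```

In this sketch the combined array has one row per ensemble member, with the d parameter columns followed by the n output columns.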

Adding new sites

Note that new sites (see file locations below) can be added by creating entries in site data files:

CropUQ_sitedata.txt    (location, years of data)
CropUQ_soildata.txt    (soil texture information)
CropUQ_pftdata.txt     (PFT information)

Running jobs in batch

  • OLMT runs the accelerated spinup (ad_spinup), final spinup, and transient cases by submitting jobs chained with the sbatch dependency option. Example:
sbatch scripts/20210615_corn_soybean/ensemble_run_20210615_corn_soybean_US-UiC_ICBELMCNCROP_ad_spinup.pbs > temp/jobinfo
sbatch --dependency=afterok:459039 scripts/20210615_corn_soybean/ensemble_run_20210615_corn_soybean_US-UiC_ICBELMCNCROP.pbs > temp/jobinfo
sbatch --dependency=afterok:459040 scripts/20210615_corn_soybean/ensemble_run_20210615_corn_soybean_US-UiC_ICBELMCNCROP_trans.pbs > temp/jobinfo
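The chaining shown in these three sbatch calls can be sketched in Python. This is a hedged illustration rather than OLMT's code: the helper names `parse_job_id` and `build_sbatch_command` are hypothetical, the Slurm responses are stand-in strings (nothing is actually submitted), and the third job ID is a placeholder.

```python
import re


def parse_job_id(sbatch_stdout):
    """Extract the job ID from Slurm's 'Submitted batch job <id>' message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_stdout)
    if match is None:
        raise ValueError(f"no job ID found in: {sbatch_stdout!r}")
    return match.group(1)


def build_sbatch_command(script, depends_on=None):
    """Build an sbatch argument list, optionally waiting on a previous job."""
    cmd = ["sbatch"]
    if depends_on is not None:
        # afterok: start only if the previous job finished successfully.
        cmd.append(f"--dependency=afterok:{depends_on}")
    cmd.append(script)
    return cmd


prefix = "scripts/20210615_corn_soybean/ensemble_run_20210615_corn_soybean_US-UiC_ICBELMCNCROP"
scripts = [prefix + "_ad_spinup.pbs", prefix + ".pbs", prefix + "_trans.pbs"]

# Stand-in Slurm responses; a real run would capture these from sbatch.
fake_outputs = [
    "Submitted batch job 459039",
    "Submitted batch job 459040",
    "Submitted batch job 459041",  # placeholder ID
]

commands = []
previous_id = None
for script, stdout in zip(scripts, fake_outputs):
    commands.append(build_sbatch_command(script, previous_id))
    previous_id = parse_job_id(stdout)

for cmd in commands:
    print(" ".join(cmd))
```

Each stage depends only on the job submitted immediately before it, so a failed spinup prevents the later stages from starting.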

Directory and file locations

  • Anvil:
    • Surface data file - /lcrc/group/e3sm/ccsm-data/inputdata/lnd/clm2/surfdata_map/
    • Parameter data file - /lcrc/group/e3sm/ccsm-data/inputdata/lnd/clm2/paramdata/
    • Site data files - /lcrc/group/e3sm/ccsm-data/inputdata/lnd/clm2/PTCLM/
    • Public input files - /lcrc/group/acme/public_html/inputdata
    • Scratch - /lcrc/group/e3sm/userid/scratch/anvil
    • CASE directory - /gpfs/fs1/home/userid/OLMT/cime_case_dirs/
    • CASE exeroot/rundir - /lcrc/group/acme/userid/
    • CASE outputs - /lcrc/group/acme/userid/UQ/CASEID
  • Compy:
    • Surface data file - /compyfs/inputdata/lnd/clm2/surfdata_map
    • Parameter data file - /compyfs/inputdata/lnd/clm2/paramdata
    • Site data files - /compyfs/inputdata/lnd/clm2/PTCLM/

Dynamic surface data set creation using SiteID_dynpftdata.txt file

  • Sum of PCT_CFT should always be 1.

Surface data set creation using CropUQ_pftdata.txt file

  • Sum of PCT_CFT and PCT_NAT_PFT should always be 1.
  • Turn natural vegetation or crops on or off using PCT_NATVEG and PCT_CROP.
  • When creating surfdata for a single grid using a _pftdata.txt file, be careful with pft 0: it can appear with more than one value, and the later value overrides the earlier ones.
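The two checks in this list (fractions summing to 1, and a repeated PFT entry silently overriding an earlier one) can be sketched on in-memory data. The layout of the _pftdata.txt file is not described here, so this sketch works on a plain list of (pft, fraction) pairs; the helper names and the PFT index 17 are hypothetical.

```python
import math


def collapse_pft_entries(entries):
    """Apply 'later value overrides earlier' and report which PFTs were overridden."""
    merged = {}
    overridden = []
    for pft, fraction in entries:
        if pft in merged:
            overridden.append(pft)
        merged[pft] = fraction
    return merged, overridden


def fractions_sum_to_one(fractions, tol=1e-6):
    """Check that the fractions add up to 1 within a small tolerance."""
    return math.isclose(sum(fractions), 1.0, abs_tol=tol)


# Hypothetical entries in which pft 0 appears twice.
entries = [(0, 0.2), (17, 0.5), (0, 0.5)]
merged, overridden = collapse_pft_entries(entries)

print(merged)                                 # {0: 0.5, 17: 0.5}
print(overridden)                             # [0]
print(fractions_sum_to_one(merged.values()))  # True
```

Here the raw entries add up to 1.2, but after the second pft 0 value replaces the first the total is 1, which is why a duplicate entry can go unnoticed.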

Useful git commands