Configuring and Running Model Simulations
The CEFI-regional-MOM6 repository contains the model source code, FRE workflow XMLs, and recommended configurations for various regional domains. If you're new to GitHub, you can check out this tutorial to learn the basics. This guide provides an overview of compiling and running the model on different supported platforms, including instructions for utilizing the GFDL FRE workflow on Gaea.
We offer a containerized solution for users interested in exploring the CEFI-regional-MOM6. For a quick start, users can refer to the Container-Based Quick Start Guide to run a 1D toy model case on their laptop/workstation.
- After logging into Gaea C6, navigate to your scratch folder and git clone the CEFI-regional-MOM6 repository:
> cd /gpfs/f6/ira-cefi/scratch/$USER
> git clone https://github.com/NOAA-GFDL/CEFI-regional-MOM6.git --recursive
- Compile the model:
> cd CEFI-regional-MOM6/builds
> ./linux-build.bash -m gaea -p ncrc6.intel23 -t repro -f mom6sis2
If the build completes successfully, you should be able to find the executable here:
builds/build/gaea-ncrc6.intel23/ocean_ice/repro/MOM6SIS2
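Before moving on to the run step, it can help to confirm that the executable is really there. The following is a minimal sketch, not part of the repository: the function name check_executable is hypothetical, and the path you pass to it should be the one shown above, relative to the repository root.

```shell
# Hypothetical helper (not part of CEFI-regional-MOM6):
# succeed only if the given path is an executable regular file.
check_executable() {
    exe="$1"
    if [ -f "$exe" ] && [ -x "$exe" ]; then
        echo "found: $exe"
    else
        echo "missing or not executable: $exe" >&2
        return 1
    fi
}
```

From the repository root, a call such as `check_executable builds/build/gaea-ncrc6.intel23/ocean_ice/repro/MOM6SIS2` would report whether the build produced the binary.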
- Run NWA12 Experiment on Gaea C6 using SLURM
> cd ../exps
> ln -fs /gpfs/f6/ira-cefi/world-shared/datasets ./
> cd NWA12.COBALT
> sbatch run.sub
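If you script the submission step, you may want to keep the job ID that sbatch reports. The sketch below assumes the standard Slurm confirmation line ("Submitted batch job" followed by the ID); the function name parse_job_id is hypothetical.

```shell
# Hypothetical helper: pull the numeric job ID out of sbatch's confirmation
# line, e.g. "Submitted batch job 12345" or
# "Submitted batch job 12345 on cluster c6".
# Prints nothing if the line does not match.
parse_job_id() {
    printf '%s\n' "$1" | sed -n 's/^Submitted batch job \([0-9][0-9]*\).*/\1/p'
}
```

One possible use is `jobid=$(parse_job_id "$(sbatch run.sub)")`, followed by `squeue -j "$jobid"` to watch that job. Depending on where the job was submitted, squeue may also need a `--clusters` option.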
For details on configuring the model and running the NWA12 example case on various platforms, please refer to this page.
Keep in mind that C5 and C6 currently do not support cross-site job submission. If you are on C5 and want to submit a C6 job, you'll need to switch to a C6 login node first:
ssh gaea66
After that, you can submit your SLURM job on C6 using the following example script:
#!/bin/bash
#SBATCH --nodes=5
#SBATCH --time=60
#SBATCH --job-name="NWA12.COBALT"
#SBATCH --output=NWA12.COBALT_o.%j
#SBATCH --error=NWA12.COBALT_e.%j
#SBATCH --qos=debug
#SBATCH --partition=batch
#SBATCH --clusters=c6
#SBATCH --account=ira-cefi
# Your command below
At GFDL, we utilize the FMS Runtime Environment (FRE) as a comprehensive toolset for managing experiments throughout their lifecycle. This includes tasks such as acquiring source code, compiling code, launching model runs, and post-processing outputs.
FRE consists of the following elements:
- Experiment Suite XML File: contains information about how to acquire and compile the model sources, how to run the compiled model, and how to post-process its output data. Users need to create this file to describe their models (many examples have been provided).
- Templates: for compiling and running models (FRE-generated scripts do not require modification).
- Conventions: such as directory structure and naming.
- Command Line Utilities: read the experiment suite description file and perform a specific set of functions:
  - fremake: checks out the model's code, creates and submits compile scripts
  - frerun: creates and submits runscripts
  - frepp: creates and submits post-processing scripts
  - frecheck: compares regression test runs
  - frelist: lists existing experiments and details in your XML file
  - freppcheck: reports missing post-processing files
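Once the FRE module has been loaded (see the compile step later in this guide), a quick check that the utilities listed above are on your PATH can save a confusing failure later. This is a minimal sketch; the function name missing_tools is hypothetical and it relies only on the shell builtin `command -v`.

```shell
# Hypothetical helper: print each named command that is not on PATH.
# Returns non-zero if at least one command is missing.
missing_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `missing_tools fremake frerun frepp frecheck frelist freppcheck` prints nothing when all six utilities are available.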
The following diagram shows the relationship between frerun-created and frepp-created scripts:
The CEFI-regional-MOM6 repository includes recommended NWA12 configurations and example XML files. Below, we provide an overview of how to use the FRE workflow to conduct a 27-year retrospective run of the NWA12 model on Gaea C6.
- Compiling an Experiment on C6
The frelist tool can be used to list the names of experiments in your XML file:
> module use -a /ncrc/home2/fms/local/modulefiles
> module load fre/bronx-22
> cd CEFI-regional-MOM6/xmls/NWA12
> frelist -p ncrc6.intel23 -x CEFI_NWA12_cobalt.xml
MOM6_SIS2_GENERIC_4P_compile_symm
CEFI_NWA12_COBALT_V1 INHERITS FROM MOM6_SIS2_GENERIC_4P_compile_symm
Two experiments are listed. The first, MOM6_SIS2_GENERIC_4P_compile_symm, handles model compilation; the second, CEFI_NWA12_COBALT_V1, inherits from it.
To check out and compile an experiment, use fremake:
fremake -x CEFI_NWA12_cobalt.xml -p ncrc6.intel23 -t prod MOM6_SIS2_GENERIC_4P_compile_symm
where -p specifies the platform and compiler version, and -t denotes the target option (default is prod). Available options include debug, repro, prod, hdf5, and openmp.
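When fremake is called from a wrapper script, validating the target name first gives a clearer error than a failed build. The sketch below checks a single target name against the options listed above; the function name is_valid_target is hypothetical, and it makes no attempt to handle any combined target syntax that FRE may accept.

```shell
# Hypothetical helper: succeed if the argument is one of the single
# target names listed in this guide (debug, repro, prod, hdf5, openmp).
is_valid_target() {
    case "$1" in
        debug|repro|prod|hdf5|openmp) return 0 ;;
        *) return 1 ;;
    esac
}
```

A wrapper could then run `is_valid_target "$target" || { echo "unknown target: $target" >&2; exit 1; }` before invoking fremake.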
fremake
creates a checkout script in the
At the end of the generated output will be a command to submit the script to the batch scheduler. It will resemble:
TO SUBMIT => sbatch /home/$USER/.../experiment_name/.../exec/compile_experiment_name.csh
If you prefer, you can simply run this script in your interactive session.
To compile in batch mode, copy the submission command and run it. A job number will appear, and the experiment begins compiling. If Slurm is your job scheduler, you can check the status of the batch job with:
squeue -u $USER
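Rather than copying the submission command by hand, you can pull it out of saved output. The sketch below assumes the output contains a line of the form shown above ("TO SUBMIT =>" followed by the command); the function name extract_submit_cmd is hypothetical, and whether fremake writes this line to standard output or standard error is not something this guide states.

```shell
# Hypothetical helper: read text on stdin and print whatever follows
# "TO SUBMIT =>" on the last line that contains it.
# Prints nothing if no such line is present.
extract_submit_cmd() {
    sed -n 's/^.*TO SUBMIT => *//p' | tail -n 1
}
```

For instance, after saving the fremake output to a file (here a hypothetical fremake.log), `extract_submit_cmd < fremake.log` prints the sbatch command, which you can inspect before running.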
- Running an Experiment
After your compilation has finished successfully, the next step is to submit the run scripts using frerun. Two types of runs are available for most experiments, depending on the XML tags within the experiment's <runtime> tag: regression runs and production runs. Regression runs are short runs typically executed for testing purposes. Production runs are full-length model runs that can continue over multiple batch jobs.
Execute the following for a regression run. The string "test" indicates that a model run will be created for each of the <run> tags inside the <regression> tag with the attribute name="test".
frerun -r test -p ncrc6.intel23 -x CEFI_NWA12_cobalt.xml -t prod CEFI_NWA12_COBALT_V1
Execute the following for a production run:
frerun -p ncrc6.intel23 -x CEFI_NWA12_cobalt.xml -t prod CEFI_NWA12_COBALT_V1
As with fremake, a submit command will appear at the end of the frerun command's output.
TO SUBMIT => sbatch /home/$USER/.../experiment_name/.../scripts/run/experiment_name
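The two frerun invocations above differ only in the -r option. A small helper can make that choice explicit in a script; this is a sketch that only prints the command line, and the function name frerun_cmd and its argument order are hypothetical.

```shell
# Hypothetical helper: print a frerun command line using the flags shown above.
# Arguments: regression-name xml-file platform target experiment
# An empty regression name selects a production run.
frerun_cmd() {
    regression="$1"; xml="$2"; platform="$3"; target="$4"; experiment="$5"
    if [ -n "$regression" ]; then
        echo "frerun -r $regression -p $platform -x $xml -t $target $experiment"
    else
        echo "frerun -p $platform -x $xml -t $target $experiment"
    fi
}
```

Calling `frerun_cmd test CEFI_NWA12_cobalt.xml ncrc6.intel23 prod CEFI_NWA12_COBALT_V1` prints the regression command, and passing an empty string as the first argument prints the production command.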
As with fremake, you can run interactively by executing the script directly. To do so, first start an interactive session on the compute partition with access to enough processors to run the job. Otherwise, to run in batch mode, copy the job submission command and execute it. A job number will appear, and the experiment begins running. To check the status of this job, use squeue.
For more detailed information on the FRE workflow, refer to the FRE tutorial page.