Workflow: Science Development

Matthew Thompson edited this page Nov 11, 2020 · 13 revisions

In this process, the scientist starts from an approved tag (e.g. v10.16.3) on some top level fixture (e.g. GEOSgcm) and wants all other science aspects to remain frozen while developing a new feature. In this example, changes are required in the components GEOSgcm_GridComp and GEOSgcm_App.

A typical mepo workflow can be as follows:

Cloning with mepo

Clone top level fixture with All-in-One mepo clone

$ mepo clone -b v10.16.3 git@github.com:GEOS-ESM/GEOSgcm.git v10.16.3
Initializing mepo using components.yaml
env                    | (t) v3.0.1
cmake                  | (t) v3.2.1
ecbuild                | (t) geos/v1.0.5
NCEP_Shared            | (t) v1.0.1
GMAO_Shared            | (t) v1.3.3
MAPL                   | (t) v2.3.4
FMS                    | (t) geos/2019.01.02+noaff.1
GEOSgcm_GridComp       | (t) v1.11.2
FVdycoreCubed_GridComp | (t) v1.2.5
fvdycore               | (t) geos/v1.1.3
GEOSchem_GridComp      | (t) v1.4.1
mom                    | (t) geos/5.1.0+1.1.1
mom6                   | (t) geos/v1.1.0
GEOSgcm_App            | (t) v1.3.10
UMD_Etc                | (t) v1.0.3
CPLFCST_Etc            | (t) v1.0.1

Separate Commands

Note that the mepo clone -b command is actually a combination of a few mepo commands. What this does is:

  1. Clone from git (with the given tag at -b)
  2. Change directory into clone
  3. Run mepo init
  4. Run mepo clone
  5. Change back up to the parent directory
$ git clone -b v10.16.3 git@github.com:GEOS-ESM/GEOSgcm.git v10.16.3
$ cd v10.16.3
$ mepo init
Initializing mepo using components.yaml
$ mepo clone
env                    | (t) v3.0.1
cmake                  | (t) v3.2.1
ecbuild                | (t) geos/v1.0.5
NCEP_Shared            | (t) v1.0.1
GMAO_Shared            | (t) v1.3.3
MAPL                   | (t) v2.3.4
FMS                    | (t) geos/2019.01.02+noaff.1
GEOSgcm_GridComp       | (t) v1.11.2
FVdycoreCubed_GridComp | (t) v1.2.5
fvdycore               | (t) geos/v1.1.3
GEOSchem_GridComp      | (t) v1.4.1
mom                    | (t) geos/5.1.0+1.1.1
mom6                   | (t) geos/v1.1.0
GEOSgcm_App            | (t) v1.3.10
UMD_Etc                | (t) v1.0.3
CPLFCST_Etc            | (t) v1.0.1
$ cd ..

mepo reads a list of components (e.g. components.yaml) and saves it as an internal state. All subsequent mepo commands work off of that saved state. If the default config file, components.yaml, is not present in the directory from where mepo is run, one can run mepo init /path/to/mepo/config/file. Running mepo clone then clones all the "subcomponents" that are included in components.yaml.

mepo status

One of the most useful commands is mepo status which presents the current status of your multi-repository checkout. Running this now will produce output much like the output seen at the end of the mepo clone above:

$ mepo status
Checking status...
GEOSgcm                | (t) v10.16.3 (DH)
env                    | (t) v3.0.1 (DH)
cmake                  | (t) v3.2.1 (DH)
ecbuild                | (t) geos/v1.0.5 (DH)
NCEP_Shared            | (t) v1.0.1 (DH)
GMAO_Shared            | (t) v1.3.3 (DH)
MAPL                   | (t) v2.3.4 (DH)
FMS                    | (t) geos/2019.01.02+noaff.1 (DH)
GEOSgcm_GridComp       | (t) v1.11.2 (DH)
FVdycoreCubed_GridComp | (t) v1.2.5 (DH)
fvdycore               | (t) geos/v1.1.3 (DH)
GEOSchem_GridComp      | (t) v1.4.1 (DH)
mom                    | (t) geos/5.1.0+1.1.1 (DH)
mom6                   | (t) geos/v1.1.0 (DH)
GEOSgcm_App            | (t) v1.3.10 (DH)
UMD_Etc                | (t) v1.0.3 (DH)
CPLFCST_Etc            | (t) v1.0.1 (DH)

Because we cloned a fixed tag of GEOSgcm, all components are on tags ((t)). Other markers you might see are branches ((b)) and hashes ((h)). Note that every component is in a detached head ((DH)) state. As you perform operations on the repos, this output will change (see below).
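
Because the status output is one "name | state" line per component, it is easy to filter in a script. The sketch below is only an illustration and not part of mepo: the function name components_on_branch is hypothetical, and it assumes the plain-text layout shown above with no color codes. It reads status text on standard input and prints the components that are on a branch.

```shell
# Hypothetical helper (not a mepo command): list components whose status
# line carries the (b) marker, i.e. components that are on a branch.
# Assumes the uncolored "name | (x) ref" layout shown on this page.
components_on_branch() {
    awk -F'|' '
        # Component lines start in column 1; indented lines are per-file details.
        /^[^ ]/ && $2 ~ /^ *\(b\)/ {
            gsub(/ +$/, "", $1)   # trim the padding after the name
            print $1
        }
    '
}

# Usage: mepo status | components_on_branch
```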

Note that most mepo commands other than init and clone will work anywhere in the multi-repo checkout.

Building the model

At this point, you have the model as it is defined in components.yaml. You can build the model now or at any stage. This can be done either with parallel_build.csh or with a manual CMake/make procedure.

Using parallel_build.csh

To build with parallel_build.csh, run parallel_build.csh. To build with debugging, run parallel_build.csh -debug.

Manual CMake/Make

Running CMake

To build manually on, say, a compute node you can run:

$ mkdir build
$ cd build
$ cmake .. -DBASEDIR=$BASEDIR/Linux -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_INSTALL_PREFIX=../install
...lots of CMake output

This tells CMake you want to install to ../install. (The -DCMAKE_Fortran_COMPILER=ifort option is not strictly required on NCCS Discover, but it is required at NAS because of differences in modulefiles.)

Building with Debugging

To tell CMake you want to build with debugging flags, add -DCMAKE_BUILD_TYPE=Debug. By default, GEOS uses Release, which is optimized.

Running Make

Once CMake is complete, you can now build with GNU Make by running:

$ make -jN install

where N is the number of parallel jobs. Note that it is preferable to build on a compute node when doing so interactively. If you build on a head node, you should limit yourself to make -j3 or fewer, as higher numbers can have a deleterious effect on other users. On a compute node, GEOS scales pretty well to, say, make -j10 install, but there is little benefit past that (and the build can even get slightly slower due to memory contention).
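
One way to follow this advice in a script is to cap the job count. This is a minimal sketch, assuming GNU coreutils nproc is available; the helper name capped_jobs and its default cap of 10 are illustrative choices based on the guidance above, not something GEOS provides.

```shell
# Hypothetical helper: print the smaller of a core count and a cap.
#   $1 = number of cores (e.g. from nproc), $2 = cap (defaults to 10)
capped_jobs() {
    cores=$1
    cap=${2:-10}
    if [ "$cores" -gt "$cap" ]; then
        echo "$cap"
    else
        echo "$cores"
    fi
}

# Usage on a compute node (nproc is from GNU coreutils):
#   make -j"$(capped_jobs "$(nproc)")" install
# Usage on a head node, keeping to the limit of 3:
#   make -j"$(capped_jobs "$(nproc)" 3)" install
```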

Create and checkout new feature branches

Now, after you run mepo clone, you will be in a "detached head" state in all components. This was done to echo how both manage_externals and CVS work. When you check out a tag in CVS, you are on a "sticky tag" which cannot be committed to. In CVS you must check out a branch to commit on, and so must you with mepo.

Create feature branches

GEOS recommends the use of Git-Flow style branch naming in the style of:

branchtype/username/description

The username is self-explanatory and is mainly there for easy determination of who a branch might belong to (as branches often can become stale or abandoned).

Branch Type

There are three general branch types:

  1. feature

    Feature branches introduce new features to the model. They should branch from develop and merge back to develop.

  2. bugfix

    Bugfix branches are for bugs that are in develop. The name helps the Gatekeepers recognize that the change fixes a bug and should be taken care of. Like feature branches, you branch from develop and merge to develop.

  3. hotfix

    Hotfix branches are a (hopefully) rare and special type of bugfix branch. Hotfixes are done against main as they represent a bug that even current, stable releases have. In this case, you branch from main (so you don't run mepo develop) and merge into main. Then you usually immediately merge main into develop as hotfixes are usually bugs in all branches!

Description

The description mainly helps identify the purpose of a feature branch. We recommend that users first file an Issue on the repositories where the feature is going. You can then prepend the issue number to your feature branch description. So let's say you open an issue and it is issue #345; then you can make a branch name like:

feature/mathomp4/#345-awesome-new-feature
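
The naming convention can be captured in a small shell function. This is only a sketch: make_branch_name is a hypothetical helper, not a mepo or GEOS tool, and it simply checks the branch type against the three types listed above and joins the pieces.

```shell
# Hypothetical helper: assemble branchtype/username/#issue-description.
#   $1 = branch type, $2 = username, $3 = issue number, $4 = description
make_branch_name() {
    case $1 in
        feature|bugfix|hotfix) ;;
        *) echo "unknown branch type: $1" >&2; return 1 ;;
    esac
    # Replace spaces in the description with hyphens.
    desc=$(printf '%s' "$4" | tr ' ' '-')
    printf '%s/%s/#%s-%s\n' "$1" "$2" "$3" "$desc"
}

# Usage:
#   make_branch_name feature mathomp4 345 "awesome new feature"
```

The # is safe here because it sits in the middle of a word; the shell treats # as the start of a comment only at the beginning of a word.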

For example, starting from detached head, we will now check out a new feature branch feature/mathomp4/#345-awesome-new-feature in components GEOSgcm_GridComp and GEOSgcm_App:

$ mepo checkout -b feature/mathomp4/#345-awesome-new-feature GEOSgcm_GridComp GEOSgcm_App
+ GEOSgcm_GridComp: feature/mathomp4/#345-awesome-new-feature
+ GEOSgcm_App: feature/mathomp4/#345-awesome-new-feature

The above command is equivalent to running the following two commands

$ mepo branch create feature/mathomp4/#345-awesome-new-feature GEOSgcm_GridComp GEOSgcm_App
$ mepo checkout feature/mathomp4/#345-awesome-new-feature GEOSgcm_GridComp GEOSgcm_App

If you now run mepo status, you'll see:

$ mepo status
Checking status...
GEOSgcm                | (t) v10.16.3 (DH)
env                    | (t) v3.0.1 (DH)
cmake                  | (t) v3.2.1 (DH)
ecbuild                | (t) geos/v1.0.5 (DH)
NCEP_Shared            | (t) v1.0.1 (DH)
GMAO_Shared            | (t) v1.3.3 (DH)
MAPL                   | (t) v2.3.4 (DH)
FMS                    | (t) geos/2019.01.02+noaff.1 (DH)
GEOSgcm_GridComp       | (b) feature/mathomp4/#345-awesome-new-feature
FVdycoreCubed_GridComp | (t) v1.2.5 (DH)
fvdycore               | (t) geos/v1.1.3 (DH)
GEOSchem_GridComp      | (t) v1.4.1 (DH)
mom                    | (t) geos/5.1.0+1.1.1 (DH)
mom6                   | (t) geos/v1.1.0 (DH)
GEOSgcm_App            | (b) feature/mathomp4/#345-awesome-new-feature
UMD_Etc                | (t) v1.0.3 (DH)
CPLFCST_Etc            | (t) v1.0.1 (DH)

Notice how GEOSgcm_GridComp and GEOSgcm_App are now changed to new branches ((b)) with no detached head. (In an actual console with colors available, the repo names will be colorized as red.)

mepo compare

Another command worth mentioning is mepo compare which compares the current state of the checkout to that of the original (or last saved, see below) state:

$ mepo compare
Repo                   | Original                         | Current
---------------------- | -------------------------------- | -------
GEOSgcm                | (t) v10.16.3 (DH)                | (t) v10.16.3 (DH)
env                    | (t) v3.0.1 (DH)                  | (t) v3.0.1 (DH)
cmake                  | (t) v3.2.1 (DH)                  | (t) v3.2.1 (DH)
ecbuild                | (t) geos/v1.0.5 (DH)             | (t) geos/v1.0.5 (DH)
NCEP_Shared            | (t) v1.0.1 (DH)                  | (t) v1.0.1 (DH)
GMAO_Shared            | (t) v1.3.3 (DH)                  | (t) v1.3.3 (DH)
MAPL                   | (t) v2.3.4 (DH)                  | (t) v2.3.4 (DH)
FMS                    | (t) geos/2019.01.02+noaff.1 (DH) | (t) geos/2019.01.02+noaff.1 (DH)
GEOSgcm_GridComp       | (t) v1.11.2 (DH)                 | (b) feature/mathomp4/#345-awesome-new-feature
FVdycoreCubed_GridComp | (t) v1.2.5 (DH)                  | (t) v1.2.5 (DH)
fvdycore               | (t) geos/v1.1.3 (DH)             | (t) geos/v1.1.3 (DH)
GEOSchem_GridComp      | (t) v1.4.1 (DH)                  | (t) v1.4.1 (DH)
mom                    | (t) geos/5.1.0+1.1.1 (DH)        | (t) geos/5.1.0+1.1.1 (DH)
mom6                   | (t) geos/v1.1.0 (DH)             | (t) geos/v1.1.0 (DH)
GEOSgcm_App            | (t) v1.3.10 (DH)                 | (b) feature/mathomp4/#345-awesome-new-feature
UMD_Etc                | (t) v1.0.3 (DH)                  | (t) v1.0.3 (DH)
CPLFCST_Etc            | (t) v1.0.1 (DH)                  | (t) v1.0.1 (DH)
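
In a checkout with many components, the rows that actually changed can be hard to spot. The following sketch filters the table down to those rows. It is an illustration only: changed_components is a hypothetical name, and it assumes the uncolored three-column layout shown above.

```shell
# Hypothetical helper: print only the rows of a compare table whose
# Original and Current columns differ. Assumes the uncolored
# "Repo | Original | Current" layout shown above.
changed_components() {
    awk -F'|' '
        NF >= 3 {
            for (i = 1; i <= 3; i++) {
                gsub(/^ +| +$/, "", $i)   # trim column padding
            }
            # Skip the header row and the dashed separator row.
            if ($1 == "Repo" || $1 ~ /^-+$/) next
            if ($2 != $3) {
                print $1 ": " $2 " -> " $3
            }
        }
    '
}

# Usage: mepo compare | changed_components
```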

Other commands involving branches are:

👉 mepo branch list prints local branches of all components
👉 mepo branch list -a prints local and remote branches of all components
👉 mepo branch delete <branch-name> GEOSgcm_GridComp GEOSgcm_App deletes branch <branch-name> of the specified components

Develop and commit changes on feature branches

Much like a regular git workflow, you will now make your changes and then stage and commit them. So let's say you've made changes. If you run mepo status you'll see what you've done:

$ mepo status
Checking status...
GEOSgcm                | (t) v10.16.3 (DH)
env                    | (t) v3.0.1 (DH)
cmake                  | (t) v3.2.1 (DH)
ecbuild                | (t) geos/v1.0.5 (DH)
NCEP_Shared            | (t) v1.0.1 (DH)
GMAO_Shared            | (t) v1.3.3 (DH)
MAPL                   | (t) v2.3.4 (DH)
FMS                    | (t) geos/2019.01.02+noaff.1 (DH)
GEOSgcm_GridComp       | (b) feature/mathomp4/#345-awesome-new-feature
   | GEOS_GcmGridComp.F90: modified, not staged
FVdycoreCubed_GridComp | (t) v1.2.5 (DH)
fvdycore               | (t) geos/v1.1.3 (DH)
GEOSchem_GridComp      | (t) v1.4.1 (DH)
mom                    | (t) geos/5.1.0+1.1.1 (DH)
mom6                   | (t) geos/v1.1.0 (DH)
GEOSgcm_App            | (b) feature/mathomp4/#345-awesome-new-feature
   | AGCM.rc.tmpl: modified, not staged
UMD_Etc                | (t) v1.0.3 (DH)
CPLFCST_Etc            | (t) v1.0.1 (DH)

Notice it says you have modified, not staged changes in the two repos.

If you would like a diff, you can run mepo diff which runs git diff on the subrepos:

$ mepo diff
Diffing...
GEOSgcm_GridComp (location: src/Components/@GEOSgcm_GridComp):

diff --git a/GEOS_GcmGridComp.F90 b/GEOS_GcmGridComp.F90
index f66835a..5bde89e 100644
--- a/GEOS_GcmGridComp.F90
+++ b/GEOS_GcmGridComp.F90
@@ -165,6 +165,8 @@ contains

     call MAPL_GetObjectFromGC ( GC, MAPL, RC=STATUS)
     VERIFY_(STATUS)
+    call MAPL_GetResource ( MAPL, NEW_FEATURE, Label="NEW_FEATURE:" , DEFAULT=.FALSE., RC=STATUS)
+    VERIFY_(STATUS)

 ! Get constants from CF
 ! ---------------------
─────────────────────────────────────────────────────────────────────────────────────────
GEOSgcm_App (location: src/Applications/@GEOSgcm_App):

diff --git a/AGCM.rc.tmpl b/AGCM.rc.tmpl
index 0f1f7d2..7fd030a 100644
--- a/AGCM.rc.tmpl
+++ b/AGCM.rc.tmpl
@@ -37,6 +37,8 @@ IRRADAvrg: 0
 # UNCOMMENT to disable aerosol activation in 1-moment cloud microphysics
 #USE_AEROSOL_NN: 0

+NEW_FEATURE: .TRUE.
+
 ###########################################################
 # Flag for definition of the convection scheme
 # The options are RAS or GF
─────────────────────────────────────────────────────────────────────────────────────────

Staging changes

Here you encounter one of the differences between mepo and git. In pure git you run git add to stage changes. In mepo, you run mepo stage <repo> <repo>... to stage them:

$ mepo stage GEOSgcm_GridComp GEOSgcm_App
+ GEOSgcm_GridComp: GEOS_GcmGridComp.F90
+ GEOSgcm_App: AGCM.rc.tmpl

Now if you run mepo status again:

$ mepo status
Checking status...
GEOSgcm                | (t) v10.16.3 (DH)
env                    | (t) v3.0.1 (DH)
cmake                  | (t) v3.2.1 (DH)
ecbuild                | (t) geos/v1.0.5 (DH)
NCEP_Shared            | (t) v1.0.1 (DH)
GMAO_Shared            | (t) v1.3.3 (DH)
MAPL                   | (t) v2.3.4 (DH)
FMS                    | (t) geos/2019.01.02+noaff.1 (DH)
GEOSgcm_GridComp       | (b) feature/mathomp4/#345-awesome-new-feature
   | GEOS_GcmGridComp.F90: modified, staged
FVdycoreCubed_GridComp | (t) v1.2.5 (DH)
fvdycore               | (t) geos/v1.1.3 (DH)
GEOSchem_GridComp      | (t) v1.4.1 (DH)
mom                    | (t) geos/5.1.0+1.1.1 (DH)
mom6                   | (t) geos/v1.1.0 (DH)
GEOSgcm_App            | (b) feature/mathomp4/#345-awesome-new-feature
   | AGCM.rc.tmpl: modified, staged
UMD_Etc                | (t) v1.0.3 (DH)
CPLFCST_Etc            | (t) v1.0.1 (DH)

To stage untracked files as well in the specified components, run

$ mepo stage --untracked GEOSgcm_GridComp GEOSgcm_App

A requirement for staging is that the branch is not in a 'detached head' state.

👉 mepo unstage will unstage changes in all components
👉 mepo unstage <component> unstages changes in the specified component

Committing changes

Now that you have staged your changes, you need to commit them. This is done with mepo commit:

$ mepo commit -m "Closes #345. Commit awesome new feature" GEOSgcm_GridComp GEOSgcm_App
+ GEOSgcm_GridComp: GEOS_GcmGridComp.F90
+ GEOSgcm_App: AGCM.rc.tmpl

or you can do:

$ mepo commit GEOSgcm_GridComp GEOSgcm_App

and it will open $EDITOR for you to write a commit message for all repos.

Closing Keywords

Note that Closes #<issue-number>. is a keyword phrase that tells GitHub that when this branch is finally merged, the issue can be automatically closed.

Now another status:

$ mepo status
Checking status...
GEOSgcm                | (t) v10.16.3 (DH)
env                    | (t) v3.0.1 (DH)
cmake                  | (t) v3.2.1 (DH)
ecbuild                | (t) geos/v1.0.5 (DH)
NCEP_Shared            | (t) v1.0.1 (DH)
GMAO_Shared            | (t) v1.3.3 (DH)
MAPL                   | (t) v2.3.4 (DH)
FMS                    | (t) geos/2019.01.02+noaff.1 (DH)
GEOSgcm_GridComp       | (b) feature/mathomp4/#345-awesome-new-feature
FVdycoreCubed_GridComp | (t) v1.2.5 (DH)
fvdycore               | (t) geos/v1.1.3 (DH)
GEOSchem_GridComp      | (t) v1.4.1 (DH)
mom                    | (t) geos/5.1.0+1.1.1 (DH)
mom6                   | (t) geos/v1.1.0 (DH)
GEOSgcm_App            | (b) feature/mathomp4/#345-awesome-new-feature
UMD_Etc                | (t) v1.0.3 (DH)
CPLFCST_Etc            | (t) v1.0.1 (DH)

This shows that git no longer sees any uncommitted or unstaged changes.

Pushing changes

Now you can push your changes to GitHub with:

$ mepo push GEOSgcm_GridComp GEOSgcm_App
----------
Pushed: GEOSgcm_GridComp
----------
Branch 'feature/mathomp4/#345-awesome-new-feature' set up to track remote branch 'feature/mathomp4/#345-awesome-new-feature' from 'git@github.com:GEOS-ESM/GEOSgcm_GridComp.git'.
Warning: No xauth data; using fake authentication data for X11 forwarding.
X11 forwarding request failed on channel 0
remote:
remote: Create a pull request for 'feature/mathomp4/#345-awesome-new-feature' on GitHub by visiting:
remote:      https://github.com/GEOS-ESM/GEOSgcm_GridComp/pull/new/feature/mathomp4/#345-awesome-new-feature
remote:
To github.com:GEOS-ESM/GEOSgcm_GridComp.git
 * [new branch]      feature/mathomp4/#345-awesome-new-feature -> feature/mathomp4/#345-awesome-new-feature
----------
Pushed: GEOSgcm_App
----------
Branch 'feature/mathomp4/#345-awesome-new-feature' set up to track remote branch 'feature/mathomp4/#345-awesome-new-feature' from 'git@github.com:GEOS-ESM/GEOSgcm_App.git'.
Warning: No xauth data; using fake authentication data for X11 forwarding.
X11 forwarding request failed on channel 0
remote:
remote: Create a pull request for 'feature/mathomp4/#345-awesome-new-feature' on GitHub by visiting:
remote:      https://github.com/GEOS-ESM/GEOSgcm_App/pull/new/feature/mathomp4/#345-awesome-new-feature
remote:
To github.com:GEOS-ESM/GEOSgcm_App.git
 * [new branch]      feature/mathomp4/#345-awesome-new-feature -> feature/mathomp4/#345-awesome-new-feature

Merging with development

If your science development takes a while, it might be necessary to sync up with the develop branches on GitHub so that a good PR can be made. To do this, we'll first check out the develop branches using mepo develop:

$ mepo develop GEOSgcm_GridComp GEOSgcm_App

mepo develop looks inside components.yaml to find the branch specified for development. For these two repos, that is develop. Then go back to the feature branches:

$ mepo checkout feature/mathomp4/#345-awesome-new-feature GEOSgcm_GridComp GEOSgcm_App

Next, merge the changes in the develop branches into the feature branches. This step needs to be done manually, using git commands, inside each component's directory.

For our example, where <develop-branch> is develop for both repos:

$ git merge <develop-branch>
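
The manual merge can be sketched as a small loop over the two components. This is only an illustration: merge_develop and DRY_RUN are hypothetical names, the component paths are the locations reported by mepo diff earlier on this page, and the sketch assumes it is run from the top of the checkout with local develop branches that are up to date.

```shell
# Sketch: merge develop into the current feature branch of each component.
# The paths are the component locations reported by "mepo diff" on this page.
# Set DRY_RUN=1 to only print the commands instead of running them.
merge_develop() {
    for repo in src/Components/@GEOSgcm_GridComp src/Applications/@GEOSgcm_App; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "git -C $repo merge develop"
        else
            # Stop at the first failed merge so conflicts can be resolved by hand.
            git -C "$repo" merge develop || return 1
        fi
    done
}

# Usage (from the top of the checkout):
#   DRY_RUN=1 merge_develop   # preview
#   merge_develop             # perform the merges
```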

Depending on how long your work has taken, this might cause conflicts that are often easily solved. If you have issues, contact the SI Team. After that, you can re-push your branches (after testing!):

$ mepo push GEOSgcm_GridComp GEOSgcm_App

though you might need to do more staging, commits, etc.

Finally, go to the remote location (GitHub or similar) of each component and issue pull/merge requests. This step of issuing pull/merge requests can probably be handled by mepo at some point.

Save state

mepo works by reading a list of components (default: components.yaml) and saving it as an internal state. All subsequent mepo commands work off that saved state. At any point during the development stage, the current state can be saved to both a new internal state and a config file by issuing the command

$ mepo save
Components written to '/Users/mathomp4/Temp/v10.16.3/components-new.yaml'

mepo save requires that each modified component is on a branch that is synced with the remote. Running mepo save without a file name creates components-new.yaml in the directory where the command was run. A different config file name can be specified via mepo save <new-config-file.yaml>.