Clone, build, and run C48_ATM and C48_S2SW on Gaea C5 and C6 #3106

Open
wants to merge 35 commits into base: develop


DavidBurrows-NCO
Contributor

@DavidBurrows-NCO DavidBurrows-NCO commented Nov 15, 2024

Description

What:
Correct build/run for C48_ATM and C48_S2SW on Gaea C5. Add build and run capability for C48_ATM, C48_S2SW, and C96_atm3DVar on Gaea C6.
Why:
After the C5 OS upgrade, the submodules no longer built in global-workflow. This PR corrects that and adds build/run capability on C6.

Resolves #3011
Depends on:
ufs-community/ufs-weather-model#2448
ufs-community/UFS_UTILS#995
NOAA-EMC/gfs-utils#87
NOAA-EMC/UPP#1070
NOAA-EMC/GSI#800
NOAA-EMC/GSI-utils#55
NOAA-EMC/GSI-Monitor#146
NOAA-EMC/GDASApp#1361

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)

Change characteristics

How has this been tested?

C5 and C6: cloned, built, and ran C48_ATM and C48_S2SW successfully.
C96_atm3DVar is hanging in sfcanl jobs.

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • Any new scripts have been added to the .github/CODEOWNERS file with owners
  • I have made corresponding changes to the system documentation if necessary

@DavidBurrows-NCO
Contributor Author

Hi @aerorahul @WalterKolczynski-NOAA We're still waiting on build merges for some submodules, so I've left this PR in draft. From our conversation Tuesday, I've pointed the submodules that were merged to their respective head of develop and the others to my commit for now. Should I be pointing to my submodule commits instead to limit the number of changes coming into GW? Thanks

@jswhit
Contributor

jswhit commented Nov 15, 2024

sorc/build_all.sh needs the following update:

--- sorc/build_all.sh
+++ sorc/build_all.sh
@@ -149,7 +149,7 @@ build_opts["ww3prepost"]="${_wave_opt} ${_verbose_opt} ${_build_ufs_opt} ${_buil

 # Optional DA builds
 if [[ "${_build_ufsda}" == "YES" ]]; then
-   if [[ "${MACHINE_ID}" != "orion" && "${MACHINE_ID}" != "hera" && "${MACHINE_ID}" != "hercules" && "${MACHINE_ID}" != "wcoss2" && "${MACHINE_ID}" != "noaacloud" && "${MACHINE_ID}" != "gaea" ]]; then
+   if [[ "${MACHINE_ID}" != "orion" && "${MACHINE_ID}" != "hera" && "${MACHINE_ID}" != "hercules" && "${MACHINE_ID}" != "wcoss2" && "${MACHINE_ID}" != "noaacloud" && "${MACHINE_ID}" != "gaeac5" && "${MACHINE_ID}" != "gaeac6" ]]; then
       echo "NOTE: The GDAS App is not supported on ${MACHINE_ID}.  Disabling build."
    else
       build_jobs["gdas"]=8
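
The chain of `!=` tests in the diff above can also be read as a membership check. A minimal sketch, assuming a hypothetical helper named `gdas_supported` that is not part of sorc/build_all.sh, and an illustrative default for `MACHINE_ID`:

```shell
#!/usr/bin/env bash

# Hypothetical helper (not in sorc/build_all.sh): succeeds when the GDAS App
# build is supported on the given machine, using the list from the diff above.
gdas_supported() {
  case "$1" in
    orion | hera | hercules | wcoss2 | noaacloud | gaeac5 | gaeac6) return 0 ;;
    *) return 1 ;;
  esac
}

MACHINE_ID="${MACHINE_ID:-gaeac6}"  # assumed default, for illustration only
if gdas_supported "${MACHINE_ID}"; then
  echo "GDAS App build enabled on ${MACHINE_ID}"
else
  echo "NOTE: The GDAS App is not supported on ${MACHINE_ID}.  Disabling build."
fi
```

A `case` list keeps each machine name on one short line, so adding a platform is a one-word change rather than another `&&` clause.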

@jswhit
Contributor

jswhit commented Nov 15, 2024

also ush/load_ufsda_modules.sh needs

--- a/ush/load_ufsda_modules.sh
+++ b/ush/load_ufsda_modules.sh
@@ -34,13 +34,13 @@ source "${HOMEgfs}/ush/module-setup.sh"
 module use "${HOMEgfs}/sorc/gdas.cd/modulefiles"

 case "${MACHINE_ID}" in
-  ("hera" | "orion" | "hercules" | "wcoss2")
+  ("hera" | "orion" | "hercules" | "gaeac5" | "gaeac6" | "wcoss2")
     module load "${MODS}/${MACHINE_ID}"
     ncdump=$( command -v ncdump )
     NETCDF=$( echo "${ncdump}" | cut -d " " -f 3 )
     export NETCDF
     ;;
-  ("jet" | "gaea" | "s4" | "acorn")
+  ("jet" | "s4" | "acorn")
     echo WARNING: UFSDA NOT SUPPORTED ON THIS PLATFORM
     ;;
   *)

@DavidBurrows-NCO
Contributor Author

also ush/load_ufsda_modules.sh needs

Thanks @jswhit I pushed changes to ush/load_ufsda_modules.sh and sorc/build_all.sh

@jswhit
Contributor

jswhit commented Nov 15, 2024

also...

workflow/hosts/gaeac6.yaml and gaeac5.yaml:

-QUEUE_SERVICE: normal
+QUEUE_SERVICE: hpss
 PARTITION_BATCH: batch
-PARTITION_SERVICE: batch
+PARTITION_SERVICE: dtn_f5_f6

and modulefiles/module_gwsetup.gaeac6.lua:

-prepend_path("MODULEPATH", "/ncrc/proj/epic/spack-stack/spack-stack-1.6.0/envs/unified-env/install/modulefiles/Core")
+prepend_path("MODULEPATH", "/ncrc/proj/epic/spack-stack/c6/spack-stack-1.6.0/envs/unified-env/install/modulefiles/Core")

@jswhit
Contributor

jswhit commented Nov 15, 2024

env/GAEAC5.env and GAEAC6.env seem to be missing a bunch of stuff. I just copied HERCULES.env for both, and made some minor mods (see https://github.com/jswhit2/global-workflow/blob/develop/env/GAEAC5.env)

@jswhit
Contributor

jswhit commented Nov 15, 2024

build_ww3prepost is failing for me on both c5 and c6 (using ufs-wx-model 2448)

@DavidBurrows-NCO
Contributor Author

missing a bunch of stuff

@jswhit The files aren't really missing anything; they were intentionally kept minimal, at EMC's request, for the port to a new machine. We started from a nearly blank canvas and have been building up. Currently, the GAEAC5.env and GAEAC6.env files are set up for the C48_ATM, C48_S2SW, and C96_atm3DVar jobs. The 3DVarAOWCDA configuration you're running will definitely need some additional jobs. If you send me those job names (the "step" in the env file), I will add them to the files.

@JessicaMeixner-NOAA
Contributor

build_ww3prepost is failing for me on both c5 and c6 (using ufs-wx-model 2448)

@jswhit - can you point me to a log file? Maybe I can look and see if something is easy to fix with this.

@jswhit
Contributor

jswhit commented Nov 18, 2024

build_ww3prepost is failing for me on both c5 and c6 (using ufs-wx-model 2448)

@jswhit - can you point me to a log file? Maybe I can look and see if something is easy to fix with this.

@JessicaMeixner-NOAA here is the error:

/gpfs/f6/ira-da/proj-shared/Jeffrey.S.Whitaker/global-workflow-jswhit2/sorc/ufs_model.fd/WW3/model/src/w3initmd.F90(451): error #7002: Error in opening the compiled module file.  Check INCLUDE paths.   [WAV_RESTART_MOD]
    use wav_restart_mod, only : read_restart
--------^
/gpfs/f6/ira-da/proj-shared/Jeffrey.S.Whitaker/global-workflow-jswhit2/sorc/ufs_model.fd/WW3/model/src/w3initmd.F90(975): error #6632: Keyword arguments are invalid without an explicit interface.   [VA]
            call read_restart(trim(fname), va=va, mapsta=mapsta, mapst2=mapst2)
-------------------------------------------^
/gpfs/f6/ira-da/proj-shared/Jeffrey.S.Whitaker/global-workflow-jswhit2/sorc/ufs_model.fd/WW3/model/src/w3initmd.F90(975): error #6632: Keyword arguments are invalid without an explicit interface.   [MAPSTA]
            call read_restart(trim(fname), va=va, mapsta=mapsta, mapst2=mapst2)
--------------------------------------------------^
/gpfs/f6/ira-da/proj-shared/Jeffrey.S.Whitaker/global-workflow-jswhit2/sorc/ufs_model.fd/WW3/model/src/w3initmd.F90(975): error #6632: Keyword arguments are invalid without an explicit interface.   [MAPST2]
            call read_restart(trim(fname), va=va, mapsta=mapsta, mapst2=mapst2)
-----------------------------------------------------------------^
/gpfs/f6/ira-da/proj-shared/Jeffrey.S.Whitaker/global-workflow-jswhit2/sorc/ufs_model.fd/WW3/model/src/w3initmd.F90(451): error #6580: Name in only-list does not exist or is not accessible.   [READ_RESTART]
    use wav_restart_mod, only : read_restart
--------------------------------^
compilation aborted for /gpfs/f6/ira-da/proj-shared/Jeffrey.S.Whitaker/global-workflow-jswhit2/sorc/ufs_model.fd/WW3/model/src/w3initmd.F90 (code 1)

@JessicaMeixner-NOAA
Contributor

@jswhit - Okay, I know what the issue is, but it'll take a minute to get it fixed. The issue crept in with ufs-community/ufs-weather-model#2445 and we didn't catch it. If you go back one commit in ufs-weather-model, hopefully things will run. We'll get a fix in as soon as possible.

@jswhit
Contributor

jswhit commented Nov 19, 2024

@JessicaMeixner-NOAA I'm seeing this error in the gdas_fcst step on c6 when I run with ufs-wx-model 2448

424:  (abort_ice)ABORTED:
424:  (abort_ice) error =
424:  (construct_filename) ERROR: history filename already used for another history s
424:  tream iceh_inst.2021-03-24-10800.nc

and the traceback looks like this

473: ufs_model.x        0000000005E9CD8B  ice_broadcast_mp_         252  ice_broadcast.F90
473: ufs_model.x        0000000005F055E3  ice_history_write         169  ice_history_write.F90
473: ufs_model.x        0000000005C2A4E2  ice_history_mp_ac        4134  ice_history.F90
473: ufs_model.x        0000000005EE77FC  cice_runmod_mp_ci         367  CICE_RunMod.F90
473: ufs_model.x        0000000005B7DA06  ice_comp_nuopc_mp        1204  ice_comp_nuopc.F90
473: ufs_model.x        0000000000D05438  Unknown               Unknown  Unknown

Do you know of any recent CICE changes that could cause this?

@JessicaMeixner-NOAA
Contributor

I don't know. I'm not as caught up on recent ufs-weather-model changes as I normally am, but a quick look at ufs-weather-model shows that CICE hasn't been updated in 2 months.

@jswhit
Contributor

jswhit commented Nov 19, 2024

For some more context on the cice error, from ice_diag.d:

(ice_comp_nuopc):(ModelAdvance) cice istep, nextsw_cday =         15      0.83111111111111D+02
 (ice_pio_init) create file ./CICE_OUTPUT/iceh_inst.2021-03-24-09600.nc

 Finished writing ./CICE_OUTPUT/iceh_inst.2021-03-24-09600.nc
(ice_comp_nuopc):(ModelAdvance) cice istep, nextsw_cday =         16      0.83118055555556D+02
 (ice_pio_init) create file ./CICE_OUTPUT/iceh_inst.2021-03-24-10200.nc

 Finished writing ./CICE_OUTPUT/iceh_inst.2021-03-24-10200.nc
(ice_comp_nuopc):(ModelAdvance) cice istep, nextsw_cday =         17      0.83125000000000D+02
 (ice_pio_init) create file ./CICE_OUTPUT/iceh_inst.2021-03-24-10800.nc

 Finished writing ./CICE_OUTPUT/iceh_inst.2021-03-24-10800.nc
 (construct_filename) history stream =            4
 (construct_filename) history filename = iceh_inst.2021-03-24-10800.nc
 (construct_filename) filename in use for stream            3
 (construct_filename) filename for stream iceh_inst.2021-03-24-10800.nc
 (construct_filename) Use namelist hist_suffix so history filenames are unique
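
The diagnostic above shows history streams 3 and 4 both resolving to the same output filename. A minimal sketch for pulling the colliding filenames out of a diagnostics file, assuming the log format matches the excerpt above and that the `history filename =` line is only written when a collision is reported; `colliding_history_files` is a hypothetical helper, not part of CICE or global-workflow:

```shell
#!/usr/bin/env bash

# Hypothetical helper: print each history filename that the
# construct_filename diagnostic reports, one per line, without repeats.
# Assumes lines of the form shown in the ice_diag.d excerpt above:
#   (construct_filename) history filename = <name>
colliding_history_files() {
  awk '/\(construct_filename\) history filename =/ { print $NF }' "$1" | sort -u
}
```

For example, `colliding_history_files ice_diag.d` would list the filenames to make unique, for instance through the `hist_suffix` namelist setting that the message itself recommends.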

@jswhit2 jswhit2 mentioned this pull request Nov 21, 2024
10 tasks
@jswhit2
Contributor

jswhit2 commented Nov 21, 2024

The problem with the ice model (and a potential fix) are documented in PR #3121

@emcbot

emcbot commented Dec 26, 2024

Experiment C96_atm3DVar FAILED on Hera in Build# 2 in
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/EXPDIR/C96_atm3DVar_73cc6bf4

@emcbot

emcbot commented Dec 26, 2024

Experiment C48mx500_3DVarAOWCDA FAILED on Hera in Build# 2 in
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/EXPDIR/C48mx500_3DVarAOWCDA_73cc6bf4

@emcbot

emcbot commented Dec 26, 2024

Experiment C48_S2SW FAILED on Hera in Build# 2 in
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/EXPDIR/C48_S2SW_73cc6bf4

@emcbot

emcbot commented Dec 26, 2024

Experiment C48_ATM FAILED on Hera in Build# 2 in
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/EXPDIR/C48_ATM_73cc6bf4

@emcbot

emcbot commented Dec 26, 2024

Experiment C48mx500_hybAOWCDA FAILED on Hera in Build# 2 with error logs:

/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48mx500_hybAOWCDA_73cc6bf4/logs/2021032418/enkfgdas_fcst_mem001.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48mx500_hybAOWCDA_73cc6bf4/logs/2021032418/enkfgdas_fcst_mem002.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48mx500_hybAOWCDA_73cc6bf4/logs/2021032418/gdas_fcst_seg0.log

Follow link here to view the contents of the above file(s): (link)

@emcbot

emcbot commented Dec 26, 2024

Experiment C96C48_hybatmDA FAILED on Hera in Build# 2 in
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/EXPDIR/C96C48_hybatmDA_73cc6bf4

@emcbot

emcbot commented Dec 26, 2024

Experiment C96C48_ufs_hybatmDA FAILED on Hera in Build# 2 in
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/EXPDIR/C96C48_ufs_hybatmDA_73cc6bf4

@emcbot

emcbot commented Dec 26, 2024

Experiment C96C48_hybatmaerosnowDA FAILED on Hera in Build# 2 in
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/EXPDIR/C96C48_hybatmaerosnowDA_73cc6bf4

@emcbot

emcbot commented Dec 26, 2024

Experiment C96_S2SWA_gefs_replay_ics FAILED on Hera in Build# 2 in
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/EXPDIR/C96_S2SWA_gefs_replay_ics_73cc6bf4

@emcbot

emcbot commented Dec 26, 2024

Experiment C48mx500_hybAOWCDA FAILED on Hera in Build# 2 in
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/EXPDIR/C48mx500_hybAOWCDA_73cc6bf4

@emcbot

emcbot commented Dec 26, 2024

Experiment C48_S2SWA_gefs FAILED on Hera in Build# 2 with error logs:

/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48_S2SWA_gefs_73cc6bf4/logs/2021032312/gefs_fcst_mem000_seg0.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48_S2SWA_gefs_73cc6bf4/logs/2021032312/gefs_fcst_mem001_seg0.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48_S2SWA_gefs_73cc6bf4/logs/2021032312/gefs_fcst_mem002_seg0.log

Follow link here to view the contents of the above file(s): (link)

@emcbot

emcbot commented Dec 26, 2024

Experiment C48_S2SWA_gefs FAILED on Hera in Build# 2 in
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/EXPDIR/C48_S2SWA_gefs_73cc6bf4

@emcbot emcbot added CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed and removed CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed labels Dec 26, 2024
@emcbot

emcbot commented Dec 26, 2024

CI Failed on Hera in Build# 2
Built and ran in directory /scratch1/NCEPDEV/global/CI/3106


Experiment C48mx500_3DVarAOWCDA_73cc6bf4 Terminated with 0
FAIL
FAIL tasks failed and 1 dead at Thu Dec 26 15:58:35 UTC 2024
Experiment C48mx500_3DVarAOWCDA_73cc6bf4 Terminated: *FAIL*
Experiment C48mx500_hybAOWCDA_73cc6bf4 Terminated with 0
FAIL
FAIL tasks failed and 3 dead at Thu Dec 26 15:58:36 UTC 2024
Experiment C48mx500_hybAOWCDA_73cc6bf4 Terminated: *FAIL*
Experiment C48_ATM_73cc6bf4 Terminated with 0
FAIL
FAIL tasks failed and 1 dead at Thu Dec 26 15:58:36 UTC 2024
Experiment C48_ATM_73cc6bf4 Terminated: *FAIL*
Experiment C96_atm3DVar_73cc6bf4 Terminated with 0
FAIL
FAIL tasks failed and 1 dead at Thu Dec 26 15:58:36 UTC 2024
Experiment C96_atm3DVar_73cc6bf4 Terminated: *FAIL*
Experiment C96C48_hybatmDA_73cc6bf4 Terminated with 0
FAIL
FAIL tasks failed and 3 dead at Thu Dec 26 15:58:36 UTC 2024
Experiment C96C48_hybatmDA_73cc6bf4 Terminated: *FAIL*
Experiment C96C48_ufs_hybatmDA_73cc6bf4 Terminated with 0
FAIL
FAIL tasks failed and 3 dead at Thu Dec 26 15:58:37 UTC 2024
Experiment C96C48_ufs_hybatmDA_73cc6bf4 Terminated: *FAIL*
Experiment C96C48_hybatmaerosnowDA_73cc6bf4 Terminated with 0
FAIL
FAIL tasks failed and 3 dead at Thu Dec 26 15:58:37 UTC 2024
Experiment C96C48_hybatmaerosnowDA_73cc6bf4 Terminated: *FAIL*
Experiment C48_S2SW_73cc6bf4 Terminated with 0
FAIL
FAIL tasks failed and 1 dead at Thu Dec 26 15:58:37 UTC 2024
Experiment C48_S2SW_73cc6bf4 Terminated: *FAIL*
Experiment C96_S2SWA_gefs_replay_ics_73cc6bf4 Terminated with 0
FAIL
FAIL tasks failed and 3 dead at Thu Dec 26 15:58:38 UTC 2024
Experiment C96_S2SWA_gefs_replay_ics_73cc6bf4 Terminated: *FAIL*
Error logs:
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48mx500_hybAOWCDA_73cc6bf4/logs/2021032418/enkfgdas_fcst_mem001.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48mx500_hybAOWCDA_73cc6bf4/logs/2021032418/enkfgdas_fcst_mem002.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48mx500_hybAOWCDA_73cc6bf4/logs/2021032418/gdas_fcst_seg0.log
Error logs:
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48mx500_3DVarAOWCDA_73cc6bf4/logs/2021032418/gdas_fcst_seg0.log
Error logs:
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C96C48_hybatmDA_73cc6bf4/logs/2021122018/enkfgdas_fcst_mem001.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C96C48_hybatmDA_73cc6bf4/logs/2021122018/enkfgdas_fcst_mem002.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C96C48_hybatmDA_73cc6bf4/logs/2021122018/gdas_fcst_seg0.log
Error logs:
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48_ATM_73cc6bf4/logs/2021032312/gfs_fcst_seg0.log
Error logs:
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C96_atm3DVar_73cc6bf4/logs/2021122018/gdas_fcst_seg0.log
Error logs:
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_73cc6bf4/logs/2024022318/enkfgdas_fcst_mem001.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_73cc6bf4/logs/2024022318/enkfgdas_fcst_mem002.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_73cc6bf4/logs/2024022318/gdas_fcst_seg0.log
Error logs:
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C96C48_hybatmaerosnowDA_73cc6bf4/logs/2021122012/enkfgdas_fcst_mem001.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C96C48_hybatmaerosnowDA_73cc6bf4/logs/2021122012/enkfgdas_fcst_mem002.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C96C48_hybatmaerosnowDA_73cc6bf4/logs/2021122012/gdas_fcst_seg0.log
Error logs:
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48_S2SW_73cc6bf4/logs/2021032312/gfs_fcst_seg0.log
Error logs:
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C96_S2SWA_gefs_replay_ics_73cc6bf4/logs/2020110100/gefs_fcst_mem000_seg0.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C96_S2SWA_gefs_replay_ics_73cc6bf4/logs/2020110100/gefs_fcst_mem001_seg0.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C96_S2SWA_gefs_replay_ics_73cc6bf4/logs/2020110100/gefs_fcst_mem002_seg0.log
Experiment C48_S2SWA_gefs_73cc6bf4 Terminated with 0
FAIL
FAIL tasks failed and 3 dead at Thu Dec 26 15:58:56 UTC 2024
Experiment C48_S2SWA_gefs_73cc6bf4 Terminated: *FAIL*
Error logs:
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48_S2SWA_gefs_73cc6bf4/logs/2021032312/gefs_fcst_mem000_seg0.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48_S2SWA_gefs_73cc6bf4/logs/2021032312/gefs_fcst_mem001_seg0.log
/scratch1/NCEPDEV/global/CI/3106/RUNTESTS/COMROOT/C48_S2SWA_gefs_73cc6bf4/logs/2021032312/gefs_fcst_mem002_seg0.log

@aerorahul
Contributor

@DavidBurrows-NCO When you get a chance, please resolve the conflicts and we can run this through the CI again. I think the conflicts are likely due to the Rocky 8 PR for CSP's from @weihuang-jedi #2998

@DavidBurrows-NCO
Contributor Author

Morning @aerorahul @WalterKolczynski-NOAA. I updated my branch and tested C48_S2SW. I get the same error that the CI tests are hitting. The error is from the forecast job:
/scratch1/NCEPDEV/global/CI/3106/global-workflow/ush/atparse.bash: line 82: RESTART_FH: unbound variable.
ufs-weather-model PR2419, merged 3 weeks ago, added this line:
restart_fh: @[RESTART_FH]
to sorc/ufs_model.fd/tests/parm/model_configure.IN.

The issue is that currently global-workflow is pointing to an older hash than PR2419.
I’m currently pointing to the hash associated with Gaea C6 upgrades from PR2448 (which is 2 behind the ufs-wx-model head currently). How should I proceed with resolving this? Thanks.

@JessicaMeixner-NOAA
Contributor

@DavidBurrows-NCO you can simply set RESTART_FH=' ' see: ufs-community/ufs-weather-model#2419

I am updating things further for this variable in the PR here: #3190 but just simply setting it to empty should get you past your issues.

@DavidBurrows-NCO
Contributor Author

I am updating things further for this variable in the PR here: #3190 but just simply setting it to empty should get you past your issues.

Thanks @JessicaMeixner-NOAA. I did a quick test and that did get me past the issue.
@aerorahul @WalterKolczynski-NOAA I suspect we will add this variable to the config.fcst files. I tested by adding export RESTART_FH="" to the gfs/config.fcst file to get past this issue (I haven't committed that change yet). Let me know where you think this variable should be set, and which hash I should point the ufs-weather-model to. Thanks!
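
The `unbound variable` message is what bash prints when a script running with `set -u` expands a variable that was never set. A minimal sketch of the workaround discussed here, assuming a hypothetical stand-in function `render_restart_fh` (this is not the logic in ush/atparse.bash, only an illustration of the same failure mode):

```shell
#!/usr/bin/env bash
set -u  # treat expansion of unset variables as an error

# Hypothetical stand-in for a template parser: replaces the token
# @[RESTART_FH] in one line with the current value of RESTART_FH.
render_restart_fh() {
  local line="$1"
  printf '%s\n' "${line//@\[RESTART_FH\]/${RESTART_FH}}"
}

# Default to an empty string when the caller has not set the variable, so
# the expansion above no longer fails with "unbound variable" under set -u.
export RESTART_FH="${RESTART_FH:-}"

render_restart_fh 'restart_fh: @[RESTART_FH]'
```

The `${RESTART_FH:-}` form leaves any value already exported by a config file untouched, so it can sit in a shared script without overriding experiment settings.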

@aerorahul
Contributor

@DavidBurrows-NCO
I am not sure yet. I should look at the ufs-wx-model PR to see what this change implies in the context of the coupled model restarts. The RESTART_FH variable in model_configure sets the restart hours for the atmosphere. We will need to ensure this is consistent w/ the other components: ocean, ice, waves, etc.

@jswhit
Contributor

jswhit commented Jan 2, 2025

The GSI modulefiles for gaeac5 and gaeac6 are now pointing to /gpfs/f6/bil-fire8/world-shared/GSI_data/fix/gsi/20241022 (on c6) and /gpfs/f5/ufs-ard/world-shared/GSI_data/fix/gsi/20241022 (on c5), which don't exist - this causes the build to fail with GSI develop. Currently only the 20240208 version of the fix files is available - can the 20241022 files be copied to c5 and c6 as well?
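
A build that depends on site-specific fix directories can fail late when a path is absent. A minimal preflight sketch, assuming a hypothetical helper `missing_dirs` that is not part of the GSI or global-workflow build scripts:

```shell
#!/usr/bin/env bash

# Hypothetical preflight helper: print every argument that is not an
# existing directory, and return non-zero if any are missing.
missing_dirs() {
  local dir rc=0
  for dir in "$@"; do
    if [[ ! -d "${dir}" ]]; then
      echo "missing: ${dir}"
      rc=1
    fi
  done
  return "${rc}"
}
```

Called before the build, for example as `missing_dirs /gpfs/f6/bil-fire8/world-shared/GSI_data/fix/gsi/20241022 || exit 1` on C6, it would report the absent fix directory up front instead of partway through compilation.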

@DavidBurrows-NCO
Contributor Author

The GSI modulefiles for gaeac5 and gaeac6 are now pointing to /gpfs/f6/bil-fire8/world-shared/GSI_data/fix/gsi/20241022 (on c6) and /gpfs/f5/ufs-ard/world-shared/GSI_data/fix/gsi/20241022 (on c5), which don't exist - this causes the build to fail with GSI develop. Currently only the 20240208 version of the fix files is available - can the 20241022 files be copied to c5 and c6 as well?

@jswhit Done.

@jswhit
Contributor

jswhit commented Jan 3, 2025

The GSI modulefiles for gaeac5 and gaeac6 are now pointing to /gpfs/f6/bil-fire8/world-shared/GSI_data/fix/gsi/20241022 (on c6) and /gpfs/f5/ufs-ard/world-shared/GSI_data/fix/gsi/20241022 (on c5), which don't exist - this causes the build to fail with GSI develop. Currently only the 20240208 version of the fix files is available - can the 20241022 files be copied to c5 and c6 as well?

@jswhit Done.

Thanks @DavidBurrows-NCO !

@JessicaMeixner-NOAA
Contributor

@DavidBurrows-NCO I'm not sure where @aerorahul wants this variable to go. In PR #3190 I put it in ush/parsing_model_configure_FV3.sh because I want to change the way we set it. However, to obtain the same answers, all you need to do is set it to "" somewhere. Linking to the related conversation: https://github.com/NOAA-EMC/global-workflow/pull/3190/files#r1900169443 as we might be moving where this is located in that PR too.

Labels
CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed
CI-Hercules-Failed **Bot use only** CI testing on Hercules for this PR has failed
Development

Successfully merging this pull request may close these issues.

GW submodules no longer building on Gaea-C5 after OS upgrade; Also add Gaea-C6 build
8 participants