Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add compile and platform yamls to compile model with FRE/2025.01 #131

Merged
merged 5 commits into from
Jan 16, 2025

Conversation

uwagura
Copy link
Collaborator

@uwagura uwagura commented Jan 8, 2025

This PR adds compile and platform yamls to allow users to compile the NWA configuration with Fre/2025.01, and optionally produce containers to run the model as well. Both the containers and the bare metal models compile with these yamls can then be run using fre/bronx-23.

The full containerized does not quite reproduce the bare-metal run results, but the differences mostly seem minimal:

sst_eval_nwa_container
mean_ssh_eval_nwa_container
sst_trends_nwa_container
domain_chl_nwa_container
regional_chl_nwa_container

@uwagura uwagura changed the title Add compile and platform yamls, modify experiment yaml to work with them Add compile and platform yamls to compile model with FRE/2025.01 Jan 8, 2025
@yichengt900
Copy link
Contributor

@uwagura, thanks for opening this PR—awesome work! Although it is still a draft, I have two quick thoughts:

Would you mind adding a README file under yaml with brief instructions on how to perform bare-metal/container builds and how to run containerized builds using bronx-23?

When you get a chance, could you also share the archive paths for both your bare-metal and containerized NWA-12 runs?
Thanks!

@uwagura uwagura marked this pull request as ready for review January 10, 2025 15:26
@uwagura
Copy link
Collaborator Author

uwagura commented Jan 10, 2025

@yichengt900 , I've added a README to the yaml directory, and added instructions to the main yaml on how to use fre make. Let me know what you think.

You can find the history files and the pp files from this experiment here: /archive/Utheri.Wagura/fre/cefi/NWA/FullContainerRun/CEFI_NWA12_COBALT_V1/gfdl.ncrc5-intel23-prod/

@andrew-c-ross , you don't have to try the containerized run if you don't have the time, but I'd love to get your thoughts on some of the plots!

yamls/NWA12/README.md Outdated Show resolved Hide resolved
Copy link
Contributor

@yichengt900 yichengt900 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uwagura, thank you for opening this PR. Overall, it looks good, and I only have a few comments regarding your instructions and some other minor things.

I successfully built the containerized model by following the modified instructions with fre/2025.01. Additionally, I conducted several 2-day regression test runs using both the bare-metal build and the containerized build, and the restarts were identical.

Furthermore, I compared the model results from your fully containerized run with my 2024/12 retrospective runs. Based on my analysis, the model results also appear to be identical. Relevant figures can be found here:

2024/12 bare-metal runs: /nbhome/ynt/results/CEFI_NWA12_COBALT_V1_repro/analysis/regional-mom6-diagnostics_20240418/physics/figures_2024_12

Fully containerized runs: /nbhome/ynt/results/CEFI_NWA12_COBALT_V1_repro/analysis/regional-mom6-diagnostics_20240418/physics/figures_container

Given these findings, I’m unsure what you meant by your original comment in your PR: The fully containerized run does not quite reproduce the bare-metal run results, but the differences mostly seem minimal. Are you referring to differences compared to the original NWA paper results?

@uwagura
Copy link
Collaborator Author

uwagura commented Jan 13, 2025

Thanks for your review @yichengt900 , I'll make the changes you suggested soon.

Regarding my comment, about the results not being quite the same, I meant that the results from the containerized run were not quite identical to the plots I produced with the data in /archive/acr/fre/NWA/2023_04/NWA12_COBALT_2023_04_kpo4-coastatten-physics/gfdl.ncrc5-intel22-prod. For example, compare the above sst plot with the plot in #74 , or thessh plot with the one in #101. None of the differences seem particularly big or concerning, just thought that I'd mention them.

@yichengt900
Copy link
Contributor

Thanks for your review @yichengt900 , I'll make the changes you suggested soon.

Regarding my comment, about the results not being quite the same, I meant that the results from the containerized run were not quite identical to the plots I produced with the data in /archive/acr/fre/NWA/2023_04/NWA12_COBALT_2023_04_kpo4-coastatten-physics/gfdl.ncrc5-intel22-prod. For example, compare the above sst plot with the plot in #74 , or thessh plot with the one in #101. None of the differences seem particularly big or concerning, just thought that I'd mention them.

@uwagura, thank you for the clarification. It seems the data you pointed to is likely the same one Andrew used for his paper. Unfortunately, it’s probably not possible to fully reproduce those results due to updates in libraries and compilers on C5, as well as changes in the model codes.

Our main goal here is to confirm that the containerized version produces the same results as the bare-metal build. One good thing is even with those differences, the new model results are consistent and not worse than the ones in the paper.

@uwagura uwagura requested a review from yichengt900 January 14, 2025 21:06
@@ -68,45 +75,45 @@ frerun -x CEFI_NWA12_cobalt.xml -p ncrc5.intel23 -t prod CEFI_NWA12_COBALT_V1 --
Like `fre make`, postprocessing is split into several subcommands to improve modularity. On PPAN, load the appropriate module:
```
module load fre/2025.01
```
``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing ` ?

Copy link
Contributor

@yichengt900 yichengt900 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uwagura, everything looks good now except for a missing `. I will approve it after you change that. Thanks again for your contribution

@yichengt900 yichengt900 merged commit 23b3062 into main Jan 16, 2025
3 checks passed
@yichengt900 yichengt900 deleted the feature/new_fre_make branch January 17, 2025 16:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants