Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add git-hash information to EAMxx and in the output metadata #3094

Merged

Conversation

AaronDonahue
Copy link
Contributor

This commit changes the CMake commands for eamxx to ensure that the git-hash is saved to the EAMXX_GIT_VERSION variable and is thus available to EAMxx itself.

This naturally causes the git-hash to be added to the netCDF metadata for all EAMxx output.

Partially addresses #2142

@AaronDonahue
Copy link
Contributor Author

I'm not CMake savvy so I wouldn't be surprised if there is a better way to accomplish this. @jgfouca or @bartgol , feel free to comment on this approach.

The main goal is to have the EAMXX_GIT_VERSION variable always be defined in EAMxx. We already use this variable in the metadata of our output, but it is "" in all of the examples I have checked.

@rljacob
Copy link
Member

rljacob commented Nov 5, 2024

Is this for standalone? The coupler has the git hash in the infobuffer and passes it to all the models.

@AaronDonahue
Copy link
Contributor Author

This is for CIME cases. If there is already a GIT hash variable available then that would be the correct way to grab the info I think. @rljacob can you point me to an example of this being done?

jgfouca
jgfouca previously approved these changes Nov 5, 2024
@bartgol
Copy link
Contributor

bartgol commented Nov 5, 2024

@rljacob any reason to believe the info we get with rev-parse would differ from what CIME would pass?

I suppose there is a corner case, but it should not affect most users: the logic we have in our cmake code only runs when CMake runs. If one adds another commit and rebuilds, unless the changes trigger another cmake run, EAMXX_GIT_SHA will not get updated.

That said, I think the corner case should not impact any production run, only development. So I am ok with it. Unless Rob sees a reason for which we SHOULD rely on CIME's passed var.

@rljacob
Copy link
Member

rljacob commented Nov 5, 2024

cime's passed var is output in all the other component's history files so its a way to verify output from different models were all made by the same version.

In this line:
https://github.com/E3SM-Project/E3SM/blob/67ae4d5d770e97ecf689aae239500de428d7153b/components/eamxx/src/mct_coupling/atm_comp_mct.F90#L151
Add an argument: model_version=version where "version" is a local character variable that will have the hash after that call.

@AaronDonahue
Copy link
Contributor Author

Thanks for pointing me to the proper code @rljacob . I'll make the change.

@AaronDonahue
Copy link
Contributor Author

Question for @rljacob , @jgfouca and @bartgol . Following Rob's example for EAMxx it looks like I get a different git-hash then if I use the CMake approach. I think the CMake approach will grab whatever the git-hash is for the current run, meaning it will change with new commits to a branch. On the other hand I'm not sure how the version is determined. On this branch when I looked at what I get it is d291456767 which tracks to this commit: d291456 .

My question is about what we want in our metadata:

  1. We can follow EAM convention, use version from the component coupler and call it git_version in the metadata.
  2. We can have two fields, git_version as above following EAM convention and a separate one that is maybe called git_branch_version that takes the CMake value.

I think that regardless we want to use the EAM convention so that the metadata matches what the other components are using. The main question is, do we also want to add the most up-to-date git hash to the meta data too?

@rljacob
Copy link
Member

rljacob commented Nov 5, 2024

The hash is what CIME retrieves. And that will be from the SRCROOT. If you've made new commits but haven't rerun preview_namelists (which happens with case.build or case.submit), it won't be updated.

@AaronDonahue
Copy link
Contributor Author

The hash is what CIME retrieves. And that will be from the SRCROOT. If you've made new commits but haven't rerun preview_namelists (which happens with case.build or case.submit), it won't be updated.

That's interesting. I definitely ran case.build and case.submit because I needed to rebuild after making the changes. The commit that it appears to point to isn't even the master branch that this branch was built on. I'm only noting that in case there is some strange behavior with the way CIME is querying that info that is worth a look.

@bartgol
Copy link
Contributor

bartgol commented Nov 6, 2024

If I understand CIME correctly, the sha you get from CIME will always be at least as recent as the one you get in standalone EAMxx. I would use that one, it seems like it should an easy addition to our f90-cxx interface for the coupler.

Copy link
Contributor

mergify bot commented Nov 7, 2024

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce checks passing

Wonderful, this rule succeeded.

Make sure that checks are not failing on the PR, and reviewers approved

  • #approved-reviews-by >= 1
  • #changes-requested-reviews-by == 0
  • any of:
    • all of:
      • check-success="gcc-openmp / dbg"
      • check-success="gcc-openmp / fpe"
      • check-success="gcc-openmp / opt"
      • check-success="gcc-openmp / sp"
    • check-skipped={% raw %}gcc-openmp / ${{ matrix.build_type }}{% endraw %}
  • any of:
    • all of:
      • check-success="gcc-cuda / dbg"
      • check-success="gcc-cuda / opt"
      • check-success="gcc-cuda / sp"
    • check-skipped={% raw %}gcc-cuda / ${{ matrix.build_type }}{% endraw %}
  • any of:
    • all of:
      • check-success="cpu-gcc / ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.scream-small_kernels--scream-output-preset-5"
      • check-success="cpu-gcc / ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.scream-output-preset-2"
      • check-success="cpu-gcc / ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.scream-dpxx-arm97"
      • check-success="cpu-gcc / SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.scream-mam4xx-all_mam4xx_procs"
    • check-skipped={% raw %}cpu-gcc / ${{ matrix.test.short_name }}{% endraw %}
  • any of:
    • check-skipped=cpu-gcc
    • check-success=cpu-gcc

@AaronDonahue
Copy link
Contributor Author

If I understand CIME correctly, the sha you get from CIME will always be at least as recent as the one you get in standalone EAMxx. I would use that one, it seems like it should an easy addition to our f90-cxx interface for the coupler.

Yes, it wasn't difficult to grab the git version from the component coupler and pass that all the way down to our output metadata.

My only remaining question is about the discrepancy I see in the different git hashes between the two approaches. @rljacob I think you told me on Slack you were able to recreate the different hash values, any idea why?

@AaronDonahue AaronDonahue requested a review from jgfouca November 7, 2024 23:08
jgfouca
jgfouca previously approved these changes Nov 7, 2024
else()
set(SCREAM_CIME_BUILD TRUE)
endif()

# Print the sha of the last commit (useful to double check which version was tested on CDash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block should now be moved back into where it was before.

Copy link
Contributor

@bartgol bartgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Triggering mergify, to check something. But also pointing out that the mods in the main CMakeLists.txt file should be reverted.

@rljacob
Copy link
Member

rljacob commented Nov 8, 2024

I opened a CIME issue about the behavior. ESMCI/cime#4705. The hash is set at create_newcase time and never changes. Its possible CIME always did this and no one noticed until Aaron started looking carefully. For a production run, those will be done on master so the hash will be right. For any runs done on a frequently changing feature branch, most of that data gets thrown away so no one checked.

@AaronDonahue
Copy link
Contributor Author

Triggering mergify, to check something. But also pointing out that the mods in the main CMakeLists.txt file should be reverted.

@jgfouca do you agree? From the perspective of this PR reverting the CMake changes makes sense. We are no longer passing the EAMXX_GIT_VERSION variable to the netCDF metadata. But in the old approach were we ever setting this variable to anything other than ""? In other words should we keep my change so that if EAMXX_GIT_VERSION is ever queried it reflects the git hash? If we plan to only use the CIME passed one can we remove EAMXX_GIT_VERSION altogether?

@bartgol
Copy link
Contributor

bartgol commented Nov 11, 2024

Triggering mergify, to check something. But also pointing out that the mods in the main CMakeLists.txt file should be reverted.

@jgfouca do you agree? From the perspective of this PR reverting the CMake changes makes sense. We are no longer passing the EAMXX_GIT_VERSION variable to the netCDF metadata. But in the old approach were we ever setting this variable to anything other than ""? In other words should we keep my change so that if EAMXX_GIT_VERSION is ever queried it reflects the git hash? If we plan to only use the CIME passed one can we remove EAMXX_GIT_VERSION altogether?

We cannot remove it altogether, since we need it for standalone runs. But is should not be used for CIME runs.

This commit retrieves the git-hash saved in the component coupler and
makes sure it is passed into the atmosphere driver.  This is then used
to improve the netcdf metadata in all EAMxx output files.
@AaronDonahue AaronDonahue force-pushed the aarondonahue/IO/add_githash_to_eamxx_output_meta branch from 0fbd70f to 1062373 Compare November 11, 2024 18:31
@AaronDonahue
Copy link
Contributor Author

@bartgol and @jgfouca , I purged the commit that changed the CMakeLists.txt file. This should be ready for re-review.

@bartgol bartgol force-pushed the master branch 3 times, most recently from 6318f41 to db5b35d Compare November 13, 2024 21:48
@AaronDonahue
Copy link
Contributor Author

@bartgol can I go ahead and merge this?

@AaronDonahue AaronDonahue merged commit 3f99ea0 into master Nov 15, 2024
18 checks passed
@AaronDonahue AaronDonahue deleted the aarondonahue/IO/add_githash_to_eamxx_output_meta branch November 15, 2024 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants