Add MOM6 support (om4 025jra ryf) #258

Merged
merged 46 commits into from
Nov 26, 2024

Conversation

marc-white
Collaborator

Closes #175.

This PR adds the data requested in #175, which required a new builder: MOM6Builder.

The PR includes the relevant builder, translator, and tests.

@marc-white marc-white linked an issue Nov 18, 2024 that may be closed by this pull request
5 tasks

codecov bot commented Nov 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.44%. Comparing base (4fb9856) to head (428f0d2).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #258      +/-   ##
==========================================
+ Coverage   97.90%   98.44%   +0.54%     
==========================================
  Files          11       11              
  Lines         811      837      +26     
==========================================
+ Hits          794      824      +30     
+ Misses         17       13       -4     

@charles-turner-1
Collaborator

charles-turner-1 commented Nov 18, 2024

All tests passing - just codecov that's not passing.

I guess that must mean the CI environment isn't mirroring Gadi correctly.

@marc-white marc-white marked this pull request as ready for review November 19, 2024 03:46
Collaborator

@charles-turner-1 charles-turner-1 left a comment

Couple of minor comments & a bunch of empty files I think got accidentally committed.

Otherwise looks good - the only thing I think might warrant some extra thought is the EmptyDataError (in manager.py). I've left a comment on it, as I'm not sure that EmptyDataError is the most appropriate?

        columns_with_iterables=COLUMNS_WITH_ITERABLES,
    )
except EmptyDataError as e:
    raise EmptyDataError(str(e) + f": {self.path}")
Collaborator

Perhaps an issue for intake-dataframe-catalog rather than here, but I feel like we might want to emit a DfFileCatalogError here rather than an EmptyDataError?

Collaborator Author

I think that is an issue for intake-dataframe-catalog, as you suggested. All I was trying to do is re-emit the same error with a slightly more useful message.

Collaborator

I've opened an issue there. Could you update the catch to

except (EmptyDataError, DfFileCatalogError) as e:

so it won't break when we update it there?
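
A minimal sketch of the suggested catch, for illustration only. The two exception classes below are local stand-ins (assumptions) for `pandas.errors.EmptyDataError` and for whatever `DfFileCatalogError` intake-dataframe-catalog ends up raising, and `open_catalog` and `opener` are hypothetical names, not part of this PR.

```python
class EmptyDataError(Exception):
    """Stand-in for pandas.errors.EmptyDataError (assumption)."""


class DfFileCatalogError(Exception):
    """Stand-in for a future intake-dataframe-catalog error (assumption)."""


def open_catalog(path, opener):
    """Call opener(path); on either error, re-raise it with the path appended."""
    try:
        return opener(path)
    except (EmptyDataError, DfFileCatalogError) as e:
        # Keep the original exception type, add the path, and chain the cause.
        raise type(e)(f"{e}: {path}") from e


def _empty(path):
    # Hypothetical opener that fails the way an empty catalog file might.
    raise EmptyDataError("No columns to parse from file")


try:
    open_catalog("/hypothetical/catalog.csv", _empty)
except EmptyDataError as err:
    message = str(err)
```

Re-raising with `type(e)` rather than a hard-coded `EmptyDataError` is a design choice in this sketch: it means a `DfFileCatalogError` is not silently converted into an `EmptyDataError` once the upstream change lands.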

src/access_nri_intake/source/builders.py Outdated Show resolved Hide resolved
        return ncinfo_dict

    except Exception:
        return {INVALID_ASSET: file, TRACEBACK: traceback.format_exc()}
Collaborator

Codecov is complaining that this line isn't tested - tbh I think it's unimportant.

Collaborator Author

I'm assuming the equivalent lines aren't tested in the other Builders, but I might look into that.
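
For reference, a rough sketch of how a fallback branch like this could be exercised in a test: pass in a reader that always raises and check the returned dictionary. `parse_asset`, `broken_reader`, and the string values of `INVALID_ASSET` and `TRACEBACK` are assumptions made for this example, not the builder's real code.

```python
import traceback

# Assumed values; the real constants are defined outside this PR.
INVALID_ASSET = "INVALID_ASSET"
TRACEBACK = "TRACEBACK"


def parse_asset(file, reader):
    """Hypothetical parser: return file info, or an 'invalid asset' record."""
    try:
        ncinfo_dict = reader(file)
        return ncinfo_dict
    except Exception:
        # Any failure is reported as data instead of propagating.
        return {INVALID_ASSET: file, TRACEBACK: traceback.format_exc()}


def broken_reader(file):
    """Reader that always fails, used to force the except branch."""
    raise ValueError(f"cannot read {file}")


def test_parse_asset_reports_invalid_file():
    result = parse_asset("bad.nc", broken_reader)
    assert result[INVALID_ASSET] == "bad.nc"
    assert "ValueError" in result[TRACEBACK]
```

Injecting the failing reader avoids needing a deliberately corrupt file in the test data just to cover one line.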

Collaborator

The manifest files are all empty - probably an accidental git add . instead of git add --update?

Collaborator Author

I'll need to check on Gadi tomorrow - I tried to ape the 'real' structure there as much as possible, and there may be empty manifest yamls on there. Whether it's necessary for testing or not is another matter.

Collaborator

@charles-turner-1 charles-turner-1 Nov 19, 2024

I think there's a tool I've used which detects unused test data... I'll see if I can dig it out. I think it would be good to keep unused data out as far as is possible.

Collaborator Author

OK, let me know if you find the tool. The aim was to give the build system access to 'furphy' files to make sure they weren't accidentally ingested as real data (cf. the access-om3 test data directory).

Collaborator

@charles-turner-1 charles-turner-1 Nov 19, 2024

I've done some digging and I think the tool I was thinking of was vulture, which finds unused code rather than unused data files, so I'm not sure that there's a way of automating checking for unused files?

I don't wanna hold up getting this merged into main - maybe we just raise a separate issue that this PR includes a lot of test data, some of it potentially unused, and come back later?
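
As a starting point for that separate issue, here is a rough heuristic sketch (not an existing tool): list data files whose names never appear in any test source. `find_unreferenced` and its arguments are hypothetical. Because builders discover files by crawling directories, a file that is never named in a test may still be exercised, including deliberate 'furphy' files, so the output is a list of candidates to review rather than files to delete.

```python
from pathlib import Path


def find_unreferenced(data_dir, test_dir):
    """Return data files whose file name never appears in any test source."""
    # Concatenate all Python test sources into one searchable string.
    sources = "\n".join(
        p.read_text(encoding="utf-8", errors="ignore")
        for p in Path(test_dir).rglob("*.py")
    )
    return sorted(
        str(p.relative_to(data_dir))
        for p in Path(data_dir).rglob("*")
        if p.is_file() and p.name not in sources
    )
```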

Collaborator

I think this got accidentally committed?

Collaborator Author

See above comment re: duplicating the file structure on Gadi.

@marc-white marc-white changed the title from "DRAFT: Add MOM6 support (om4 025jra ryf)" to "Add MOM6 support (om4 025jra ryf)" Nov 19, 2024
Collaborator

@charles-turner-1 charles-turner-1 left a comment

Looks good to me.

I still think we might want to look into reducing the amount of added test data if we can figure out whether there are redundant test data files, but I think that can be dealt with in a separate issue.

@rbeucher
Member

I agree with @charles-turner-1. Would be good to fix codecov and pre-commit though.

@marc-white
Collaborator Author

I've now made enough additional tests to hit the codecov requirements, so I'll merge.

@marc-white marc-white merged commit 8d18b19 into main Nov 26, 2024
18 checks passed
@rbeucher
Member

Great effort @marc-white

charles-turner-1 added a commit that referenced this pull request Dec 4, 2024
This reverts commit 8d18b19.

Testing to see whether this restores test stability
charles-turner-1 added a commit that referenced this pull request Dec 4, 2024
This reverts commit 23b3b5f - ie. it
restores the mom6 stuff.
filename="19000101.ice_daily.nc",
file_id="XXXXXXXX_ice_daily",
filename_timestamp="19000101",
frequency="subhr",
Collaborator

@marc-white see frequency bug we missed when merging
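
One way a mismatch like this (a file named ice_daily expected to have frequency "subhr") might be caught earlier is a simple consistency check on the expected values in the test table. This is only an illustrative sketch: `FILENAME_HINTS`, `check_frequency`, and the labels "1day", "1mon" and "1yr" are assumptions, not the project's API.

```python
# Hypothetical mapping from words in a file name to the frequency we expect.
FILENAME_HINTS = {"daily": "1day", "monthly": "1mon", "annual": "1yr"}


def check_frequency(filename, frequency):
    """Return None if consistent, otherwise a message describing the mismatch."""
    for hint, expected in FILENAME_HINTS.items():
        if hint in filename and frequency != expected:
            return f"{filename}: name suggests {expected!r}, got {frequency!r}"
    return None


# The case flagged above: a daily file with a sub-hourly frequency.
problem = check_frequency("19000101.ice_daily.nc", "subhr")
```

File names are only a hint, so a check like this would flag cases for a human to look at rather than decide the correct frequency.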

marc-white pushed a commit that referenced this pull request Dec 9, 2024
Fixes issues with MOM6 testing, and time parsing of same.

* Revert "Add MOM6 support (om4 025jra ryf) (#258)"

This reverts commit 8d18b19.

Testing to see whether this restores test stability

* Revert "Revert "Add MOM6 support (om4 025jra ryf) (#258)""

This reverts commit 23b3b5f - ie. it
restores the mom6 stuff.

* Add xarray complete

* Pin dependencies back in time for 3.11

* Fail fast false

* Pin a bunch of deps

* Added toxfile

* tox.ini w/ comments on failures

* Revert "Pin dependencies back in time for 3.11"

This reverts commit 6fa6676.

* These changes are ugly & horrible but mostly seem to resolve the issues with cftime. They cause some assets to fail because they alter the parser, but I think this is a window into a solution.

* Lots of catches for overflow errors: keeping for posterity

* Restore 'test_builders.py' to same state as main

* Fix mom6 tests - should now all be failing

* Ready to replace time info guesses for MOM6 with a subclass

* Fixed broken MOM6 builder

* re-enable fast fail

* Reverted CI environments to main

* Removed '_access' from a bunch of function names - now we have GFDL models in the builders, this is misleading

* Revert load_dataset => open_dataset

* Updating test to fix coverage

* Restored to working state

* Tests for GenericTimeParser & AccessTimeParser

* Improve test coverage

* Improve test coverage for GfdlTimeParser

* Improve test coverage for GfdlTimeParser

* Improve test coverage for GfdlTimeParser

* Marc's comments

Successfully merging this pull request may close these issues.

[DATA REQUEST] Add COSIMA Panantarctic / GFDL_OM4 Builder & Data
3 participants