Add MOM6 support (om4 025jra ryf) #258
…tamp' groups in filename regexp
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #258      +/-   ##
==========================================
+ Coverage   97.90%   98.44%   +0.54%
==========================================
  Files          11       11
  Lines         811      837      +26
==========================================
+ Hits          794      824      +30
+ Misses         17       13       -4
All tests passing - it's just codecov that's not passing. I guess that must mean the CI environment isn't mirroring Gadi correctly.
Couple of minor comments & a bunch of empty files I think got accidentally committed.
Otherwise looks good - the only thing I think might warrant some extra thought is the EmptyDataError (in manager.py) - I've left a comment there, as I'm not sure that EmptyDataError is the most appropriate.
        columns_with_iterables=COLUMNS_WITH_ITERABLES,
    )
except EmptyDataError as e:
    raise EmptyDataError(str(e) + f": {self.path}")
Perhaps an issue for intake-dataframe-catalog rather than here, but I feel like we might want to emit a DfFileCatalogError here rather than an EmptyDataError?
I think that is an issue for intake-dataframe-catalog, as you suggested. All I was trying to do is re-emit the same error with a slightly more useful message.
I've opened an issue there. Could you update the catch to
    except (EmptyDataError, DfFileCatalogError) as e
so it won't break when we update it there?
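For illustration, the catch suggested above might look something like the sketch below. This is not the code in manager.py: the two exception classes are local stand-ins for the pandas and intake-dataframe-catalog errors so the snippet runs on its own, and `load_catalog` and `loader` are hypothetical names. Re-raising via `type(e)` keeps whichever error type was caught while still appending the path to the message.

```python
class EmptyDataError(Exception):
    """Stand-in for pandas.errors.EmptyDataError (assumption, to stay self-contained)."""


class DfFileCatalogError(Exception):
    """Stand-in for the intake-dataframe-catalog error (assumption)."""


def load_catalog(path, loader):
    """Call loader(path); on either error, re-raise it with the path appended."""
    try:
        return loader(path)
    except (EmptyDataError, DfFileCatalogError) as e:
        # Same exception type as was caught, with a more useful message.
        raise type(e)(f"{e}: {path}") from e
```

With this shape, swapping EmptyDataError for DfFileCatalogError upstream would not require another change to the catch.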
    return ncinfo_dict

except Exception:
    return {INVALID_ASSET: file, TRACEBACK: traceback.format_exc()}
Codecov is complaining that this line isn't tested - tbh I think it's unimportant.
I'm assuming the equivalent lines aren't tested in the other Builders, but I might look into that.
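If it does turn out to be worth covering, a test for that fallback branch could be fairly small. The sketch below is only an assumption about the shape of such a test: `parse_file`, `failing_reader` and the two string constants are stand-ins defined here so the snippet runs on its own, not the builder's actual parser or the constants it imports.

```python
import traceback

# Stand-in keys (assumption); the real builders import these constants.
INVALID_ASSET = "INVALID_ASSET"
TRACEBACK = "TRACEBACK"


def parse_file(file, reader):
    """Return reader(file), or an invalid-asset record if the reader raises."""
    try:
        return reader(file)
    except Exception:
        return {INVALID_ASSET: file, TRACEBACK: traceback.format_exc()}


def failing_reader(file):
    """Hypothetical reader that always fails, to force the except branch."""
    raise ValueError(f"cannot parse {file}")


def test_parse_file_records_traceback():
    result = parse_file("bad.nc", failing_reader)
    assert result[INVALID_ASSET] == "bad.nc"
    assert "ValueError" in result[TRACEBACK]
```

Passing in a reader that always raises is one way to reach the except branch without needing a deliberately corrupt file in the test data.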
Manifest files all empty - probably an accidental git add . instead of git add --update?
I'll need to check on Gadi tomorrow - I tried to ape the 'real' structure there as much as possible, and there may be empty manifest yamls on there. Whether it's necessary for testing or not is another matter.
I think there's a tool I've used which detects unused test data... I'll see if I can dig it out. I think it would be good to keep unused data out as far as is possible.
OK, let me know if you find the tool. The aim was to give the build system access to 'furphy' files to make sure they weren't accidentally ingested as real data (cf. the access-om3 test data directory).
I've done some digging and I think the tool I was thinking of was vulture, so I'm not sure that there's a way of automating checking for unused files?
I don't wanna hold up getting this merged into main - maybe we just raise a separate issue that this PR includes a lot of test data, some of it potentially unused, and come back later?
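As a stopgap for the follow-up issue, the zero-byte files at least can be listed with a few lines of standard-library Python. This is only a sketch: `find_empty_files` is a hypothetical helper, and it says nothing about whether a file is actually used by the tests, only whether it is empty.

```python
from pathlib import Path


def find_empty_files(root):
    """Return the sorted paths of all zero-byte regular files under root."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.stat().st_size == 0
    )
```

Running it over the test data directory would give a list to compare against the structure on Gadi before deciding what to drop.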
I think this got accidentally committed?
See above comment re: duplicating the file structure on Gadi.
Looks good to me.
I still think we might want to look into reducing the amount of added test data if we can figure out whether there are redundant test data files, but I think that can be dealt with in a separate issue.
I agree with @charles-turner-1. Would be good to fix codecov and pre-commit though.
I've now made enough additional tests to hit the
Great effort @marc-white
This reverts commit 8d18b19. Testing to see whether this restores test stability
This reverts commit 23b3b5f - ie. it restores the mom6 stuff.
filename="19000101.ice_daily.nc",
file_id="XXXXXXXX_ice_daily",
filename_timestamp="19000101",
frequency="subhr",
@marc-white see frequency bug we missed when merging
Fixes issues with MOM6 testing, and time parsing of same.
* Revert "Add MOM6 support (om4 025jra ryf) (#258)" This reverts commit 8d18b19. Testing to see whether this restores test stability
* Revert "Revert "Add MOM6 support (om4 025jra ryf) (#258)"" This reverts commit 23b3b5f - ie. it restores the mom6 stuff.
* Add xarray complete
* Pin dependencies back in time for 3.11
* Fail fast false
* Pin a bunch of deps
* Added toxfile
* tox.ini w/ comments on failures
* Revert "Pin dependencies back in time for 3.11" This reverts commit 6fa6676.
* These changes are ugly & horrible but mostly seem to resolve the issues with cftime. They cause some assets to fail because they alter the parser, but I think this is a window into a solution.
* Lots of catches for overflow errors: keeping for posterity
* Restore 'test_builders.py' to same state as main
* Fix mom6 tests - should now all be failing
* Ready to replace time info guesses for MOM6 with a subclass
* Fixed broken MOM6 builder
* re-enable fast fail
* Reverted CI environments to main
* Removed '_access' from a bunch of function names - now we have GFDL models in the builders, this is misleading
* Revert load_dataset => open_dataset
* Updating test to fix coverage
* Restored to working state
* Tests for GenericTimeParser & AccessTimeParser
* Improve test coverage
* Improve test coverage for GfdlTimeParser
* Improve test coverage for GfdlTimeParser
* Improve test coverage for GfdlTimeParser
* Marc's comments
Closes #175.
This PR adds the data requested from #175, which required a new builder: MOM6Builder.
PR includes relevant builder, translator, and tests.