-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ability to subset multiple files in a single call #40
base: master
Are you sure you want to change the base?
Conversation
This commit adds a new method to the MeshHandler that checks if a loaded mesh contains mesh connectivity dimensions or variables. This will enable other commits to subset multiple MPAS files from a single grid.
@mgduda, here are the changes that allow There are a number of commits, but most are all well self contained and small and straightforward. So hopefully you don't have much trouble understanding whats being changed, but if you do please feel free to ask. I think the only change here that I have questions on is the file name prepending. Since I've updated the @ldfowler58, here is a primarily set up changes that allow MPAS-Limited-Area program to subset multiple grids and those without mesh connectivity information. If you'd like, it would great if you can give it a test and report back with any problems or suggestions, but I'd hold off on using it for any hard science before it gets fully reviewed and released. |
Hi Miles:
Will give it a try sometimes tomorrow. Thanks for your help with this.
Laura
…On Sun, Mar 15, 2020 at 1:11 PM Miles Curry ***@***.***> wrote:
@mgduda <https://github.com/mgduda>, here are the changes that allow
create_region to subset multiple mesh files, with the first containing
mesh connectivity information. Would you be able to give it a review and
run it through some tests when its convenient for you?
There are a number of commits, but most are all well self contained and
small and straightforward. So hopefully you don't have much trouble
understanding whats being changed, but if you do please feel free to ask.
I think the only change here that I have questions on is the file name
prepending. Since I've updated the create_region_name to just prepend and
not figure out the filename, if one creates a subset of an MPAS file that
contains nCells (i.e. x1.10242.grid.nc, x1.10242.static.nc, etc.) the
region name would become: conus.x1.10242.grid.nc, which gives the
impression that conus.x1.10242.grid.nc has 10242 cells.
@ldfowler58 <https://github.com/ldfowler58>, here is a primarily set up
changes that allow MPAS-Limited-Area program to subset multiple grids and
those without mesh connectivity information. If you'd like, it would great
if you can give it a test and report back with any problems or suggestions,
but I'd hold off on using it for any hard science before it gets fully
reviewed and released.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#40 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AA7LEXTAALHMWCYO4QONWS3RHUR6LANCNFSM4LLFIXDA>
.
--
!-------------------------------------------------------------------------------------------------------------
Laura D. Fowler
Mesoscale and Microscale Meteorology Division (MMM)
National Center for Atmospheric Research
P.O. Box 3000, Boulder CO 80307-3000
e-mail: [email protected]
phone: 303-497-1628
!-------------------------------------------------------------------------------------------------------------
|
@ldfowler58 I have just pushed an updated to this branch with changes that fixed the memory problem that you were having with your meshes when trying to subset more than one mesh at a single time. I have tested created a And @mgduda did you want to give this PR a review and a test? If not, I'm happy to review it and merge it myself. |
This commit updates the argparser of the create_region script to take in multiple files rather than just one.
Before this commit, when a new MeshHandler instance was initialized, and was successfully loaded, MeshHandler._load_vars() was called automatically. This automatic call needs to be removed so that meshes that do not contain connectivity information do not try to load variables they do not contain.
Update _load_vars to be public, so it can be called from other instances.
This commit updates the LimitedArea class to receive multiple mesh files; however, it only will subset the first mesh passed to it (assuming it has mesh connectivity fields). This commit does not enable the ability to subset multiple grids, but a future commit can later implement that feature.
With this commit, LimitedArea no longer tries to determine the 'type' of a mesh file. It simply prepends the specified region name to the mesh filename.
This commit adds the ability to MeshHandler.subset_mesh to use fields of another MeshHandler instance. This allows a mesh, that perhaps does not have mesh connectivity information, to be subsetted by another.
This change allows the limitedArea.gen_region to subset multiple files.
The code removed in this commit was in place to try to achieve efficiency by pre-loading the memory of the file before loading it with netCDF4 Dataset. With the ability to subset multiple meshes in one call, this would cause a bus error if the meshes were significantly large, as each mesh would be loaded at once, rather than when it was needed. At the end of `LimitedArea.gen_region()`, after the call to `Mesh.subset_fields()`, the created mesh region file AND the mesh that was used to created are both closed. This ensures that only one mesh (and its regional subset) are loaded at once.
print("ERROR: Mesh file was not found", mesh_file) | ||
sys.exit(-1) | ||
|
||
if len(files) == 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets add a bit more comments here, just short one liners.
sys.exit(-1) | ||
self.meshes.append(self.mesh) | ||
|
||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a new comment here as well
else: | ||
print("ERROR:", self.mesh.fname, "did not contain needed mesh connectivity information") | ||
print("ERROR: The first file must contain mesh connectivity information") | ||
sys.exit(-1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this error worthwhile? And is the error message correct? This if statement doesn't seem correct.
sys.exit(-1) | ||
|
||
for mesh in files: | ||
self.meshes.append(MeshHandler(mesh, 'r', *args, **kwargs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be moved to replace line 89?
@@ -80,6 +110,7 @@ def gen_region(self, *args, **kwargs): | |||
|
|||
# For each mesh, create a regional mesh and save it | |||
print('\n') | |||
# TODO: Update this print statement to be more consistent to what is happening | |||
print('Creating a regional mesh of ', self.mesh.fname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This TODO should be taken care of. Should this print statement be moved?
# Create subsets of all the meshes in self.meshes | ||
print('Subsetting meshes...') | ||
for mesh in self.meshes: | ||
print("\nSubsetting:", mesh.fname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we change this print message to say print("Subsetting:", mesh.fname, "using", mesh.fname)
.
self.mesh.copy_global_attributes(regionalMesh) | ||
# Create subsets of all the meshes in self.meshes | ||
print('Subsetting meshes...') | ||
for mesh in self.meshes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop is a bit confusing. mesh from the loop is confusing my brain with self.mesh (the file used to subset self.meshes).
bdyMaskVertex, | ||
self.INSIDE, | ||
self.UNMARKED, | ||
self.mesh, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets add a note that self.mesh is being used to subset mesh.
**kwargs) | ||
print("Copying global attributes... ") | ||
regionalMesh.copy_global_attributes(self.mesh) | ||
print("Create a regional mesh:", regionFname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create -> Created
glbBdyVertexIDs = self.indexToVertexIDs[np.where(bdyMaskVertex != unmarked)] - 1 | ||
glbBdyCellIDs = mesh.indexToCellIDs[np.where(bdyMaskCell != unmarked)] - 1 | ||
glbBdyEdgeIDs = mesh.indexToEdgeIDs[np.where(bdyMaskEdge != unmarked)] - 1 | ||
glbBdyVertexIDs = mesh.indexToVertexIDs[np.where(bdyMaskVertex != unmarked)] - 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix this white space, they aren't necessary
region.mesh.createVariable('bdyMaskCell', 'i4', ('nCells',)) | ||
region.mesh.createVariable('bdyMaskEdge', 'i4', ('nEdges',)) | ||
region.mesh.createVariable('bdyMaskVertex', 'i4', ('nVertices')) | ||
if self.check_grid(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this check necessary? Hasn't the MeshHandler instance in self already been checked by check_grid() before subset_grid is called?
These commits contain a number of changes that allow the MPAS-Limited-Area program to subset multiple mesh files in a single call. The usage of
create_region
remains the same, except that any number of mesh files can be given.If multiple files are given, then the first mesh file must contain the needed mesh connectivity fields. The first mesh file's connectivity fields will be used to subset the remaining files. If the first file does not contain mesh connectivity fields, then an error will occur.