Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calc_analysis fix for delp increment read to allow C1152 cycling (#56) #57

Merged
merged 1 commit into from
Dec 17, 2024

Conversation

CatherineThomas-NOAA
Copy link
Contributor

When running calc_analysis for C1152, the job fails in the read of the delp increment:

call read_vardata(incncfile, 'delp_inc', work3d_inc)

This read has a different format than the other 3d increment reads:

call read_vardata(incncfile, trim(incvar)//"_inc", work3d_inc_gsi, nslice=k, slicedim=3)

Adding nslice=k, slicedim=3 to the delp_inc read along with a do loop over k allows the job to succeed.

This fix was previously tested for 3 full cycles on WCOSS2 for a C1152/C384 ATM-only experiment.

@CatherineThomas-NOAA
Copy link
Contributor Author

Tagging @CoryMartin-NOAA for awareness

@CoryMartin-NOAA CoryMartin-NOAA merged commit 7aed28e into NOAA-EMC:develop Dec 17, 2024
2 of 4 checks passed
@RussTreadon-NOAA
Copy link
Contributor

@CatherineThomas-NOAA , I ran 20211221 06Z gdas_analcalc using g-w CI C96_atm3DVar. The first run used the original code. The second run used the modified code from this PR.

With the original code the minval and maxval of array ps_inc is (-93.6521, 66.0706). With the modified code, the (min,max) is (0.00000 , 0.00000).

work3d_inc(:,:,nlev) contains 0.00000 in the modified code. It contains a (min,max) of (-0.245417, 0.173139) in the original code. In the modified code, work3d_inc(:,:,1) contains a (min,max) of (-0.245417, 0.173139). All other levels of work3d_inc are 0.00000.

Given this, the modified code should have

 ps_inc(:,:) = work3d_inc(:,:,1) / (bk5(nlev) - bk5(nlev-1))

... though this looks weird.

I have not yet looked at the source code for read_vardata to see how it works.

What did your tests find?

@CoryMartin-NOAA
Copy link
Contributor

oh no, I pulled the merge trigger too quickly, sorry about that and thanks @RussTreadon-NOAA for testing more thoroughly

@CoryMartin-NOAA
Copy link
Contributor

Looking more closely at this, why did the job fail at 1152? is it running out of memory? other reasons?

@CatherineThomas-NOAA
Copy link
Contributor Author

Sorry I should have marked this as draft. This was part of my frantic push to get what I needed off of WCOSS2 before it went down for the week. I did not do any additional testing other than C1152 and that test was also fraught with other issues, so I did not take a good look at things science-wise.

It might be memory related but I'm not sure. It didn't give a descriptive error initially:

 Computing and Adding Surface Pressure Increment
nid001936.dogwood.wcoss2.ncep.noaa.gov: rank 125 died from signal 9
Error with calc_analysis.x for deterministic resolution, exit code=137

I added print statements and found that most pe's died during the read_vardata call, but some of them made it through. Since I noticed that the other read_vardata calls had the slices, I tried adding them and the job plus the next few cycles succeeded. I wasn't really sure if this was the right thing to do or if there was another reason why delp_inc wasn't read this way, so that's why I pinged @CoryMartin-NOAA, but I should have been more explicit.

What do you think @CoryMartin-NOAA?

@CoryMartin-NOAA
Copy link
Contributor

I think what we need to do is something like this (if it's memory related):

call read_vardata(incncfile, 'delp_inc', work2d_inc, nslice=nlev, slicedim=3)
ps_inc(:,:) = work2d_inc(:,:) / (bk5(nlev) - bk5(nlev-1))

As I believe the increment is only the lowest layers' delp increment scaled by the bk value differences.

@RussTreadon-NOAA
Copy link
Contributor

@CoryMartin-NOAA 's suggestion works ... need to declare and deallocate work2d_inc and remove work3d_inc.

@CatherineThomas-NOAA
Copy link
Contributor Author

Thanks @CoryMartin-NOAA and @RussTreadon-NOAA. Should the fix go into PR #58?

@CoryMartin-NOAA
Copy link
Contributor

Yea, let's close the revert and open a 'bugfix' branch/PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Calc_analysis read fix for C1152
3 participants