
cesm build of cism is very slow #53

Closed
jedwards4b opened this issue Apr 20, 2022 · 14 comments

@jedwards4b
Contributor

CISM is one of the slowest builds in a CESM BMOM case.
I clocked 409 s on cheyenne in case PFS.ne30pg3z58_t061.B1850MOM.cheyenne_intel

@billsacks
Member

billsacks commented Apr 20, 2022

Thanks for opening this issue @jedwards4b. I have noticed this as well, mainly or only with the Intel compiler (GNU build times are fast, or at least they were a year ago, when I first noticed the slow Intel build times).

I noticed this got worse about a year ago:

  • From cism2_1_78: CISM build time was 182 seconds

  • From cismwrap_2_1_79: CISM build time was 378 seconds

@whlipscomb
Contributor

@jedwards4b and @billsacks, thanks for looking at this. Adding @Katetc to the thread. I'd very much like to identify and fix the problem. I usually use the gnu compiler for code development because intel is so slow.

What's the best way to approach the issue? Are there some general rules about code structures to avoid? Or good ways to identify the offending procedures or lines of code?

@billsacks
Member

I don't have any good strategies for approaching this. I would probably start by identifying the offending file(s) by looking at the build time of each file. I'm not sure if there's a way to get build time information for each file in the build log (@jedwards4b do you know?); if not, you could set GMAKE_J=1, then watch the build log output and see if it stalls out on a file. Assuming you can identify a problematic file, you could look at the diffs between cism2_1_78 and cismwrap_2_1_79 to see if anything looks like a likely culprit. But I'm not sure how easy it will be to identify that. My hope would be that we could identify an offending file without too much trouble, and then, if we're lucky, the diffs won't be too extensive and/or there will be something fairly obviously weird about the changes in that file.

@jedwards4b do you have any suggestions for a better way to look into this? Also, I'm wondering if, before spending a lot of time on this, it would be worth trying the build with a more recent version of the Intel compiler (we're using v19 on cheyenne, so 3 years old): it may be that the problem goes away with a more recent compiler version, in which case it might not be worth spending a lot of time trying to figure this out. However, I'm also not sure how hard it would be to get the build working with a newer Intel version.

@jedwards4b
Contributor Author

So if you look at the timestamps of the object files produced, I think you can get some idea of what is going on.
For example, this build started at 13:13, as evidenced by the timestamp of the Filepath file:
-rw-r--r-- 1 jedwards ncar 342 Apr 20 13:13 Filepath

and ended at 13:20 with the nuopc cap file
-rw-r--r-- 1 jedwards ncar 76424 Apr 20 13:20 glc_comp_nuopc.o

It looks like most of the time was spent in compiling the glide_io file:
-rw-r--r-- 1 jedwards ncar 238870 Apr 20 13:14 glissade_velo.mod
-rw-r--r-- 1 jedwards ncar 73564004 Apr 20 13:18 glide_io.mod
-rw-r--r-- 1 jedwards ncar 449775 Apr 20 13:19 glide_stop.mod

@Katetc
Contributor

Katetc commented Apr 20, 2022

Thanks for pointing this out, guys. I've been looking at it this afternoon (starting to have more time for land ice work!) and I do see 4 minutes or so spent building glide_io.F90. This file is auto-generated at build time, but the generation step doesn't actually seem to be the slow part; the slow part is the compilation of the generated file. Now, I know a big difference between cism2_1_78 and cismwrap_2_1_79 was the number of namelist fields: we added several new namelist and output fields between these tags, and I'm not sure, but I think glide_io.F90 became much longer as a result. I'm also noticing that this file uses an unusual C-preprocessor #define approach to alias the output and input file variables:
#define NCO outfile%nc
#define NCI infile%nc

And both of these c-def variables are referenced many, many times:
if (.not.outfile%append) then
   status = parallel_def_dim(NCO%id,'x0',model%parallel%global_ewn-1,x0_dimid)
else
   status = parallel_inq_dimid(NCO%id,'x0',x0_dimid)
endif

This pattern of referencing derived-type components through c-defined aliases is not something I've seen very often. I could see the Intel Fortran compiler (or another Fortran compiler) having some issues with it.

@whlipscomb
Contributor

@Katetc, Nice sleuthing. If you have time tomorrow, let's follow up and talk about whether we can get the same functionality without the c-def variables.
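
For reference, here is a minimal sketch of how the same access pattern from the snippet above could be written without the macro, using an associate block in place of #define NCO outfile%nc; the routine and variable names are taken from that snippet, not from the full file, and whether this actually changes Intel compile times is an open question:

! Hedged sketch: alias outfile%nc with an associate block rather than a cpp macro,
! so the compiler sees an ordinary local name instead of a textual substitution.
associate (NCO => outfile%nc)
   if (.not.outfile%append) then
      status = parallel_def_dim(NCO%id,'x0',model%parallel%global_ewn-1,x0_dimid)
   else
      status = parallel_inq_dimid(NCO%id,'x0',x0_dimid)
   endif
end associate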

@jedwards4b
Contributor Author

I would be surprised if the cpp macros were the cause of the slowdown.

@whlipscomb
Contributor

@jedwards4b, is there another possible explanation?

@jedwards4b
Contributor Author

The file glide_io.F90 is autogenerated, but that step happens very quickly. It is the Fortran compile of the autogenerated file that is taking so long. I timed it at 4:47 with -O2 and 4:14 with -O0. Subroutine glide_io_create is some 7000 lines.

@whlipscomb
Contributor

@jedwards4b, Indeed it's a long file, but there are other big files in CISM that compile in a few seconds. I'm wondering if there are specific structures in the autogenerated file that trip up the Intel compiler (but which the gnu compiler, for whatever reason, handles more efficiently). If we can identify those structures, then we may be able to modify the autogenerate script to do things differently.

@whlipscomb
Contributor

Here's another possibility. At the end of module glide_io.F90 there are many accessor subroutines, of the form glide_set_field(data, inarray) and glide_get_field(data, outarray). Each subroutine uses four modules (glimmer_scales, glimmer_paramets, glimmer_physcon, glide_types) without an 'only' specification. Is it taking the compiler a long time to bring in the other modules? If so, we could either figure out a way to add the appropriate 'only', or do without these subroutines entirely. The used modules, especially glide_types, have grown over time.
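
As a concrete (and hedged) illustration of what adding 'only' could look like, here is a sketch of one accessor of that form; the type name glide_global_type, the constant thk0, and the component data%geometry%thck are assumptions for illustration, not copied from the autogenerated file:

! Sketch only: glide_global_type, thk0, and data%geometry%thck are assumed names.
subroutine glide_get_field(data, outarray)
   use glide_types,      only: glide_global_type
   use glimmer_paramets, only: thk0
   implicit none
   type(glide_global_type), intent(in)  :: data
   real, dimension(:,:), intent(out)    :: outarray
   ! Only the two symbols named above are imported, instead of use-ing
   ! all four modules without an 'only' clause.
   outarray = data%geometry%thck * thk0
end subroutine glide_get_field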

@billsacks
Member

It also seems possible that just having so many separate use statements could cause problems, whether or not they have an "only" clause. What about consolidating them so that they appear at the top of the module rather than being separately listed for each subroutine?
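
A hedged sketch of that consolidation, using the module names from the comments above; the accessor body and the type/field names are illustrative, not the real autogenerated code:

module glide_io
   ! Bring the shared modules in once at module scope ...
   use glimmer_scales
   use glimmer_paramets
   use glimmer_physcon
   use glide_types
   implicit none
contains
   ! ... so each accessor no longer repeats its own use statements.
   subroutine glide_get_field(data, outarray)
      type(glide_global_type), intent(in)  :: data   ! type assumed to come from glide_types
      real, dimension(:,:), intent(out)    :: outarray
      outarray = data%geometry%thck                  ! illustrative field access
   end subroutine glide_get_field
end module glide_io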

@whlipscomb
Contributor

@billsacks, That's a good suggestion, and easy to implement. I'll give it a try.

@Katetc
Contributor

Katetc commented Dec 11, 2023

This change was implemented in CISM PR #57 and PR #58, both contained in CISM tag cism_main_2.01.013 and included in CISM wrapper tag cismwrap_2_1_97, which will be included in cesm2_3_alpha17a. Marking as addressed and closing the issue.

Katetc closed this as completed Dec 11, 2023