Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hocn calculation and flushing velocity issue #504

Merged

Conversation

davidclemenssewall
Copy link
Contributor

For detailed information about submitting Pull Requests (PRs) to the CICE-Consortium,
please refer to: https://github.com/CICE-Consortium/About-Us/wiki/Resource-Index#information-for-developers

PR checklist

  • Short (1 sentence) summary of your PR:
    Fixes two small bugs related to how the mass of melt pond water impacts melt pond drainage and synchronizes 'apnd' nomenclature.
  • Developer(s):
    @davidclemenssewall
  • Suggest PR reviewers from list in the column to the right.
    @dabail10 @eclare108213 @apcraig
  • Please copy the PR test results link or provide a summary of testing completed below.
    Ran the Icepack base_suite on conda_linux with a baseline comparison to a baseline from the latest 'main' branch. All tests passed except the comparison tests for cases that used mushy thermodynamics and level or topo ponds (see below). These tests were expected to fail due to the nature of this code change. I have not been able to test the impacts of these changes in CICE because the current icepack main branch fails to compile in CICE.
    FAIL conda_linux_smoke_col_1x1_diag1_run1year compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_run1year compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_bgcispol_debug compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_bgcnice_debug compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_zaero compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_run1year_swredist compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_dt30min_leap_run1year compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_dyn_run1year compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_fsd12_run1year_short compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_fsd1_run1year compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_run1year_snw30percent_snwgrain compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_run1year_snwitdrdg compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_calcdragio_debug_run1year compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_modal_run1year compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_fluxopenw_run1year compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_dyn_fluxopenw_run1year compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_run1year_snicartest compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_debug_run1year_snicar compare icepack.main different-data
    FAIL conda_linux_smoke_col_1x1_congel_debug_run1year compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_debug compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_diag1 compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_pondlvl compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_pondtopo compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_bgcispol compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_bgcnice compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_snwgrain_snwitdrdg_zaero compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_isotope compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_saltflux compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_dyn compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_fsd12_short compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_snwgrain_snwitdrdg compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_modal compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_fluxopenw compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_snicartest compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_snicar compare icepack.main different-data
    FAIL conda_linux_restart_col_1x1_congel compare icepack.main different-data
    36 of 215 tests FAILED
  • How much do the PR code changes differ from the unmodified code?
    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on CICE or any other models?
    • Yes
    • No
  • Does this PR add any new test cases?
    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/.)
    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please document the changes in detail, including why the changes are made. This will become part of the PR commit log.
    Currently, there are two bugs in how icepack computes the height of the ocean above the ice base in icepack_therm_mushy.F90 line 3141:

hocn = (ice_mass + hpond * apond * rhow + hsn * rhos) / rhow

First, rhow is the density of ocean water, but ponds that depress the ice surface are generally found to be fresh water. Thus, rhow in the numerator should be replaced by rhofresh.
Second, apond here is the category pond area tracer, however it is being treated as the category pond area fraction (see footnote below). For the level-ice scheme, the category pond area fraction is apnd*alvl.
This PR fixes both issues. There were also offsetting errors in the flushing_velocity and flush_pond subroutines that have been fixed.

I believe that the second bug was made more likely by using inconsistent naming conventions for the pond area tracer and pond area fraction. In icepack_meltpond_lvl.F90 and other parts of the code base, it appears that the general convention is to use apnd to denote the pond area tracer and apondn for the category or full-grid-cell pond area fraction (i.e., apondn = apnd*alvl for the level pond scheme). I propose that we follow this convention throughout the code base, such that apnd always refers to the pond area tracer. This PR makes these changes in the mushy and vertical thermodynamics modules.

Footnote: there is an alternate interpretation of using just the pond area tracer in the numerator for the level pond scheme which is that the level areas of an ice category separated from the deformed areas by borders of zero flexural rigidity (i.e., the thicker, pond-free, deformed ice cannot provide any upward buoyancy force to the level, ponded areas). If this is the intention, then I think we need to also use the level ice volume (vlvl) in the calculation of ice mass and the level-ice snow depth for consistency.

@davidclemenssewall davidclemenssewall changed the title fixed hocn and flushing velocity issue, passes basic restart test hocn calculation and flushing velocity issue Nov 9, 2024
@dabail10
Copy link
Contributor

dabail10 commented Nov 11, 2024

Here are the results of the QC test. Everything passes. This will only change answers when ponds are on. Yes, level ponds are on in the two simulations even though they don't really look that different.

ice_thickness_derecho_intel_smoke_gx1_44x1_medium_qc qc_base

ice_thickness_derecho_intel_smoke_gx1_44x1_medium_qc qc_te

ice_thickness_derecho_intel_smoke_gx1_44x1_medium_qc qc_te_minus_derecho_intel_smoke_gx1_44x1_medium_qc qc_base

if (tr_pond_lvl) then
w_down_max = (hpond * apnd * alvl) / dt
else
w_down_max = (hpond * apnd) / dt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix indentation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -107,7 +107,7 @@ subroutine thermo_vertical (dt, aicen, &
congel, snoice, &
mlt_onset, frz_onset, &
yday, dsnow, &
prescribed_ice)
prescribed_ice, alvl)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest moving alvl up in the calling arguments. Makes no difference technically, but seems like it belongs with apnd and hpond, for instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@apcraig
Copy link
Contributor

apcraig commented Dec 4, 2024

Is this ready to merge? Is any other discussion needed?

@davidclemenssewall
Copy link
Contributor Author

I believe this is ready to merge but please let me know if further input is needed.

@eclare108213 eclare108213 self-requested a review December 5, 2024 17:15
@eclare108213
Copy link
Contributor

I'm sorry I haven't had a chance to look at this carefully until now. I agree with the change from rhow to rhowfresh. Thanks for noticing that.

The naming convention for ponds is that apnd is the tracer and apond is the area fraction, with apond=apnd*alvl for the level-pond scheme and apond=apnd for the others, as you point out. I disagree with sending both apnd and alvl into the physics routines and then using if blocks to distinguish among the pond schemes. We should be sending the same physical quantity (pond fraction) into the physics routines -- apond is (or should be) used in all of those. Also, hpond and apond should be computed and used consistently throughout the code.

There might still be a bug, e.g. if we aren't actually sending in apond when needed. But be careful with whether the fraction is of the grid cell area or of the ice. The variable names might not be consistently defined.

I think you already understand the following -- I'm writing it out to remind myself how it works and to make sure we are on the same page:

The thermo routines act on whole categories of ice thickness, i.e. the full ice area fraction covered by each thickness category; the category column is considered uniform (a single point in space) for purposes of the vertical thermo calculation. A fraction of the category area is ridged; a fraction is level; a fraction is snow-covered; a fraction has melt ponds on it, etc. Since the thermodynamics calculation is for the whole category area, we average over the relevant fractions as needed.

The area tracers are carried on area fractions of things. For topo ponds, the melt pond tracer apndn is the fraction of the ice in a category; sum(apndn*aicen) over n categories is the pond fraction of the full grid cell area. Level-ice ponds are carried on only the level portion of the ice; in that case the tracer apndn is the fraction of the level ice and apndn*alvln is the fraction of the ice (apond).

Your changes are consistent with this, which then prompts the question of which melt pond fraction is being sent in at the top level. If it is wrong, then I would prefer that it be fixed at a higher level, probably in the icepack_step_* routines (but check what the icepack and cice drivers are doing). Is there an issue with doing that instead of modifying the lower-level routines? I looked back at older versions of the code (9+ years ago) and the subroutine arguments don't seem to have changed functionally, although it's difficult to tell because the code has been refactored and interfaces changed a lot since then.

Re naming conventions: we try to stick to them, but we usually drop the category n when deep in the (vertical) physics routines. We continue to clean up from 30 years of coding...

To sum up, we should definitely fix the water density being used and there may well be a bug in the pond area used for level-ice ponds. After my brief look at the code, I tend to agree with you that there is. If so, I'd expect the new code to have smaller pond areas and so less pond influence on the ice pack.

@davidclemenssewall
Copy link
Contributor Author

That all makes sense. I have refactored the code to move the calculation of apond (alvl*apnd for the level ice, else apnd) into icepack_step_therm1 and pass apond into the physics routines. While doing this, I discovered a related bug in how the formdrag uses the pond fraction (basically, it uses apnd as if it was apond), I will fix that in the next commit.

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @davidclemenssewall. As long as you're checking apnd/apond, are you also checking hpnd/hpond to make sure those average depths are correct?

columnphysics/icepack_therm_mushy.F90 Outdated Show resolved Hide resolved
@davidclemenssewall
Copy link
Contributor Author

Most recent commit now passes apond into neutral_drag_coeffs. I also removed hpnd and ipnd from that subroutine since they are not used in it.

@davidclemenssewall
Copy link
Contributor Author

Thanks @davidclemenssewall. As long as you're checking apnd/apond, are you also checking hpnd/hpond to make sure those average depths are correct?

I have not done an exhaustive check, but I have not encountered any errors with hpnd/hpond. Is there a semantic distinction between 'hpnd' and 'hpond'? Within the icepack code itself it looks to me like they are used interchangeably and they always refer to the depth of the water within the area covered by ponds. As far as I'm aware it is only in the CICE history driver that we convert into the mean depth of the water within the entire grid cell, averaging together ponded and unponded ice.

@eclare108213
Copy link
Contributor

they always refer to the depth of the water within the area covered by ponds.

Good, that makes sense. The pond depth should be defined as a tracer on the pond area. Thanks.

@apcraig
Copy link
Contributor

apcraig commented Dec 16, 2024

I'm running some test suites on derecho now with Icepack and CICE. If things look OK, I'll approve and merge unless I hear otherwise. Do we need to rerun QC or do anything else to check results given most recent changes?

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving with just some minor cleanup needed. I think full testing with the test suites is sufficient, without additional QC runs. Thank you!

columnphysics/icepack_therm_mushy.F90 Outdated Show resolved Hide resolved
columnphysics/icepack_therm_mushy.F90 Outdated Show resolved Hide resolved
columnphysics/icepack_therm_mushy.F90 Outdated Show resolved Hide resolved
@apcraig
Copy link
Contributor

apcraig commented Dec 16, 2024

This test fails in CICE, derecho_intel_restart_gbox128_8x1_diag1,

 (abort_ice)ABORTED: 
 (abort_ice) called from ice_history_write.F90
 (abort_ice) line number          958
 (abort_ice) error = 
 (ice_check_nc) NetCDF: Numeric conversion not representable, (ice_write_hist) ERROR: writing variable apond

cice               000000000061CD14  ice_gather_scatte         232  ice_gather_scatter.F90
cice               00000000007B48A3  ice_history_write         949  ice_history_write.F90
cice               000000000072C8F7  ice_history_mp_ac        4134  ice_history.F90
cice               0000000000412A15  cice_initmod_mp_c          54  CICE_InitMod.F90
cice               000000000040DDD5  MAIN__                     43  CICE.F90

This is writing out the history initial condition file. It does not fail on startup, just on restart. And just for this gbox128 case, not for other gx3 cases that I've tried (still more cases to test). My guess is that apond is being read off the restart and contains NaNs or something. And that probably happens because some physics is off for this case and the apond restart field had bad data. Or something like it. @davidclemenssewall, can you look into this. I can too if you need me to. Thanks.

@davidclemenssewall
Copy link
Contributor Author

This test fails in CICE, derecho_intel_restart_gbox128_8x1_diag1,

 (abort_ice)ABORTED: 
 (abort_ice) called from ice_history_write.F90
 (abort_ice) line number          958
 (abort_ice) error = 
 (ice_check_nc) NetCDF: Numeric conversion not representable, (ice_write_hist) ERROR: writing variable apond

cice               000000000061CD14  ice_gather_scatte         232  ice_gather_scatter.F90
cice               00000000007B48A3  ice_history_write         949  ice_history_write.F90
cice               000000000072C8F7  ice_history_mp_ac        4134  ice_history.F90
cice               0000000000412A15  cice_initmod_mp_c          54  CICE_InitMod.F90
cice               000000000040DDD5  MAIN__                     43  CICE.F90

This is writing out the history initial condition file. It does not fail on startup, just on restart. And just for this gbox128 case, not for other gx3 cases that I've tried (still more cases to test). My guess is that apond is being read off the restart and contains NaNs or something. And that probably happens because some physics is off for this case and the apond restart field had bad data. Or something like it. @davidclemenssewall, can you look into this. I can too if you need me to. Thanks.

I think that the most likely source of the error is when we convert from apond back to apnd we divide by alvl. Perhaps in that test there are some extremely small values of alvl? I added an if block to zero out apnd if alvl <= puny. Would you mind rerunning the failing test?

@apcraig
Copy link
Contributor

apcraig commented Dec 16, 2024

Thanks @davidclemenssewall, latest update fixes the problem and works with both debug flags on and off, so that's good. Will continue testing and keep everyone posted.

@apcraig
Copy link
Contributor

apcraig commented Dec 17, 2024

@apcraig apcraig merged commit 43ead56 into CICE-Consortium:main Dec 17, 2024
2 checks passed
apcraig added a commit to CICE-Consortium/CICE that referenced this pull request Dec 17, 2024
… bfb (#1002)

Update Icepack to #43ead56380b, hocn and flushing velocity issue, not bfb.

See CICE-Consortium/Icepack#504
@davidclemenssewall davidclemenssewall deleted the hocn_mushy_thermo_fix branch December 18, 2024 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants