Feature #3024 and #3030 Series-Analysis GRAD #3036

Open
wants to merge 19 commits into
base: develop

JohnHalleyGotway (Collaborator) commented Dec 12, 2024

This pull request implements the enhancements described in issues MET#3024 and MET#3030. I originally made the changes for MET#3024 on a branch named feature_3024_GRAD and then created the new feature_3030_series_analysis_GRAD branch from it. I am combining them into one PR to make the review process more efficient.
This PR includes all the following changes:

For MET#3024:

  • Adds 4 new columns to the GRAD line type written by Grid-Stat: FGMAG, OGMAG, MAG_RMSE, LAPLACE_RMSE
  • Updates Stat-Analysis to parse the new columns when reading GRAD lines.
  • Updates the documentation:
    • Adds 4 new rows to the GRAD line type table in the Grid-Stat chapter.
    • Adds equations to Appendix C to define their computation.
    • Adds a reference to the DRAFT PAPER about sharpness (please advise if/how this reference should be updated!).

For MET#3030:

  • Adds a new gradient dictionary and output_stats.grad entry to the default Series-Analysis config file.
  • Updates all Series-Analysis config files with these changes.
  • Updates logic of Series-Analysis to compute gradient statistics for each gradient requested in the gradient dictionary.
  • Updates the documentation:
    • Moves description of the gradient dictionary from the Grid-Stat chapter to the "common config entries" chapter.
    • Notes that output_stats.grad can be set to "ALL" to facilitate aggregation across multiple runs.
  • Updates the testing by reconfiguring an existing config test for precip to request that 2 gradients (sizes 1 and 3) be computed. Note that gradient stats are not especially meaningful for precip, but its missing data and 0 values make it a good case for software testing.
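
For orientation, the new Series-Analysis settings might look roughly like the fragment below. This is a sketch, not a copy of the updated config files. The dx/dy array layout is assumed to mirror the existing Grid-Stat gradient dictionary, the sizes 1 and 3 come from the reconfigured precip test, and the list form of the grad entry is assumed to follow the other output_stats entries.

```
gradient = {
   dx = [ 1, 3 ];   // assumed: one entry per requested gradient
   dy = [ 1, 3 ];
}

output_stats = {
   // ... existing entries unchanged ...
   grad = [ "ALL" ];   // assumed list syntax; "ALL" permits aggregation across runs
}
```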

Expected Differences

  • Do these changes introduce new tools, command line arguments, or configuration file options? [Yes]

    If yes, please describe:

    In the Series-Analysis config file, adds a new gradient dictionary and output_stats.grad option.

  • Do these changes modify the structure of existing or add new output data types (e.g. statistic line types or NetCDF variables)? [Yes]

    If yes, please describe:

  • Adds 4 new columns (FGMAG, OGMAG, MAG_RMSE, LAPLACE_RMSE) to the end of the existing GRAD line type, written by Grid-Stat.

  • Enhances Series-Analysis to compute/write GRAD statistics to its NetCDF output.

Pull Request Testing

  • Describe testing already performed for these changes:

    Manually ran Grid-Stat to confirm the logic for computing GRAD stats in a single run, using the existing unit tests.
    Manually ran Series-Analysis to confirm the logic for GRAD stats in a single run, and for aggregating them across multiple runs.

  • Recommend testing for the reviewer(s) to perform, including the location of input datasets, and any additional instructions:

  • Several things:

    • Confirm with @bgbrowntollerud that the implementation in MET matches the logic described in the source paper, and that the new equations in Appendix C are correct.
    • Inspect the differences flagged by the regression test for this PR to confirm that the modified output from Grid-Stat and Series-Analysis makes sense and that all differences are expected.
    • Review the documentation updates for clarity and accuracy.
  • Please find this feature branch compiled/available for testing on seneca in:

/d1/projects/MET/MET_pull_requests/met-12.1.0/beta1/MET-feature_3030_series_analysis_GRAD/bin
  • Do these changes include sufficient documentation updates, ensuring that no errors or warnings exist in the build of the documentation? [Yes]

  • Do these changes include sufficient testing updates? [Yes]
    Adds no new tests, but reconfigures existing ones, which causes differences in the output.

  • Will this PR result in changes to the MET test suite? [Yes]

    If yes, describe the new output and/or changes to the existing output:

  • 4 new columns added to all instances of the GRAD line type.

  • Modified output from 2 Series-Analysis runs that now include new gradient output variables.

Note that I inspected the differences flagged in this GHA testing workflow run. Differences exist in the following 9 files:

egrep -i "file1:|file2:|ERROR" comp_dir.log  | egrep -i -B 2 ERROR | grep file1 | cut -d':' -f2
 /data/output/met_test_truth/climatology_1.5deg/grid_stat_WMO_CLIMO_1.5DEG_240000L_20120410_000000V.stat
 /data/output/met_test_truth/grid_stat/grid_stat_GRIB1_NAM_STAGE4_120000L_20120409_120000V.stat
 /data/output/met_test_truth/grid_stat/grid_stat_GRIB1_NAM_STAGE4_120000L_20120409_120000V_grad.txt
 /data/output/met_test_truth/met_test_scripts/grid_stat/grid_stat_120000L_20050807_120000V.stat
 /data/output/met_test_truth/met_test_scripts/grid_stat/grid_stat_120000L_20050807_120000V_grad.txt
 /data/output/met_test_truth/met_test_scripts/stat_analysis/job_aggregate_GRAD.stat
 /data/output/met_test_truth/met_test_scripts/stat_analysis/stat_analysis.out
 /data/output/met_test_truth/series_analysis/series_analysis_AGGR_CMD_LINE_APCP_06_2012040900_to_2012041018.nc
 /data/output/met_test_truth/series_analysis/series_analysis_CMD_LINE_APCP_06_2012040900_to_2012041000.nc
  • I used vimdiff on seneca to look through the diffs in all .txt and .stat files and confirmed that they're all due to the 4 new columns being added to the end of the GRAD line type.
  • For the NetCDF Series-Analysis output, I see that the existing TRUTH output has 50 gridded fields and the updated OUTPUT now has 78, with 28 being added by setting output_stats.grad = "ALL" in the config file. That is 14 columns in the GRAD line type (TOTAL ... LAPLACE_RMSE) x 2 gradients: dx,dy = (1,1) and (3,3).
> ncdump -h series_analysis/series_analysis_CMD_LINE_APCP_06_2012040900_to_2012041000_TRUTH.nc  | grep "float series" | wc -l
50
> ncdump -h series_analysis/series_analysis_CMD_LINE_APCP_06_2012040900_to_2012041000_OUTPUT.nc  | grep "float series" | wc -l
78
  • So all of these differences are consistent with the code changes for this PR.
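
The field-count arithmetic above can be double-checked with a few lines of shell. The variable names are illustrative only, and the numbers are the ones quoted above rather than values read from the NetCDF files.

```shell
#!/bin/sh
# Counts quoted in this PR description (not computed from the files here).
grad_columns=14    # GRAD columns: TOTAL ... LAPLACE_RMSE
n_gradients=2      # dx,dy = (1,1) and (3,3)
truth_fields=50    # "float series" variables in the TRUTH file

# Each requested gradient adds one output field per GRAD column.
added=$((grad_columns * n_gradients))
expected=$((truth_fields + added))
echo "added=${added} expected=${expected}"
```

The expected total should match the "float series" count that ncdump reports for the OUTPUT file.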

  • Will this PR result in changes to existing METplus Use Cases? [Yes]

    If yes, create a new Update Truth METplus issue to describe them.
    The output from any METplus use case that writes the GRAD line type will also change.

  • Do these changes introduce new SonarQube findings? [No]

    If yes, please describe:
    The current develop branch flags 18,253 code smells overall.
    After making some changes to fix easy ones, I was able to reduce them in the feature_3030_series_analysis_GRAD branch down to 18,173 overall.

  • Please complete this pull request review by [Friday 1/17/25].

Pull Request Checklist

See the METplus Workflow for details.

  • Review the source issue metadata (required labels, projects, and milestone).
  • Complete the PR definition above.
  • Ensure the PR title matches the feature or bugfix branch name.
  • Define the PR metadata, as permissions allow.
    Select: Reviewer(s) and Development issue
    Select: Milestone as the version that will include these changes
    Select: Coordinated METplus-X.Y Support project for bugfix releases or MET-X.Y.Z Development project for official releases
  • After submitting the PR, select the ⚙️ icon in the Development section of the right hand sidebar. Search for the issue that this PR will close and select it, if it is not already selected.
  • After the PR is approved, merge your changes. If permissions do not allow this, request that the reviewer do the merge.
  • Close the linked issue and delete your feature or bugfix branch from GitHub.

…new columns to the existing GRAD line type.
…Stat to the common area and then referencing it in both Grid-Stat and Series-Analysis.
…dictionary and an entry for output_stats.gradient. Update the conf_info source code to parse them. Still need to update OTHER Series-Analysis config files and also update the logic in series_analysis.cc to compute GRAD statistics.
…ong_name attribute of the Series-Analysis output files.
…crementally across multiple runs. However, this can only be done when requesting that 'ALL' GRAD columns be written.
Projects
Status: 🔎 In review
1 participant