
[Ripple and LOFI] Testing workflows to process data for publication #1051

Open
LorneLeonard-NOAA opened this issue Jan 21, 2025 · 1 comment


Is your feature request related to a problem? Please describe.
We need strategies for handling large volumes of data and for preprocessing data for publication services.
We also require baseline benchmarks for performance evaluation.
Understanding our constraints here will guide our upstream and downstream workflows.

Describe the solution you'd like
A fast and scalable preprocessing workflow.


LorneLeonard-NOAA commented Jan 21, 2025

Using the gSSURGO national dataset for baseline benchmarks, since gSSURGO has complete coverage of the lower 48 states with ~300K polygons.

By using gSSURGO, we can:

  • Understand performance on a non-trivial example (~21 GB GDB, not including extraction, etc.).
  • Evaluate which OS environment to use.
  • Understand polygon performance with a reasonably large number of polygons.
  • Understand domain decomposition strategies and their performance.
  • Develop strategies to handle polygon "correctness" (winding order, CCW/CW, holes, etc.); see the sketch below.
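
As a concrete starting point, here is a minimal sketch of the read benchmark and the ring-orientation cleanup, assuming Python with geopandas and shapely. The GDB path is a hypothetical placeholder, "MUPOLYGON" follows the gSSURGO map-unit polygon layer naming, and the exact timing/decomposition harness is still to be decided:

```python
# Minimal benchmark sketch, not the final workflow. Assumes geopandas +
# shapely are installed; GDB_PATH is a hypothetical path to the extracted
# file geodatabase.
import time

import geopandas as gpd
from shapely.geometry.polygon import orient
from shapely.validation import make_valid

GDB_PATH = "gSSURGO_CONUS.gdb"  # hypothetical local path
LAYER = "MUPOLYGON"             # gSSURGO map-unit polygon layer

# Time a full read of the national polygon layer. pyogrio (geopandas'
# default I/O engine in recent releases) reads FileGDB layers via GDAL's
# OpenFileGDB driver, so no ESRI license is required.
start = time.perf_counter()
gdf = gpd.read_file(GDB_PATH, layer=LAYER, engine="pyogrio")
print(f"Read {len(gdf):,} polygons in {time.perf_counter() - start:.1f} s")

# Enforce polygon "correctness": repair invalid geometries, then apply a
# consistent winding convention (CCW exteriors, CW holes) via orient().
def normalize(geom):
    if not geom.is_valid:
        geom = make_valid(geom)  # may return a MultiPolygon
    if geom.geom_type == "Polygon":
        return orient(geom, sign=1.0)
    if geom.geom_type == "MultiPolygon":
        return type(geom)([orient(p, sign=1.0) for p in geom.geoms])
    return geom

gdf["geometry"] = gdf.geometry.apply(normalize)
```

Running the same script across candidate OS environments, and again with chunked (per-tile or per-state) reads, would give the baseline and domain-decomposition numbers, respectively.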

@LorneLeonard-NOAA LorneLeonard-NOAA added this to the V2.x.x milestone Jan 29, 2025