
Fix azimuth, split same-date ARs, new output CSVs, finish QC notebook #18

Merged: 32 commits into main from fix_azimuth, Sep 11, 2023

Conversation

@Joshdpaul (Collaborator) commented on Aug 25, 2023:

This (resubmitted) PR:

  • deletes the 'AR_pipeline' notebook
  • fixes the azimuth calculation for timestep ARs by using the geographic azimuth of the major-axis endpoints, rather than the bottom-left and top-right corners of the bounding box. This allows for NW-oriented shapes and slightly decreases the number of detected ARs.
  • during event aggregation, non-overlapping event polygons that occur on the same date are now treated as distinct events in the output shp, allowing more than one event instance per date
  • computes a coastal impact point based on an event's mean IVT direction, and adds a coastal impact point output
  • adds shapefile column metadata CSV file for each shp output
  • adds a log CSV output that records the config, so that model run parameters can be saved alongside the outputs
  • finishes the QC notebook, including some heatmap-style plots with ENSO/PDO data plotted for context (spoiler alert: no big trends showing!)
  • moves all QC functions to their own module to clean up the QC notebook and allow a more flexible workflow
  • adds more detail to the bibliography
  • uses a new virtual environment "jp_ar_avalanche" built from the updated environment.yml file
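
For context, the geographic azimuth between the two major-axis endpoints (as opposed to a bounding-box diagonal) can be sketched as below. This is an illustrative spherical-Earth bearing calculation, not the module's actual implementation; the function name `geographic_azimuth` is hypothetical.

```python
import math

def geographic_azimuth(lon1, lat1, lon2, lat2):
    """Initial bearing in degrees clockwise from north, from endpoint 1 to
    endpoint 2, assuming a spherical Earth."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    # atan2 keeps the full [0, 360) range, so NW-oriented shapes are representable
    return math.degrees(math.atan2(x, y)) % 360.0
```

Because the result spans the full compass rather than only the NE/SE quadrants implied by bounding-box corners, a shape whose major axis points northwest gets an azimuth near 315° instead of being folded into the wrong quadrant.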

This PR closes #16 , closes #17 , and closes #10 . I couldn't find a better CRS to use than EPSG:3338, and since the spatial analysis with avalanche events will focus on Alaska, I think we just go with that (closes #7 ).

This PR makes no changes to download.py, so if you have already downloaded the data, you can skip that step. However, compute_ivt.py should be rerun in order to use the 90th percentile IVT.

Instructions: The README and config.py files should provide the basic setup for testing this PR. To test:

  • set your env var export AR_DATA_DIR=...
  • create your conda env from the environment.yml and activate it
  • delete any previous shp and csv outputs from previous model runs
  • inspect the config.py file, making sure to use 90 for ivt_percentile
  • python download.py (optional)
  • python compute_ivt.py
  • python ar_detection.py
  • check out the AR_QC.ipynb notebook

Note that Nathan requested data from 1980 to present, to match up with the date range of his avalanche database. In the interest of time, I'd like you to just test this PR using the data already downloaded; I will run the whole pipeline from 1980-present and package that data for Nathan.

@Joshdpaul (Collaborator, Author) commented:

Hey @charparr and @kyleredilla - I pushed a lot more changes to this fix_azimuth branch. Barring any major faults you may find, I think this should be the last major push for this project! If it tests well, I will be ready to run the pipeline from 1980 onwards and deliver the data to John & Nathan.

I think the merge conflict here is entirely due to the fact that I deleted one of the notebooks. 🤞 If we need to work through that together, please let me know!

@charparr (Member) left a comment:


I successfully executed the pipeline here! I performed the following steps:

  • Resolved merge conflicts
  • Created a fresh environment using the mamba package manager. I asked for the following packages:
xarray
pyproj
geopandas
pandas
tqdm
scipy
skimage
haversine
shapely
rasterio
cdsapi
matplotlib
seaborn
jupyterlab
dask
scikit-image
rioxarray

I did this because creating the environment from the existing .yml file was taking forever - this is a known conda quirk and is the raison d'être for alternative package managers like mamba.

  • Executed the download.py script - while not strictly needed for this PR, I did this because I was working with a different env
  • Executed the compute_ivt.py script without providing arguments (i.e. using the defaults). This computed 90th percentile IVT values.
  • Executed the ar_detection.py script without providing arguments (i.e. using the defaults).
  • Executed the AR_QC.ipynb notebook
  • Executed the AR_avalanche_exploration.ipynb notebook
All of these scripts and notebooks executed successfully with zero issues!

This is a great body of work and as far as I can tell it is ready for a "production" run, meaning running the pipeline from 1980 onward and then sharing results with our collaborators.

The only change I'll request here is to settle the Python environment on a more easily resolvable package spec.

@charparr (Member) left a comment:


We'll add that list of packages to the README and in the subsequent branch do the following to settle the env:

  • Create a new conda env named “atmospheric_rivers” or similar
  • Test in a production run and in review
  • Replace environment.yml file accordingly
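
As a sketch of where that might land, here is a minimal environment.yml built from the package list in the review above. The env name follows the "atmospheric_rivers" suggestion; the conda-forge channel is an assumption, and the duplicate skimage entry is collapsed into scikit-image (its conda package name).

```yaml
# Hypothetical environment.yml sketch; channel choice and lack of version pins are assumptions.
name: atmospheric_rivers
channels:
  - conda-forge
dependencies:
  - python
  - xarray
  - pyproj
  - geopandas
  - pandas
  - tqdm
  - scipy
  - scikit-image
  - haversine
  - shapely
  - rasterio
  - cdsapi
  - matplotlib
  - seaborn
  - jupyterlab
  - dask
  - rioxarray
```

Leaving versions unpinned and resolving everything from a single channel is one way to avoid the slow-solve behavior described earlier; exact pins could be captured afterward with a lockfile or an export of the solved env.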

@charparr charparr merged commit 19292fa into main Sep 11, 2023
@charparr charparr deleted the fix_azimuth branch September 11, 2023 21:53