fix: handle empty HRRR data files via linear imputation #245

danielolsen · 2021-12-06T22:47:26Z

Purpose

When generating wind power profiles from HRRR data, gracefully handle any missing data via linear interpolation. Closes #244.

What the code is doing

The impute module is moved from prereise.gather.winddata.rap.impute to prereise.gather.winddata.impute, and a new linear interpolation method is added, which should perform well on small data gaps.
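As a rough illustration of what linear imputation over a small gap looks like, here is a minimal sketch using NumPy. The function name `impute_linear` and the sample values are hypothetical and are not taken from the impute module; the actual method added in this PR may differ in signature and in how it treats gaps at the edges of the series.

```python
import numpy as np


def impute_linear(values):
    """Fill NaN gaps in a 1-D series by linear interpolation (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    missing = np.isnan(values)
    if missing.all():
        raise ValueError("cannot interpolate a series with no valid points")
    idx = np.arange(len(values))
    filled = values.copy()
    # np.interp draws straight lines between the valid neighbors of each gap.
    # Gaps at either end are held at the nearest valid value, not extrapolated.
    filled[missing] = np.interp(idx[missing], idx[~missing], values[~missing])
    return filled


# Made-up hourly wind speeds with a two-hour gap and a missing final hour.
speeds = [4.0, np.nan, np.nan, 7.0, 8.0, np.nan]
filled_speeds = impute_linear(speeds)
print(filled_speeds)
```

A straight line between neighbors is a reasonable guess when only a few consecutive hours are missing; for long gaps it would flatten out real variability.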

Within prereise.gather.winddata.hrrr.calculations, calculate_pout is refactored to first build an array of all wind speed magnitudes obtained from the NOAA grib files (filling in NA when files are empty), then impute missing values as necessary, and finally convert wind speeds to wind powers.
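The three stages described above (collect speeds with NA for empty files, impute, convert to power) can be sketched as follows. This is not the code in calculate_pout: the `readings` dictionary stands in for the grib queries, all function names are hypothetical, and the cubic power curve is a toy placeholder rather than the turbine curves PreREISE uses.

```python
import numpy as np


def build_speed_array(readings, dts, farm_ct):
    """Stage 1: one row per timestamp, NaN where the source file was empty."""
    speeds = np.full((len(dts), farm_ct), np.nan)
    for i, dt in enumerate(dts):
        try:
            speeds[i, :] = readings[dt]  # stand-in for querying a grib file
        except KeyError:
            continue  # empty or missing file: leave the whole row as NaN
    return speeds


def impute_columns(speeds):
    """Stage 2: linearly interpolate each wind farm's column over time."""
    idx = np.arange(speeds.shape[0])
    out = speeds.copy()
    for j in range(speeds.shape[1]):
        col = out[:, j]  # a view, so assigning into it updates `out`
        missing = np.isnan(col)
        col[missing] = np.interp(idx[missing], idx[~missing], col[~missing])
    return out


def toy_power_curve(speeds, rated=12.0, cut_out=25.0):
    """Stage 3: toy normalized power curve (an assumption, for illustration only)."""
    power = np.clip((speeds / rated) ** 3, 0.0, 1.0)
    return np.where(speeds >= cut_out, 0.0, power)


dts = [0, 1, 2, 3]  # hour indices; hour 2 has no data
readings = {0: [6.0, 12.0], 1: [8.0, 12.0], 3: [12.0, 30.0]}

raw = build_speed_array(readings, dts, farm_ct=2)
filled = impute_columns(raw)
power = toy_power_curve(filled)
print(power)
```

Imputing on wind speed before applying the power curve, as the PR does, keeps the interpolation in the physical quantity that was actually measured; the power curve is nonlinear, so interpolating power directly would generally give a different result.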

Testing

Unit tests still pass, and this has been tested end-to-end when generating 2020 wind power profiles for the HIFLD grid (see #227 (comment)). When downloading the 2020 data, four files downloaded empty, even after several attempts, suggesting that the data are missing from the NOAA server.

Usage Example/Visuals

from datetime import datetime

from powersimdata import Grid
from prereise.gather.winddata.hrrr.calculations import calculate_pout
from prereise.gather.winddata.hrrr.hrrr import retrieve_data

# Date range covering 2020-01-01 through 2021-01-01
start_dt = datetime.fromisoformat("2020-01-01")
end_dt = datetime.fromisoformat("2021-01-01")
# Directory where the HRRR files are stored
directory = "./"

# Select onshore and offshore wind farms from the HIFLD grid
grid = Grid("USA", "hifld")
wind_farms = grid.plant.query("type == 'wind' or type == 'wind_offshore'").copy()
wind_farms["state_abv"] = wind_farms.zone_id.map(grid.model_immutables.zones["id2abv"])

# Download the HRRR data, then compute wind power profiles from it
retrieve_data(start_dt=start_dt, end_dt=end_dt, directory=directory)
df = calculate_pout(wind_farms=wind_farms, start_dt=start_dt, end_dt=end_dt, directory=directory)

Time estimate

15-30 minutes.

jenhagg

Makes sense

prereise/gather/winddata/impute.py

rouille · 2021-12-07T06:20:21Z

prereise/gather/winddata/hrrr/calculations.py

+            )
+            for j in range(wind_farm_ct)
+        ]
+        for i, _ in tqdm(enumerate(dts))


Do we need the enumerate here? It seems we just need the length of dts.

enumerate() vs. range(len()), which is more pythonic?

for i in tqdm(dts)?

@jon-hagg I think we will need the index rather than the element of dts here.

tqdm(dts, total=len(dts)) would work, but I'm not sure that's better than enumerate/range. It's unclear to me which of these is the most pythonic, but I think enumerate is fine.
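One practical difference between the options discussed above: an enumerate object has no length, while a list or a range does. A progress-bar wrapper that infers its total from `len()` of the iterable it is given can therefore only show a count, not a total, for a bare `enumerate(...)` unless the total is passed explicitly. The sketch below checks this with the standard library only; the `dts` values and the `has_len` helper are made up for illustration.

```python
def has_len(obj):
    """Return True if len(obj) works, False otherwise (hypothetical helper)."""
    try:
        len(obj)
    except TypeError:
        return False
    return True


dts = ["2020-01-01 00:00", "2020-01-01 01:00", "2020-01-01 02:00"]

# Both spellings yield the same indices.
by_enumerate = [i for i, _ in enumerate(dts)]
by_range = list(range(len(dts)))

# Only the list and the range can report their length up front.
print(has_len(dts), has_len(range(len(dts))), has_len(enumerate(dts)))
```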

We could also refactor how we build wind_speed_data (as a dataframe instead of a numpy array), and then we could build wind_power_data using apply calls instead of list comprehensions.

See #221 (comment).

Aha. Maybe I'll leave this alone for now then.

EDIT: Or maybe the pandas version can be made a little more transparent by instantiating the dataframe with index= and columns=...

Done. I tried to strike a good balance between compactness and readability.
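For readers following the thread, the dataframe-based idea (instantiate with index= and columns=, then use apply) could look roughly like the sketch below. It is not the code that was merged: the farm IDs, the speed values, and `toy_power_curve` are all hypothetical, and the cubic curve is a placeholder for a real turbine power curve.

```python
import numpy as np
import pandas as pd


def toy_power_curve(speeds, rated=12.0, cut_out=25.0):
    """Toy normalized power curve applied to one column (illustration only)."""
    power = ((speeds / rated) ** 3).clip(0.0, 1.0)
    # Keep the computed power below the cut-out speed, zero otherwise.
    return power.where(speeds < cut_out, 0.0)


# Rows are timestamps, columns are wind farm IDs (both made up here).
dts = pd.date_range("2020-01-01", periods=4, freq=pd.Timedelta(hours=1))
farm_ids = [101, 102]

wind_speed = pd.DataFrame(
    [[6.0, 12.0], [8.0, 12.0], [np.nan, np.nan], [12.0, 30.0]],
    index=dts,
    columns=farm_ids,
)

# Fill the empty hour by linear interpolation along the time axis.
wind_speed = wind_speed.interpolate(method="linear")

# apply() passes each column (one wind farm) to the curve as a Series.
wind_power = wind_speed.apply(toy_power_curve)
print(wind_power)
```

Labeling the axes at construction time means the imputed and converted frames keep the same timestamps and farm IDs without any separate bookkeeping.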

prereise/gather/winddata/impute.py

BainanXia

Good catch. Thanks!

chore: move impute module one level up

6803270

danielolsen requested review from rouille and BainanXia December 6, 2021 22:47

danielolsen self-assigned this Dec 6, 2021

danielolsen mentioned this pull request Dec 6, 2021

Add scripts to generate profiles for HIFLD grid #227

Open

1 task

danielolsen requested a review from jenhagg December 6, 2021 22:52

jenhagg approved these changes Dec 7, 2021


BainanXia reviewed Dec 7, 2021


prereise/gather/winddata/impute.py (outdated, resolved)

rouille reviewed Dec 7, 2021


BainanXia reviewed Dec 7, 2021


prereise/gather/winddata/impute.py (resolved)

BainanXia approved these changes Dec 7, 2021


danielolsen force-pushed the daniel/hrrr_gap_tolerant branch 2 times, most recently from 6817900 to ab47e24 Compare December 7, 2021 22:52

danielolsen added 2 commits December 8, 2021 12:00

feat: add linear imputation method

889cffd

fix: add try/except when querying from grib files, use linear imputation

535b4d9

danielolsen force-pushed the daniel/hrrr_gap_tolerant branch from ab47e24 to 535b4d9 Compare December 8, 2021 20:00

danielolsen merged commit 3ce5915 into develop Dec 8, 2021

danielolsen deleted the daniel/hrrr_gap_tolerant branch December 8, 2021 20:07

danielolsen mentioned this pull request Jan 5, 2022

chore: merge develop into master for v0.4.1 release #252

Merged

danielolsen commented Dec 6, 2021

jenhagg left a comment

rouille Dec 7, 2021

danielolsen Dec 7, 2021

jenhagg Dec 7, 2021

BainanXia Dec 7, 2021 (edited)

jenhagg Dec 7, 2021

danielolsen Dec 7, 2021

rouille Dec 7, 2021

danielolsen Dec 7, 2021 (edited)

danielolsen Dec 8, 2021

BainanXia left a comment
