Add more flexibility for loading EIA/EPA data when building HIFLD grid #246

Open
1 task
danielolsen opened this issue Dec 13, 2021 · 0 comments
Assignees
Labels
feature request Request for a new feature. (Only lives in Backlog) hifld Related to ingestion of the HIFLD data


  • Is your feature request essential for your project?

Describe the workflow you want to enable

Currently, we load EIA and EPA data for a specific year from CSVs on our blob storage. These were either downloaded as zip files (EPA AMPD) or created manually from xlsx files (EIA Form 860). I wish there were more user-accessible flexibility, both in being able to download data for a given year (or a given month, since EIA has monthly Form 860M releases in between the annual Form 860 releases) and in being able to obtain data from different sources (e.g. from a local copy of Catalyst Cooperative's PUDL database, or their web API).

Describe your proposed implementation

Additional functions could be added to prereise.gather.griddata.hifld.data_access.load to read data from different sources. Additional parameters could be added to the highest-level functions within prereise.gather.griddata.hifld.data_process to specify which data sources to read from at the start of processing. The same parameters could then be added to prereise.gather.griddata.hifld.orchestration.create_csvs and passed through to the data_process functions.

Additional context

Catalyst Cooperative currently makes a subset of the data we need available via a Datasette interface (which can be read directly from pandas). The full dataset is not available there; it is distributed as a SQLite database.
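
For the SQLite route, a loader could be as small as the sketch below. It uses the standard-library sqlite3 module against an in-memory database; the table name demo_generators and the columns report_year, plant_id, and capacity_mw are made up for the demo and are not PUDL's actual schema. The same connection object could also be handed to pandas.read_sql_query to get a DataFrame instead of a list of dicts.

```python
import sqlite3
from typing import List


def load_table_for_year(
    conn: sqlite3.Connection,
    table: str,
    year: int,
    year_column: str = "report_year",  # hypothetical column name
) -> List[dict]:
    """Return all rows of `table` whose `year_column` equals `year`."""
    # Identifiers cannot be bound as SQL parameters, so validate them first.
    if not (table.isidentifier() and year_column.isidentifier()):
        raise ValueError("table and year_column must be plain identifiers")
    query = f'SELECT * FROM "{table}" WHERE "{year_column}" = ?'
    cursor = conn.execute(query, (year,))
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Demo against an in-memory database with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_generators (plant_id INTEGER, report_year INTEGER, capacity_mw REAL)"
)
conn.executemany(
    "INSERT INTO demo_generators VALUES (?, ?, ?)",
    [(1, 2019, 50.0), (1, 2020, 55.0), (2, 2020, 120.0)],
)
rows_2020 = load_table_for_year(conn, "demo_generators", 2020)
```

A monthly variant (for Form 860M-style releases) would presumably filter on a date column rather than a year, but that depends on how the source database stores release dates.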
