Test InMAP for multiday run with 1-km CMAQ input #9

pmartien · 2022-08-29T18:12:31Z

Steps to Close

@bkoo-git at BAAQMD team to supply needed modeling files
@yuzhou-wang and @bujinb at UW team to run multi-day InMAP test with 1-km CMAQ inputs
@yuzhou-wang and @bujinb at UW team to report back via this GitHub issue: any issues? Any follow-up needed?
@bkoo-git at BAAQMD team to process the WRF and CMAQ data for InMAP for the whole of 2018 using the tool/script provided by the UW team and send the processed file to the UW team
@yuzhou-wang and @bujinb at UW team to test the InMAP data processed by the BAAQMD team and report back if there's any issue

yuzhou-wang · 2022-08-29T19:58:01Z

We have finished the test of the multi-day InMAP with 1-km CMAQ, and have sent the scripts.

pmartien · 2022-08-29T20:48:38Z

Thanks, @bkoo-git, @yuzhou-wang. Any issues to report? What are next steps?

bkoo-git · 2022-08-29T21:16:21Z

I've finished testing the scripts prepared by @yuzhou-wang on our cluster machine. I believe the next step will be building InMAP using the preprocessed WRF/CMAQ data. @yuzhou-wang, any guidance?

bkoo-git · 2022-09-09T00:35:14Z

Runtime for preprocessing the 1-km WRF and CMAQ data for InMAP for the whole year of 2018:

The first step was to extract the required variables from daily WRF, CMAQ and MCIP files and combine them into a single daily file. This took ~6 hours per month of data on our cluster. To speed things up, all 12 months were processed simultaneously, with each month running on a separate cluster node.
The second step was to convert these daily files into InMAP meteorology and baseline chemistry input data. This step took 3.6 hours for the whole 2018 period on our cluster.
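
As a rough check on the figures above, here is a minimal sketch of the implied wall-clock time. It assumes that step 2 starts only after every month of step 1 has finished and that months are processed in equal batches across the available nodes. The function and constant names are illustrative and are not part of the preprocessing scripts.

```python
import math

# Figures quoted in the comment above.
STEP1_HOURS_PER_MONTH = 6.0   # approximate ("~6 hours" per month of data)
STEP2_HOURS_WHOLE_YEAR = 3.6
MONTHS = 12


def wall_clock_hours(nodes: int) -> float:
    """Estimate total wall-clock hours when months are spread over `nodes`.

    Assumption: one month per node at a time, and step 2 runs once, after
    all of step 1 has completed.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    batches = math.ceil(MONTHS / nodes)
    return batches * STEP1_HOURS_PER_MONTH + STEP2_HOURS_WHOLE_YEAR


parallel = wall_clock_hours(12)  # all months at once, as described above
serial = wall_clock_hours(1)     # hypothetical single-node run
print(f"12 nodes: ~{parallel:.1f} h, 1 node: ~{serial:.1f} h")
```

Under these assumptions the 12-node layout comes to roughly 9.6 hours, against roughly 75.6 hours on a single node.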

pmartien · 2022-09-14T20:55:22Z

Thanks @bkoo-git for the status update and for the questions about next steps! @yuzhou-wang, @bujinb: can you provide a status update? Do you see any issues with what @bkoo-git provided?
I'm trying to encourage more discussion and updates via GitHub so we can facilitate quick turnaround on simple blockers to our collective progress. If this doesn't work, I will call for more frequent project Zoom meetings, which I think will be less efficient and productive. :-)

pmartien · 2022-10-03T16:41:18Z

Hi InMAP-SFAB team,
Any progress to report? Updates for the group?
Thanks!

yuzhou-wang · 2022-10-03T20:09:10Z

I'm still working on testing the 1-km data, and will provide feedback by the end of this week.

pmartien · 2022-10-03T21:37:44Z

Great. Thank you @yuzhou-wang ! I look forward to your feedback!

bujinb · 2022-10-05T22:25:49Z

I have run InMAP and ISRM on Google Cloud. I'm currently working on running multiple InMAP runs in parallel on Compute Engine, but am facing issues.

bkoo-git · 2022-10-05T22:41:34Z

@bujinb Could you please clarify the issues? Is there anything wrong with the InMAP data file I processed, or are you having issues with running InMAP on the cloud?

bujinb · 2022-10-06T20:10:54Z

@bkoo-git I am primarily working on running InMAP on the cloud and am having issues with running multiple InMAP runs in parallel there. @yuzhou-wang is working with the CMAQ outputs.

pmartien · 2022-10-06T20:44:02Z

Thanks, @yuzhou-wang, @bujinb! So it sounds like the processed files we handed off to you are okay, but setting up multiple runs on the Google Cloud processors is an issue. Is the issue specific to InMAP, or does it affect any process run on multiple processors? Thanks for posting updates on GitHub!

bujinb · 2022-10-06T20:49:19Z

@pmartien A Google engineer thinks it is an InMAP issue, but Chris has used Kubernetes to run InMAP in parallel before, so it might not be an InMAP issue. We are trying to have regular meetings with Chris for help. We'll try to update on GitHub as much as we can. Thanks!

pmartien · 2022-10-06T20:53:43Z

@bujinb: Got it! Let us know if there's any way we can be helpful.

yuzhou-wang · 2022-10-11T05:48:52Z

I tested the new InMAP several times, but ran into the same problem each time: it generated infinite concentrations using the emissions in San Francisco. I'm still trying to find the cause. I will post updates when I find the cause or solve the problem.

pmartien · 2022-10-11T15:50:17Z

Thanks, @yuzhou-wang ! Let us know if there is any indication that the files we provided are causing/contributing to this problem. Are you only seeing the problem when submitting multiple InMAP runs? Or does it also occur with a single run?
Thanks again.

yuzhou-wang · 2022-10-12T06:47:55Z

@pmartien I tried both single and multiple InMAP runs, and used both one-day and whole-year InMAP data; all the tests generate infinite numbers. I looked into the InMAP data and found that there seems to be a problem with the calculation of dry deposition. Tracking further back to the wrfcmaq data, there are missing values in three wrfcmaq variables (rain water mixing ratio, cloud water mixing ratio, cloud fraction). I'm not sure whether the problem is caused by the WRF data itself, or by my calculation (getting wrfcmaq data from WRF and CMAQ). I'll look into the WRF data and try to find the cause of the problem in the following days.
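
The kind of missing-value check described above can be sketched as follows. This is only an illustration: it assumes each variable has been flattened to a plain list of floats and that missing values show up as NaN or infinity. The variable keys and sample numbers are placeholders, not values from the wrfcmaq files.

```python
import math


def count_missing(fields):
    """Return {variable name: number of NaN or infinite values}."""
    report = {}
    for name, values in fields.items():
        report[name] = sum(1 for v in values if not math.isfinite(v))
    return report


# Placeholder data for illustration only, not taken from the actual files.
sample = {
    "rain_water_mixing_ratio": [0.0, float("nan"), 1.2e-5],
    "cloud_water_mixing_ratio": [3.0e-4, 2.0e-4, float("nan")],
    "cloud_fraction": [0.1, 0.5, 0.9],
}

for name, n_missing in count_missing(sample).items():
    print(f"{name}: {n_missing} missing")
```

Running a scan like this on both the raw WRF fields and the derived wrfcmaq fields would help show at which stage the missing values first appear.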

yuzhou-wang · 2022-10-13T05:05:03Z

I have figured out the problem: there is a mismatch between the WRF layers and the wrfcmaq layers. The wrfcmaq vertical layers should start from 0 (ground level), but they started from -1 due to a small error in the Python preprocessing code. I revised the Python code and generated a new one-day InMAP, and it runs correctly. So I think we need to redo the whole-year InMAP preprocessing using the revised code. I'll run more tests to make sure that the revised code generates correct results. I'll send @bkoo-git the updated InMAP preprocessing code and a detailed guide to running the new InMAP this week.
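
The actual preprocessor code is not shown in this thread, so the following is only a sketch of why a start index of -1 may not raise an error in Python: negative indices wrap around to the end of a sequence. The `select_layers` helper and the layer labels are hypothetical.

```python
def select_layers(column, start, count):
    """Pick `count` consecutive layers from `column`, beginning at `start`."""
    return [column[k] for k in range(start, start + count)]


# Hypothetical vertical column, ordered from ground level to model top.
column = ["L0_ground", "L1", "L2", "L3_top"]

correct = select_layers(column, 0, 3)   # starts at ground level
shifted = select_layers(column, -1, 3)  # off by one: starts at index -1

print("correct:", correct)
print("shifted:", shifted)
```

In this sketch the shifted selection silently begins with the top layer instead of the ground layer, which is the sort of mismatch that only shows up later as unphysical results.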

bkoo-git · 2022-10-13T06:53:17Z

@yuzhou-wang Thanks for fixing the error! I'll re-process the wrfcmaq data once I receive the updated code.

pmartien · 2022-10-13T14:29:33Z

Thanks @yuzhou-wang, @bujinb for isolating this problem! And for keeping us updated on github; super helpful!

bujinb · 2022-10-13T14:56:17Z

We are still working on running multiple InMAP runs in parallel on Google Cloud. Chris's Kubernetes cluster can run 1250 InMAP runs at the same time, and we are hoping we can do the same or better. I heard from Yuzhou that your cluster is fast; I was wondering if we can run InMAP in parallel on your cluster. @bkoo-git Can we schedule a quick meeting?
Thanks
Bujin

bkoo-git · 2022-10-14T16:35:04Z

Status update:

@yuzhou-wang fixed the error and sent me the updated preprocessor code. I will re-do the InMAP preprocessing with the updated code and send her the new InMAP data for verification next week.
@bujinb and I arranged a meeting on Monday (Oct. 17, 2PM) to discuss running InMAP in parallel on the District cluster. Jeff Matsuoka will join the meeting. Let us know if anyone else wants to join.

pmartien · 2022-10-14T16:41:31Z

Great work, all. Thanks for the updates @bkoo-git!

yuzhou-wang · 2022-10-14T17:39:30Z

@bkoo-git @bujinb I'd also like to join the meeting about running InMAP in parallel. Can you send me the link? Thanks!

bujinb · 2022-10-17T22:56:55Z

@bkoo-git, @yuzhou-wang, Jeff and I had our meeting on running InMAP on your local cluster. It seems that running InMAP on the cloud will be the faster way to generate the new ISRM, as we can potentially run a thousand InMAP runs in parallel once we learn how to utilize Kubernetes. We have sent instructions for running InMAP on a local computer (in the Google Drive). @bkoo-git please update us when you try running it on your cluster.

bkoo-git · 2022-10-18T02:50:34Z

Status update:

CMAQ and WRF data for 2018 were re-processed with the updated preprocessor code, and the new InMAP data file produced was sent to @yuzhou-wang for verification.
Jeff and I will work on running InMAP on the District cluster with the instructions prepared by @yuzhou-wang.

pmartien · 2022-10-18T15:05:12Z

Thanks for the status updates, @bujinb and @bkoo-git. This sounds like good progress.

yuzhou-wang · 2022-10-21T18:04:07Z

I've tested the new InMAP using the year-2016 EGU emissions in San Francisco. The concentration estimates at 1-km resolution are higher (2 to 3 times higher) than the estimates from the national InMAP, but still look reasonable. I'll run more tests using other emission files.

bujinb · 2022-11-07T16:32:53Z

@bkoo-git In case you need help running InMAP on your local cluster, @yuzhou-wang and I are available.
Update on running InMAP on the cloud: we are still troubleshooting our attempts at running InMAP on Kubernetes Engine.

bkoo-git · 2022-11-07T17:17:31Z

Thanks, @bujinb! I did test the 2005 NEI test case (from the InMAP release page) on the District cluster, and the results look reasonable. However, I believe a better test would be to reproduce the results of the Bay Area test case @yuzhou-wang did using the InMAP data file created from the 2018 CMAQ/WRF data. I've asked @yuzhou-wang for the input files she used for her test, and received the files today. I will try to replicate her test case on our cluster this week and report back to you guys~

pmartien · 2022-11-07T22:17:14Z

Thanks again all for the status update. Much appreciated!

bkoo-git · 2022-11-10T22:55:03Z

A quick update:
I successfully ran @yuzhou-wang's SF test case on the District cluster and verified that my results and hers are identical. She said the test run took 1.5 hours on her lab computer. It took 44 minutes on the District machine (soma). As I believe the InMAP code is not threaded, I think the runtime difference simply reflects the clock speed difference between the processors used. Also, note that the test case doesn't include the full set of emissions used in our 2018 base case CMAQ simulation.

pmartien · 2022-11-10T23:42:39Z

Great news, @bkoo-git. What should our next steps be? Should we meet to discuss?

bkoo-git · 2022-11-11T00:03:21Z

I think now might be a good time for another meeting to get everyone on the same page and discuss the next step.
I wonder if the InMAP results (if all emissions are included) naturally match our annual CMAQ results, since we built the InMAP baseline chemistry input data using the full 2018 CMAQ outputs. I'd like to hear from the InMAP developers on this.
If we still need to evaluate how well InMAP replicates the annual CMAQ results, we'd need to develop InMAP emission inputs that are consistent with the 2018 CMAQ emissions inputs, re-run InMAP, and compare the InMAP results with our CMAQ results.
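
One possible way to summarize such a comparison is normalized mean bias over matching grid cells. The sketch below assumes both models' annual averages have already been put on the same grid and flattened to lists in the same cell order; the numbers are made-up placeholders, not model output, and the function name is illustrative.

```python
def normalized_mean_bias(model, reference):
    """Return sum(model - reference) / sum(reference) over paired cells."""
    if len(model) != len(reference):
        raise ValueError("inputs must have the same number of grid cells")
    total_reference = sum(reference)
    if total_reference == 0:
        raise ValueError("reference values sum to zero")
    return sum(m - r for m, r in zip(model, reference)) / total_reference


# Placeholder values for illustration only (e.g. annual PM2.5 in ug/m3).
inmap_cells = [2.0, 4.0, 6.0]
cmaq_cells = [1.0, 2.0, 3.0]

nmb = normalized_mean_bias(inmap_cells, cmaq_cells)
print(f"NMB = {nmb:+.0%}")
```

A value of zero would mean no overall bias relative to the reference; positive values mean the first model is higher on average.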

yuzhou-wang · 2022-11-11T19:00:53Z

I think this comparison is important. Although the new InMAP was built on the CMAQ outputs, the annual predictions can still be slightly different since InMAP is linear.

We can discuss the emission inputs needed by InMAP. @bkoo-git Do you have emission inputs that are aligned to the CMAQ grids (1 km or 4 km)?

bkoo-git · 2022-11-14T07:05:42Z

We discussed the emissions input formats in Issue #2 and determined that SMOKE-formatted files (such as ORL or FF10) would be the easiest way if we want to retain the source info (e.g., SCC). @yuzhou-wang, do you have a sample test case that uses SMOKE-formatted emissions input files?

yuzhou-wang · 2022-11-15T19:43:28Z

I've made a comparison between the national InMAP and the new InMAP, using all NEI 2016 point emissions in the Bay Area. I've attached the comparison slides. I compared the results at both 1-km and 10-km spatial resolutions. The mean value of the total PM2.5 predictions from the new InMAP for the whole domain is around 2-3 times the national InMAP predictions. The biggest difference is for SO2, for which the new InMAP predicts much higher concentrations than the national InMAP. I've also looked at the total SO2 emissions in California, and found that they dropped by about an order of magnitude from 2005 to 2018 (160 ton/year to 20 ton/year). This large change in SO2 emissions may explain the change in the sensitivity of PSO4 to SO2.

The good thing is that, in the 1-km resolution comparisons, the predictions from the new InMAP seem more precise. It also seems to capture the emission sources well.

We plan to make more comparisons of the new InMAP to CMAQ, and of the new InMAP to monitored concentrations, to see how well the new InMAP performs.
comparison_inmap.pptx

yuzhou-wang · 2022-11-15T19:45:39Z

@bkoo-git Could you send me a sample of SMOKE-formatted emissions input files? I'd like to make some test runs using that format. I don't have a sample SMOKE-formatted emissions input handy.

bkoo-git · 2022-11-15T20:22:10Z

Thanks @yuzhou-wang for sharing your comparison results. @stephenreid65 can provide you with sample SMOKE-formatted emissions input files.
I have a question: Can you use different emissions input formats in a single run? For example, can you list a SMOKE-formatted emissions input for a source category and a shapefile for another category in the same TOML?

yuzhou-wang · 2022-11-15T22:47:07Z

@bkoo-git I'm not sure about it. I'll make some test runs including both shapefile and SMOKE-formatted emissions. I think the default InMAP configuration only takes shapefiles. We may need some preprocessing to convert the SMOKE-formatted emissions to shapefiles.

bkoo-git · 2022-11-15T23:24:13Z

I was asking because not all emissions are generated by SMOKE. Sea spray emissions are internally generated by CMAQ at runtime: they can be made available via diagnostic outputs in a netCDF format, which could be converted to a shapefile, but formatting them into a SMOKE inventory file wouldn't be desirable.

bkoo-git · 2022-11-15T23:41:02Z

@yuzhou-wang If we have to convert the SMOKE-formatted emissions to shapefiles, wouldn't we lose source info in the process? Then, what's the purpose of using SMOKE-formatted emissions? I notice that your test case emission inputs don't retain source info like SCC. What's the reason why we want to keep source info like SCC in the emissions input?

pmartien · 2022-11-16T00:44:18Z

Thanks for sharing the comparison slide deck, @yuzhou-wang. That's very interesting. @bkoo-git, are we seeing high PSO4 levels in CMAQ runs?

bkoo-git · 2022-11-16T01:11:17Z

Annual average PSO4 predicted by CMAQ can be high near high SO2-emitting sources, but the max was ~10 μg/m³. Peak PSO4 predicted by InMAP appears to be much higher than what CMAQ predicted even though the InMAP run includes point source emissions only.

stephenreid65 · 2022-11-16T01:25:50Z

@yuzhou-wang, I can provide SMOKE-ready emissions inputs, but they would basically be CSV files with annual emissions by county or facility. I think you would need something gridded, so would our spatial surrogates also be required? We don't have emissions in shapefile format right now.

yuzhou-wang · 2022-11-18T06:44:28Z

@bkoo-git @stephenreid65 Since the comparison is mostly to make sure that the new InMAP provides reasonable predictions, I think we may not need emissions with detailed source info. We can use a combined emission file if one is available. Or, if you have a CMAQ prediction for a single source, I can also run the new InMAP using that single emission file. Do you have any suggestions on that?

bkoo-git · 2022-11-18T23:33:58Z

We have discovered that the VOC mappings in the wrfcmaq2inmap preprocessor weren't updated for the SAPRC07 chemical mechanism, which was used in our Bay Area CMAQ modeling, so many VOC species were dropped from the process. So, we need to re-do the preprocessing. Since we are running a new 2018 base case CMAQ simulation at the moment, I propose preparing the InMAP input data using the new simulation outputs. The new simulation will also generate additional diagnostic outputs for sea spray emissions, which can be used later for evaluating InMAP. Meanwhile, I will work with @yuzhou-wang to fix the VOC mappings in the preprocessor. Let me know if you have any comments, suggestions, or questions.
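
To illustrate how species can be dropped silently when a mapping table lags behind the chemical mechanism, here is a minimal sketch. It does not reproduce the wrfcmaq2inmap code; the `split_by_mapping` helper and all species names are hypothetical placeholders.

```python
def split_by_mapping(species, mapping):
    """Separate species covered by `mapping` from those it would drop."""
    mapped = {s: mapping[s] for s in species if s in mapping}
    dropped = [s for s in species if s not in mapping]
    return mapped, dropped


# Hypothetical mapping written for an older mechanism.
voc_mapping = {"OLD_MECH_VOC_1": "VOC", "OLD_MECH_VOC_2": "VOC"}

# Hypothetical species list from output produced with a newer mechanism.
model_species = ["OLD_MECH_VOC_1", "NEW_MECH_VOC_1", "NEW_MECH_VOC_2"]

mapped, dropped = split_by_mapping(model_species, voc_mapping)
if dropped:
    print("Species with no VOC mapping:", ", ".join(dropped))
```

Reporting the unmapped species before conversion, rather than skipping them quietly, would make this kind of mismatch visible at preprocessing time.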

bujinb · 2022-11-29T19:36:42Z

We have successfully built the Kubernetes setup needed for running InMAP on the cloud in parallel to make a new ISRM, but are still in the process of testing the command. Meanwhile, I have run the new InMAP at several locations and made example test results. Please provide suggestions.
example test results inmap.pptx

pmartien · 2022-12-03T22:53:26Z

Hi @bujinb, @yuzhou-wang, and all. Thanks for the update and for sharing these test runs. Following up on earlier comments on this issue, I think it may be a good time to schedule a meeting to discuss next steps. I'll follow up with an email with some suggested dates.

bujinb · 2023-01-11T22:05:57Z

We have successfully run InMAP and made a small (16 grid cells) ISRM for testing purposes on Google Cloud. Now we are testing bigger runs with more grid cells to get an idea of how long the process will take and how much it will cost.
Before we have the meeting in 2 weeks, do you have any suggestions for the example test results I posted above? We will shift to 2020 census data soon.
Thanks, Bujin

pmartien assigned dholstius, pmartien, yuzhou-wang, bkoo-git and stephenreid65 Aug 29, 2022

yuzhou-wang closed this as completed Aug 29, 2022

bkoo-git reopened this Sep 9, 2022
