GHS Tracking data update brief checklist

Note: In the checklist below, the database name tracking-v9 is used as an example. Replace v9 with the version of the Tracking dataset you are creating; version numbers increment by one with each release.

  1. Complete all steps EXCEPT the last step in the main README.md of this repository

  2. Open the .env file created by running setup.py as described in the main README.md. Change the value of the environment variable named database to match the version of the dataset you are creating, e.g., tracking-v9. Make the same change to the database variable in the file settings.ini.
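For reference, the relevant line in each file might look like the following (a sketch only; the exact surrounding contents of your .env and settings.ini will differ):

```ini
# In .env (dotenv format):
database=tracking-v9

# In settings.ini, under its existing section:
database = tracking-v9
```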

  3. Run

    pipenv run python -m ghst database clone-from-cloud -u [YOUR_LOCAL_POSTGRESQL_USERNAME] -d tracking-v9 -dc ghs-tracking-wip
    
  4. Run the following command to start all automated ingest modules. Note: this step may take several hours to complete.

    pipenv run python -m ghst ingest auto --all
    
  5. Run the following SQL statement against the local database (e.g., in DBeaver) and identify any issues, such as missing stakeholders, that must be resolved. Fix them according to the instructions in the issue log, mark them as resolved, and then repeat the ingest using the command in the previous step.

    select * from issues where data_version = '9' and not resolved_in_prior_version;
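To illustrate what this filter selects, here is a minimal sketch using an in-memory SQLite database. The real Tracking database is PostgreSQL, and the column names used here beyond those in the query above (id, description) are assumptions for illustration only:

```python
import sqlite3

# Build a toy "issues" table mirroring the columns used by the query.
con = sqlite3.connect(":memory:")
con.execute(
    "create table issues (id integer primary key, data_version text,"
    " resolved_in_prior_version integer, description text)"
)
con.executemany(
    "insert into issues (data_version, resolved_in_prior_version, description)"
    " values (?, ?, ?)",
    [
        ("9", 0, "Missing stakeholder: Example Org"),  # needs attention
        ("9", 1, "Fixed in a prior version"),          # already resolved
        ("8", 0, "Old-version issue"),                 # wrong data version
    ],
)

# Same filter as the checklist query: current data version, not yet resolved.
open_issues = con.execute(
    "select description from issues"
    " where data_version = '9' and not resolved_in_prior_version"
).fetchall()
print(open_issues)  # only the unresolved v9 issue remains
```

Only the rows this filter returns need action before repeating the ingest.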
    
  6. Run the following to ingest data from the manual data entry spreadsheet

    pipenv run python -m ghst ingest main
    
  7. If the terminal reports errors about missing stakeholders, fix them using the same SQL statement as above, then repeat the ingest once complete.

  8. If there are errors about missing or duplicate stakeholders, review the stakeholders in the cloud and merge or fix them as needed, as described in the SOP here: https://docs.google.com/document/d/1lmWGn7i5gMh3SIbjb_K9eoYQsXU0XCrvb86IFcQPMN0/edit#heading=h.joe4wv5wq2n

  9. Once the automated ingest modules and the main ingest have completed, review Ebola PHEICs in the cloud here: https://airtable.com/appN2NJHOK17QtLJ4/tblrDVlxW90Xzmksn/viwXhHrn3MlLLrVep?blocks=hide. Reference the SOP for instructions on adjudicating Ebola PHEICs: https://docs.google.com/document/d/1lmWGn7i5gMh3SIbjb_K9eoYQsXU0XCrvb86IFcQPMN0/edit#heading=h.7mwbai7y832q

  10. Run the following SQL statement against the local database (e.g., in DBeaver) to locate a known error where a project's value was incorrectly reported as $200B instead of $200M. Edit the value and save the change in the local database.

    select f.* from projects p join flows f on f.project_id = p.id where p."name" = 'COVID-19 Response Support Program Loan';
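As a sketch of the correction this step performs, the snippet below mimics it with an in-memory SQLite database. In practice you run the select above in DBeaver and edit the row by hand; the minimal schema here (projects.id and name, flows.project_id and value) is an assumption for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table projects (id integer primary key, name text)")
con.execute(
    "create table flows (id integer primary key, project_id integer, value real)"
)
con.execute(
    "insert into projects (id, name)"
    " values (1, 'COVID-19 Response Support Program Loan')"
)
con.execute("insert into flows (project_id, value) values (1, 200e9)")  # bad $200B

# Correct the flow value from $200B to $200M for the affected project.
con.execute(
    "update flows set value = 200e6 where project_id ="
    " (select id from projects"
    "  where name = 'COVID-19 Response Support Program Loan')"
)
fixed = con.execute(
    "select f.value from projects p join flows f on f.project_id = p.id"
    " where p.name = 'COVID-19 Response Support Program Loan'"
).fetchone()[0]
print(fixed)  # 200000000.0, i.e., $200M
```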
    
  11. Once all high priority issues have been resolved, run

    pipenv run python -m ghst database refresh-all
    
  12. Run the following to sync large projects in need of review to Airtable

    pipenv run python -m ghst review large-projects sync-to-cloud
    
  13. Review all unreviewed large projects in the cloud here: https://airtable.com/appN2NJHOK17QtLJ4/tblC2lJnAjmKYyq8o/viwRCSuHoZi2losQt

  14. After completing the review of large projects, run the following to apply revisions

    pipenv run python -m ghst review large-projects apply-revisions
    
  15. Run

    pipenv run python -m ghst database refresh-all
    
  16. Run

    pipenv run python -m ghst review conflicts check --all
    

    (this may take a while), then adjudicate potential conflicts in Airtable here: https://airtable.com/appN2NJHOK17QtLJ4/tbl0f65klqmKi6tVP/viw4r2wgr2fALzKSt

  17. After completing revisions, run the following to apply them

    pipenv run python -m ghst review conflicts apply-revisions
    
  18. Continue with the Publishing data updates checklist.