Note: The checklist below uses the database name tracking-v9. Replace v9 with the version of the Tracking dataset you are creating. Version numbers increment by one with each release.
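The version bump described in the note is mechanical, so it can be sketched in a few lines of Python. The helper name `next_database_name` is hypothetical and is not part of `ghst`; it assumes only that database names follow the `tracking-vN` pattern used in this checklist.

```python
import re


def next_database_name(current: str) -> str:
    """Return the database name for the next release.

    Hypothetical helper (not part of ghst). Assumes names look like
    "tracking-vN", where N is a whole number that increments by one.
    """
    match = re.fullmatch(r"(tracking-v)(\d+)", current)
    if match is None:
        raise ValueError(f"unexpected database name: {current!r}")
    prefix, number = match.groups()
    return f"{prefix}{int(number) + 1}"


print(next_database_name("tracking-v8"))  # prints tracking-v9
```

The strict `fullmatch` is deliberate: a name that does not fit the pattern raises an error instead of silently producing a wrong database name.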
-
Complete all steps EXCEPT the last step in the main README.md of this repository.
-
Open the .env file created by running setup.py, as described in the main README.md. Change the value of the environment variable named database to match the version of the dataset you are creating, e.g., tracking-v9. Do the same for the variable database in the file settings.ini.
-
Run
pipenv run python -m ghst database clone-from-cloud -u [YOUR_LOCAL_POSTGRESQL_USERNAME] -d tracking-v9 -dc ghs-tracking-wip
-
Run the following command to start all automated ingest modules. Note: this step may take several hours to complete.
pipenv run python -m ghst ingest auto --all
-
Run the following SQL statement in the local database (DBeaver) and identify any issues, such as missing stakeholders, that must be resolved. Fix them according to the instructions in the issue log, mark them as resolved, and then repeat the ingest using the command in the previous step.
select * from issues where data_version = '9' and not resolved_in_prior_version;
-
Run the following command to ingest data from the manual data entry spreadsheet.
pipenv run python -m ghst ingest main
-
If the terminal reports errors about missing stakeholders, resolve them using the same SQL statement as above, then repeat the ingest.
-
If there are errors about missing or duplicate stakeholders, review the stakeholders in the cloud and merge or fix them as needed, as described in the SOP here: https://docs.google.com/document/d/1lmWGn7i5gMh3SIbjb_K9eoYQsXU0XCrvb86IFcQPMN0/edit#heading=h.joe4wv5wq2n
-
Once the automated ingest modules and the main ingest have completed, review Ebola PHEICs in the cloud here: https://airtable.com/appN2NJHOK17QtLJ4/tblrDVlxW90Xzmksn/viwXhHrn3MlLLrVep?blocks=hide. See the SOP for instructions on adjudicating Ebola PHEICs: https://docs.google.com/document/d/1lmWGn7i5gMh3SIbjb_K9eoYQsXU0XCrvb86IFcQPMN0/edit#heading=h.7mwbai7y832q
-
Run the following SQL statement in the local database (DBeaver) to locate a known error in which a project value was incorrectly reported as $200B instead of $200M. Edit the value manually and save the change in the local database.
select f.* from projects p join flows f on f.project_id = p.id where p."name" = 'COVID-19 Response Support Program Loan';
-
Once all high-priority issues have been resolved, run
pipenv run python -m ghst database refresh-all
-
Run the following command to sync large projects in need of review to Airtable.
pipenv run python -m ghst review large-projects sync-to-cloud
-
Review all unreviewed large projects in cloud here: https://airtable.com/appN2NJHOK17QtLJ4/tblC2lJnAjmKYyq8o/viwRCSuHoZi2losQt
-
After completing review of large projects, run the following to apply revisions
pipenv run python -m ghst review large-projects apply-revisions
-
Run
pipenv run python -m ghst database refresh-all
-
Run the following command (it may take a while), and then adjudicate potential conflicts in Airtable here: https://airtable.com/appN2NJHOK17QtLJ4/tbl0f65klqmKi6tVP/viw4r2wgr2fALzKSt
pipenv run python -m ghst review conflicts check --all
-
After completing the revisions, run the following to apply them
pipenv run python -m ghst review conflicts apply-revisions
-
Continue with the checklist "Publishing data updates".
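Several steps in this checklist contain the version number: the database name, the `-d` argument of `clone-from-cloud`, and the `data_version` filter in the issues query. The sketch below renders those strings for a given version so they can be copied without hand-editing. The function `version_specific_steps` is hypothetical and not part of `ghst`; the command and SQL text are copied from this checklist, nothing is executed, and it is an assumption that `data_version` tracks the same number as the database name.

```python
import shlex


def version_specific_steps(version: int, pg_user: str) -> dict:
    """Render the version-dependent strings used in this checklist.

    Hypothetical helper (not part of ghst). It only builds strings;
    it does not run any command or query.
    """
    version = int(version)  # keep the SQL literal strictly numeric
    database = f"tracking-v{version}"
    clone_args = [
        "pipenv", "run", "python", "-m", "ghst",
        "database", "clone-from-cloud",
        "-u", pg_user,
        "-d", database,
        "-dc", "ghs-tracking-wip",
    ]
    # Assumption: data_version uses the same number as the database name.
    issues_sql = (
        "select * from issues "
        f"where data_version = '{version}' and not resolved_in_prior_version;"
    )
    return {
        "database": database,
        "clone_command": shlex.join(clone_args),
        "issues_sql": issues_sql,
    }


for name, text in version_specific_steps(9, "postgres").items():
    print(f"{name}: {text}")
```

`shlex.join` quotes the username if it contains spaces or shell metacharacters, so the printed command is safe to paste into a terminal.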