An API server for the Georgetown Infectious Disease Atlas (GIDA) Global Health Security Tracking site (https://tracking.ghscosting.org/)

cghss-data-lab/ghs-tracking-api

ghs-tracking-api

An API server for the Georgetown University Center for Global Health Science and Security (GHSS) International Disease and Events Analysis (IDEA) Global Health Security (GHS) Tracking site (https://tracking.ghscosting.org/)

A list of all relevant web resources for this project follows.

Code organization

A description of the most important modules and packages in ghs-tracking-api follows.

Key modules and packages

  • api. Package containing main API functionality, including defining the routing, API documentation, and functions that retrieve data from the database and return it as API responses.
  • cli. Package containing the command line interface (CLI) for data management and ingest. Run python -m ghst --help from within your virtual environment to see help for commands, or follow the GHS Tracking data update brief checklist to make data updates.
  • D-Portal-Tracking. A submodule used in the data ingest process to obtain IATI Registry data. It is only created once the command python -m ghst ingest auto -m iati has been run.
  • db and db_metric. Packages that handle getting a connection to the main COVID AMP database (containing policy data) and the COVID AMP metrics database (containing COVID-19 caseload/death data). Each contains a models.py module that defines the entities and data fields in the databases.
  • ingest. Data ingest code, including packages for ingesting data from certain sources, processing data, tagging data, and more. Most data management can be done through the CLI, so you normally do not need to run code in this package directly.
  • main.py. The main entrypoint module of the application (see checklist below for how to start it).
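
For orientation: the start command used in Getting started, uvicorn main:app, imports the module main and serves its attribute app as an ASGI application. The sketch below is a hypothetical stand-in written against the bare ASGI interface, not the project's actual main.py, which presumably builds app with a web framework and wires in the routes from the api package. The /ping path, its response, and the call_app helper are illustrative assumptions.

```python
import asyncio
import json


async def app(scope, receive, send):
    """Hypothetical bare ASGI callable; the server calls it once per connection."""
    if scope["type"] != "http":
        return  # this sketch only answers plain HTTP requests
    found = scope["path"] == "/ping"  # "/ping" is an illustrative path, not a real route
    payload = {"status": "ok"} if found else {"detail": "Not Found"}
    await send({
        "type": "http.response.start",
        "status": 200 if found else 404,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": json.dumps(payload).encode()})


def call_app(path):
    """Drive the app once without a server; return (status, parsed JSON body)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    asyncio.run(app(scope, receive, send))
    return sent[0]["status"], json.loads(sent[1]["body"])


if __name__ == "__main__":
    print(call_app("/ping"))
```

Saved as a module named main, a callable like this is what uvicorn main:app would look up. Calling it directly through call_app is a quick way to check a handler without starting a server.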

Additional modules and packages

  • airtableio. Input/Output boilerplate code for interfacing with Airtable.
  • dataclient. Implemented by package airtableio to support some functionality.
  • docker. Currently unused. WIP code to Dockerize the application.
  • filewalker. Utility for iterating over files within a directory with more customized functionality than the path package. Only used by the ingest/assistance/acta package.
  • googlesheetio. Input/Output boilerplate code for interfacing with Google Sheets.
  • issuesio. Input/Output for writing issues to the issues PostgreSQL table, which can be used to help adjudicate issues arising during data ingest.
  • logs. Directory where all log files are written (not tracked in version control). A full log is written, as well as a warnings-only log that includes any log message with a level more severe than INFO.
  • research. Primarily ad hoc data analysis packages and modules that are not required for data webservices or data ingest functionality. These are not guaranteed to work as-is and may not be comprehensively documented.
  • services. Package implementing some common data service operations (CRUD) for entities in the GHS Tracking database. Note: some functionality in services/stakeholdersvc is currently duplicated elsewhere in the code base. Use services/stakeholdersvc; the duplicate packages should be phased out over time.
  • sql. SQL scripts containing queries or data manipulation language that are executed by various Python packages, mostly for data ingest purposes. Not all of these SQL files are currently used, and some contain data definition language for the database that may not be up to date with the current data structure.
  • topprojio. Input/Output code for interfacing with the large projects QA/QC matrix. This matrix is manually reviewed during each data update cycle to ensure the highest-valued projects have been QA/QCd.
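
The two-file arrangement described for the logs directory (a full log plus a warnings-only log) can be sketched with the standard logging module. This is a minimal illustration under assumptions, not the project's actual configuration: the file names full.log and warnings.log, the logger name, and the message format are all hypothetical.

```python
import logging
import os
import tempfile


def build_logger(log_dir, name="ghst-sketch"):
    """Return a logger that writes a full log and a warnings-only log to log_dir.

    The file names and the logger name are hypothetical placeholders.
    """
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let every record reach the handlers
    logger.propagate = False
    for old in list(logger.handlers):  # avoid duplicate handlers on repeated calls
        old.close()
        logger.removeHandler(old)

    fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    targets = (("full.log", logging.DEBUG), ("warnings.log", logging.WARNING))
    for filename, level in targets:
        handler = logging.FileHandler(os.path.join(log_dir, filename))
        handler.setLevel(level)  # WARNING and above means more severe than INFO
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    log = build_logger(demo_dir)
    log.info("ingest started")            # written to full.log only
    log.warning("record missing a value")  # written to both files
    print("logs written to", demo_dir)
```

Filtering at the handler level rather than the logger level is what lets one logger feed both files: the logger accepts everything, and each handler decides what it keeps.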

Getting started

The following instructions assume you are running macOS.

  1. Open a Terminal window. If you are unsure how, follow instructions here.

  2. If you haven't installed pipenv already, do so by following instructions here.

  3. Check whether pyenv is installed by running the following command. If it is, the output describes the installed version of pyenv.

    pyenv --version
    
  4. If pyenv was found, skip to the next step. Otherwise, install it with Homebrew by doing

    brew install pyenv
    
  5. To install the necessary Python version, do

    pyenv install 3.7.13
    
  6. To set the local Python version, do

    pyenv local 3.7.13
    
  7. Install or upgrade your installation of pipenv. If you need to install it, follow the steps at

    If you installed it with Homebrew, do

    brew upgrade pipenv
    

    If you installed it with pip, do

    pip install --upgrade pipenv
    
  8. To install Python packages, do

    pyenv exec pipenv install --python=3.7.13 --dev
    
  9. To run the GHS Tracking data management setup program, do

    pipenv run python setup.py
    

    The program will ask for an Airtable API key (which you can find at https://airtable.com/account) and your local PostgreSQL connection info.

    You only need to provide the Airtable API key if you will be doing data ingest. You can edit this info later in the file .env.

  10. Clone a copy of the Tracking database locally with the command

    pipenv run python -m ghst database clone-from-cloud -u [YOUR_LOCAL_POSTGRESQL_USERNAME] -d tracking -dc ghs-tracking-green
    
  11. Confirm that, in .env, the database name is tracking and the username is your local PostgreSQL username.

  12. To start the API server locally, do

    pipenv run uvicorn main:app --reload
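
The check in step 11 can also be scripted. The sketch below parses simple KEY=VALUE lines and reports any setting that differs from what you expect. It is a minimal illustration under assumptions: this README does not list the variable names that setup.py writes to .env, so the keys database and username in the example are hypothetical and should be replaced with the names found in your own file.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")  # drop optional quotes
    return values


def mismatches(env_values, expected):
    """Return {key: (expected, actual)} for each key that differs or is missing."""
    return {
        key: (want, env_values.get(key))
        for key, want in expected.items()
        if env_values.get(key) != want
    }


if __name__ == "__main__":
    # Hypothetical key names; substitute the ones that appear in your .env file.
    sample = "database=tracking\nusername=alice\n"
    problems = mismatches(
        parse_env(sample),
        {"database": "tracking", "username": "alice"},
    )
    print("OK" if not problems else problems)
```

To check a real file, read it with open(".env").read() and pass the text to parse_env in place of the sample string.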
    
