Milan Malfait edited this page Oct 17, 2024 · 2 revisions

Dummy data for development

The dummy data can be found in inst/dev_data. They were generated from the synthetic dataset synthea-allergies-10k, with some dummy records added for the MEASUREMENT and OBSERVATION tables so that the 'calypso-summary-stats' table contains some records.

The CSV files in inst/dev_data can be regenerated by running:

source(here::here("scripts/create_dev_data.R"))

The files are under version control, so this script only needs to be re-run when the dummy data need to be updated. Pulling the latest changes from the repository already gives you the up-to-date files, so there is no need to regenerate them to run the app locally.

Test data to test production

To mimic how the app runs in production, we also have a larger test dataset, stored as parquet files in data/test_data/. These files can be reproduced by running the following scripts in the order shown:

source(here::here("scripts/01_setup_test_db.R"))
source(here::here("scripts/02_insert_dummy_tables.R"))
source(here::here("scripts/03_analyse_omop_cdm.R"))

Again, the test data are part of the repository, so they only need to be regenerated when the data need to be updated.

Production data

The data used to run the app in production should go into data/prod_data/. We require the following files to be present:

data/prod_data
├── omopcat_concepts.parquet
├── omopcat_monthly_counts.parquet
└── omopcat_summary_stats.parquet

The scripts/create_prod_data.R script can be used to generate these files, given a configured database connection with an OMOP extract.
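
Once the files have been generated or copied into place, a quick check can confirm that all three required files are present. The following is a minimal sketch in base R, not part of the repository: the helper name `missing_prod_files()` is hypothetical, and it assumes the working directory is the repository root.

```r
# Hypothetical helper (not part of the repository): report which of the
# required production files are missing from a given directory.
required_files <- c(
  "omopcat_concepts.parquet",
  "omopcat_monthly_counts.parquet",
  "omopcat_summary_stats.parquet"
)

missing_prod_files <- function(dir = "data/prod_data") {
  # Build the full path for each required file and keep the names
  # of those that do not exist on disk.
  paths <- file.path(dir, required_files)
  required_files[!file.exists(paths)]
}

# Assumes the working directory is the repository root.
missing <- missing_prod_files()
if (length(missing) > 0) {
  message("Missing production files: ", paste(missing, collapse = ", "))
} else {
  message("All production files are present.")
}
```

The check only looks at file names. It does not validate the contents or schema of the parquet files.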
