This directory contains the code that powers the online FinRegistry-FinnGen Risteys web portal. For the data pipeline, please check the pipeline directory.
The project is based on the Phoenix web framework. Phoenix is written in Elixir, which is also used to pre-process data and import it into the database. Front-end interactions use Phoenix LiveView.
- PostgreSQL database
Some configuration files contain secret credentials and are therefore not present in this public GitHub repository. Risteys developers can get them by requesting access to the GitLab repository risteys/risteys_secrets.
Make sure you are in the directory risteys_elixir, then run the command:
mix phx.server
You can now access Risteys at http://localhost:4000
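As a minimal sketch of the step above, the following shell function refuses to start the server when called from the wrong directory. The function name start_risteys is hypothetical and not part of the repository; only the directory name and the mix phx.server command come from this README.

```shell
# Hypothetical helper, not part of the Risteys repository.
# Starts the Phoenix server only when called from the risteys_elixir directory.
start_risteys() {
    if [ "$(basename "$PWD")" != "risteys_elixir" ]; then
        echo "error: run this from the risteys_elixir directory" >&2
        return 1
    fi
    mix phx.server
}
```

Calling start_risteys from any other directory prints an error and returns a non-zero status instead of invoking mix.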
Once the Risteys web server is running, we still need to load data into it.
If you need to compute the statistics yourself, check the pipeline documentation.
When you have the data, you can proceed to import it into the Risteys database.
The scripts to import data files into the Risteys database are in this current directory (risteys_elixir) and they should be run from that directory, for example:
mix run import_icd9.exs <path-to-input-file>
Import scripts should be run in this order:
- import_icd10.exs, using:
- import_icd9.exs, using:
- import_endpoint_csv.exs, using:
- import_intermediate_counts.exs, using:
  - FG intermediate counts and FG as dataset argument.
- import_ontology.exs, using:
- import_excluded_endpoints.exs, using:
- import_key_figures.exs, using:
  - Risteys FinRegistry key figures of full population and FR as dataset argument
- import_key_figures.exs, using:
  - Risteys FinRegistry key figures of index-persons and FR_index as dataset argument
- import_key_figures.exs, using:
  - Risteys FinnGen key figures and FG as dataset argument
- import_distributions.exs, using:
  - Risteys FinRegistry age distributions and age and FR as arguments
- import_distributions.exs, using:
  - Risteys FinRegistry year distributions and year and FR as arguments
- import_distributions.exs, using:
  - Risteys FinnGen age distributions and age and FG as arguments
- import_distributions.exs, using:
  - Risteys FinnGen year distributions and year and FG as arguments
- import_stats_cumulative_incidence.exs, using:
  - Risteys FinRegistry cumulative incidence and FR as argument
- import_stats_cumulative_incidence.exs, using:
  - Risteys FinnGen cumulative incidence and FG as argument
- import_interactive_mortality_baseline.exs, using:
- import_interactive_mortality_params.exs, using:
- import_mortality_counts.exs, using:
- import_correlation.exs, using:
- import_case_overlaps_fr.exs, using:
- import_coxhr.exs, using:
- import_genetic_correlations.exs, using:
- import_genes.exs, using:
- import_upset_plots.exs, using:
  - Upset Plots and FG as dataset argument
- import_codewas.exs, using:
- Risteys.LabWAS.import_stats, using:
- Risteys.OMOP.import_lab_test_loinc_concepts, using:
- Risteys.LabTestStats.import_dataset_metadata, using:
- Risteys.LabTestStats.import_stats_npeople, using:
- Risteys.LabTestStats.import_stats_median_n_measurements, using:
- Risteys.LabTestStats.import_stats_people_with_two_plus_records, using:
- Risteys.LabTestStats.import_stats_median_years_first_to_last, using:
- Risteys.LabTestStats.import_qc_tables, using:
- Risteys.LabTestStats.import_stats_distribution_lab_values, using:
- Risteys.LabTestStats.import_stats_distribution_year_of_birth, using:
- Risteys.LabTestStats.import_stats_distribution_age_first_measurement, using:
- Risteys.LabTestStats.import_stats_distribution_age_last_measurement, using:
- Risteys.LabTestStats.import_stats_distribution_age_start_of_registry, using:
- Risteys.LabTestStats.import_stats_distribution_duration_first_to_last_measurement, using:
- Risteys.LabTestStats.import_stats_distribution_n_measurement_over_years, using:
- Risteys.LabTestStats.import_stats_distribution_n_measurements_per_person, using:
- Risteys.LabTestStats.import_stats_distribution_value_range_per_person, using:
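The ordering above can be sketched as a small shell wrapper that stops at the first failing import. This is an illustration under assumptions, not the project's own tooling: the README does not state the argument order, so the dataset arguments (FR, FG, age) are assumed to follow the input path, and DATA_DIR, the file names, and the function names are placeholders. The sequence is abbreviated; a real run would include every script in the list.

```shell
# Sketch only: run some of the import scripts in the documented order.
# Assumptions (not stated in this README): dataset arguments follow the
# input path; DATA_DIR and the file names below are placeholders.

DATA_DIR="${DATA_DIR:-/path/to/data}"
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = execute them

run_import() {
    # $1 = import script, remaining arguments = input file and dataset arguments
    if [ "$DRY_RUN" = "1" ]; then
        echo "mix run $*"
    else
        mix run "$@"
    fi
}

run_imports() {
    # Abbreviated: "&&" stops the sequence at the first failing import.
    run_import import_icd10.exs "$DATA_DIR/icd10.csv" &&
    run_import import_icd9.exs "$DATA_DIR/icd9.tsv" &&
    run_import import_key_figures.exs "$DATA_DIR/key_figures_fr.csv" FR &&
    run_import import_key_figures.exs "$DATA_DIR/key_figures_fg.csv" FG &&
    run_import import_distributions.exs "$DATA_DIR/distribution_age_fr.csv" age FR
}
```

With the default DRY_RUN=1, run_imports only prints the mix run commands, which makes it easy to review the order and arguments before setting DRY_RUN=0.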
-
  - name: ICD10_koodistopalvelu_2015-08_26_utf8__XXH64_71956a051f960e51.csv
  - source: Kela Kansallinen koodistopalvelu
-
  - name: finngen_R9_medcode_ref__XXH64_708053b379a04020.tsv
  - source: FinnGen - Library Green
-
  - name: finngen_R12_endpoint_core_noncore_1.0.added_omit2__XXH64_399efaa48ca282b6.csv
  - source: Merging of finngen_R12_endpoint_core_noncore_1.0.xlsx and OMIT column from Endpoints_Controls_FINNGEN_ENDPOINTS_DF12_Final_2023-05-17.xlsx - FinnGen clinical team - GitHub
-
  - name: FINNGEN_ENDPOINTS_DF12_Final_2023-05-17.names_tagged_ordered__XXH64_8264f1235f3f7221.txt
  - source: FinnGen clinical team - GitHub
-
  - name: TAGLIST_DF12__XXH64_2c6dae042382fea9.csv
  - source: FinnGen clinical team - GitHub
- FinnGen endpoint selected core
  - name: finngen_correlation_clusters_DF8__XXH64_0d9f3a10306791f5.csv
  - source: FinnGen clinical team
- FinnGen endpoint intermediate counts
  - name: finngen_endpoints_intermediate_counts_green_export_R12_v1__XXH64_85e199bb39d62337.txt
  - source: FinnGen registry team
-
  - name: finngen_ontology_2022-08-22__XXH64_2a8d4690fa4ae89a.json
  - source: Risteys pipeline
- Risteys corrected endpoint description
  - name: corrected-endpoint-descriptions.airtable-export.2023-10-10__XXH64_b0bd5eb161441ba9.csv
  - source: Risteys Airtable
- FinRegistry excluded endpoints
  - name: excluded_endpoints_FR_Risteys_R12__XXH64_508917188be68559.csv
  - source: Risteys script exclude_endpoints_finregistry.py
- Risteys FinRegistry key figures, all individuals
  - name: key_figures_all_2022-10-10_with_EXALLC_EXMORE__XXH64_920b310de04e72e7.csv
  - source: Risteys pipeline
- Risteys FinRegistry key figures, only index-persons
  - name: key_figures_index_2022-10-10_with_EXALLC_EXMORE__XXH64_c62d6a466a0512a1.csv
  - source: Risteys pipeline
-
  - name: key_figures_all_2023-09-20__XXH64_93a5ba6f09958693.csv
  - source: Risteys pipeline
- Risteys FinRegistry age distributions
  - name: distribution_age_2022-10-10_with_EXALLC_EXMORE__XXH64_edd7be5c03a84317.csv
  - source: Risteys pipeline
- Risteys FinRegistry year distributions
  - name: distribution_year_2022-10-10_with_EXALLC_EXMORE__XXH64_a5ea390cd797b6e3.csv
  - source: Risteys pipeline
- Risteys FinnGen age distributions
  - name: distribution_age_2023-09-20__XXH64_0ab1f53d7d3013f7.csv
  - source: Risteys pipeline
- Risteys FinnGen year distributions
  - name: distribution_year_2023-09-20__XXH64_90ee66ed48dfb5fb.csv
  - source: Risteys pipeline
- Risteys FinRegistry cumulative incidence
  - name: cumulative_incidence_2022-10-10_with_EXALLC_EXMORE__XXH64_c08ae173edf55e72.csv
  - source: Risteys pipeline
- Risteys FinnGen cumulative incidence
  - name: all_cumulative_incidence__r12__2023-09-20__XXH64_f4909d1f5b2565ee.csv
  - source: Risteys pipeline
- Risteys FinRegistry mortality baseline cumulative hazards
  - name: mortality_baseline_cumulative_hazard_2022-10-11_with_EXALLC_EXMORE__XXH64_0088608aa7e021bd.csv
  - source: Risteys pipeline
- Risteys FinRegistry mortality parameters
  - name: mortality_params_2022-10-11_with_EXALLC_EXMORE__XXH64_8f4fdc15e1c061c1.csv
  - source: Risteys pipeline
- Risteys FinRegistry mortality counts
  - name: mortality_counts_2022-10-11_with_EXALLC_EXMORE__XXH64_f7f9581772ec80c6.csv
  - source: Risteys pipeline
- FinnGen phenotypic + genotypic correlations
  - name: corr_pheno-fg-r12.0_geno-fg-r12.0_full-join__2023-11-15__XXH64_86c9c7a833d663e5.csv.zst
  - source: FinnGen correlation pipeline for the phenotypic file, merged with genotypic correlation file from FinnGen analysis team
-
  - name: r12.autoreport.compare.keep_cs.r2_0.8.pval_5e_8.variants__XXH64_854d4aeb62e9664a.csv
  - source: FinnGen analysis team
-
  - name: case_overlap_2022-12-31__XXH64_fb1ca5ba80e4a0ba.csv.zst
  - source: Risteys pipeline
- FinRegistry survival analysis results
  - name: surv_priority_endpoints_2022-12-25__XXH64_b92220411f705ef2.csv
  - source: Risteys pipeline
-
  - name: finngen_R12_FIN.ldsc.summary__XXH64_40dc9830272f8976.tsv
  - source: FinnGen Green library
-
  - name: havana__XXH64_085a38684d85191e.json
  - source: HAVANA through FinnGen
-
  - name: upset_plots_R12__censor_below_5__no_finngenids__2023-10-24__XXH64_84ac03a5853d9be4.tar.zst
  - source: Harri S, FinnGen Phenotype team
-
  - name: codewas_endpoints_r11.filtered_nlog10p.green.2023-10-31.XXH64_3099f7b7f82bd251.jsonl.zst
  - source: Raw CodeWAS data from Javier G-T, FinnGen Phenotype Team; then applied filter_codewas_greendata.py
-
  - name: medical_codes_fg_code_info_v3_fg_codes_info_v3.csv
  - source: Javier G-T, FinnGen Phenotype Team
-
  - name: labwas.XXH64_45ee892713aa9568.jsonl
  - source: Javier G-T, FinnGen Phenotype team; then keeping only green data.
- Kanta lab – List of OMOP Concept IDs
  - name: all_omop_ids.XXH64_3407a565a43950ce.jsonl
  - source: Risteys Kanta lab pipeline
-
  - name: CONCEPT.xsv_fmt.XXH64_34860affdef58dc2.csv
  - source: Downloaded LOINC CONCEPT.csv from OHDSI Athena, CODE (CDM V5) = LOINC, Latest update = 18-Sep-2023, then applied proper CSV formatting (with xsv fmt -d"\t") to fix bad escaping in original CONCEPT.csv.
- OMOP LOINC – Concept relationships
  - name: CONCEPT_RELATIONSHIP.xsv_fmt.XXH64_e04d06ecdd63e0c1.csv
  - source: Downloaded LOINC CONCEPT_RELATIONSHIP.csv from OHDSI Athena, CODE (CDM V5) = LOINC, Latest update = 18-Sep-2023, then applied proper CSV formatting (with xsv fmt -d"\t") to fix bad escaping in original CONCEPT_RELATIONSHIP.csv.
-
  - name: n_people_alive_in_kanta_time.XXH64_fc8010f7a8ad4a08.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Stats N People by sex
  - name: count_by_sex.XXH64_55ab7400905bd063.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Median N measurements
  - name: median_n_measurements.XXH64_11badf7e9f82495c.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Percent people with 2+ records
  - name: percent_people_two_or_more_records.XXH64_972d77909204a603.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Median years from first to last record
  - name: median_duration_first_to_last_record.XXH64_39b36dabb238daea.jsonl
  - source: Risteys Kanta lab pipeline
-
  - name: qc_tables.XXH64_4e688fbd2d242617.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – QC tables distribution values
  - name: qc_tables__distribution_measurement_value__stats.XXH64_effecc3338be8288.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – QC tables distribution binning
  - name: qc_tables__distribution_measurement_value__bins_definitions.XXH64_e53042b0179eab9f.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – QC tables test outcomes
  - name: qc_tables__test_outcome_counts.XXH64_c988a029dbd166a8.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution lab values / continuous / stats
  - name: measurement_continuous_value_harmonized_distribution__stats.XXH64_ad2abf42d92ae80d.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution lab values / continuous / binning
  - name: measurement_continuous_value_harmonized_distribution__bins_definitions.XXH64_dc446298aba8b55c.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution lab values / discrete
  - name: measurement_discrete_value_harmonized_distribution__stats.XXH64_87049cbce30f44f8.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution year of birth / stats
  - name: year_of_birth_distribution__stats.XXH64_ec4595969b4faf5c.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution year of birth / binning
  - name: year_of_birth_distribution__bins_definitions.XXH64_4aca92fb74464f61.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution age at first measurement
  - name: age_first_meas_distribution__stats.XXH64_f566d8fa7f7f211b.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution age at last measurement
  - name: age_last_meas_distribution__stats.XXH64_c94a7b62616cf8aa.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution age at registry start
  - name: age_registry_starts_distribution__stats.XXH64_vc8835448eff8ec36.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution age / binning
  - name: age_distributions__bins_definitions.XXH64_263a39b76cfaf733.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution duration from first to last record / stats
  - name: duration_first_to_last_distribution__stats.XXH64_6003a4f0e15f3431.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution duration from first to last record / binning
  - name: duration_first_to_last_distribution__bins_definitions.XXH64_96485130b0e75ae1.json
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution N measurements over the years / stats
  - name: n_measurements_over_years_distribution__stats.XXH64_dcc30754d0572a7c.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution N measurements over the years / binning
  - name: n_measurements_over_years_distribution__bins_definitions.XXH64_71c05267fbc20fca.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution N measurements by person / stats
  - name: n_measurements_per_person_distribution__stats.XXH64_f09eff1ce99c611a.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution N measurements by person / binning
  - name: n_records_per_person_distribution__bins_definitions.XXH64_83018a99366a1bfa.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution value range / stats
  - name: value_range_distributions__stats.XXH64_05ab452633715dbe.jsonl
  - source: Risteys Kanta lab pipeline
- Kanta lab – Distribution value range / binning
  - name: value_range_distributions__bins_definitions.XXH64_7ce4b5f6945e9831.jsonl
  - source: Risteys Kanta lab pipeline
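Most file names in the list above appear to embed an XXH64 digest after an XXH64_ marker. Assuming that digest is the XXH64 hash of the file content (the README does not say so explicitly), a file can be checked before import with a sketch like the one below. The function names are hypothetical, and the xxhsum command-line tool is assumed to be installed.

```shell
# Extract the 16-hex-digit digest that follows "XXH64_" in a file name.
# Works for both the "__XXH64_<digest>" and ".XXH64_<digest>" spellings.
# Prints nothing when the name carries no well-formed digest.
embedded_xxh64() {
    basename "$1" | sed -n 's/.*XXH64_\([0-9a-f]\{16\}\).*/\1/p'
}

# Compare the embedded digest with the one computed by xxhsum.
# Returns 0 on match, 1 on mismatch, 2 when the name has no digest.
verify_xxh64() {
    expected=$(embedded_xxh64 "$1")
    if [ -z "$expected" ]; then
        echo "no XXH64 digest in file name: $1" >&2
        return 2
    fi
    actual=$(xxhsum -H64 "$1" | cut -d' ' -f1)
    [ "$expected" = "$actual" ]
}
```

For example, verify_xxh64 labwas.XXH64_45ee892713aa9568.jsonl succeeds only if the computed digest equals 45ee892713aa9568. Names without a 16-digit hexadecimal digest are reported rather than silently accepted.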