Risteys – Web portal

This directory contains the code to power the online FinRegistry-FinnGen Risteys web portal. For the data pipeline, please check the pipeline directory.

The project is based on the Phoenix web framework. Phoenix is written in Elixir, which is also used to pre-process data and import it into the database. Front-end interactions use Phoenix LiveView.

Setting up a development environment

Requirements

Configuration

Some configuration files contain secret credentials and are therefore not present in this public GitHub repository. Risteys developers can get them by requesting access to the GitLab repository risteys/risteys_secrets.

Running

Make sure you are in the directory risteys_elixir, then run the command:

mix phx.server

You can now access Risteys at http://localhost:4000

Importing data

Once the Risteys web server is running, the database still needs to be populated with data.

If you need to compute the statistics yourself, check the pipeline documentation.

When you have the data, you can import it into the Risteys database. The import scripts are in this directory (risteys_elixir) and should be run from it, for example:

mix run import_icd9.exs <path-to-input-file>

Import scripts should be run in this order:

  1. import_icd10.exs, using:

  2. import_icd9.exs, using:

  3. import_endpoint_csv.exs, using:

  4. import_intermediate_counts.exs, using:

  5. import_ontology.exs, using:

  6. import_excluded_endpoints.exs, using:

  7. import_key_figures.exs, using:

  8. import_key_figures.exs, using:

  9. import_key_figures.exs, using:

  10. import_distributions.exs, using:

  11. import_distributions.exs, using:

  12. import_distributions.exs, using:

  13. import_distributions.exs, using:

  14. import_stats_cumulative_incidence.exs, using:

  15. import_stats_cumulative_incidence.exs, using:

  16. import_interactive_mortality_baseline.exs, using:

  17. import_interactive_mortality_params.exs, using:

  18. import_mortality_counts.exs, using:

  19. import_correlation.exs, using:

  20. import_case_overlaps_fr.exs, using:

  21. import_coxhr.exs, using:

  22. import_genetic_correlations.exs, using:

  23. import_genes.exs, using:

  24. import_upset_plots.exs, using:

  25. import_codewas.exs, using:

  26. Risteys.LabWAS.import_stats, using:

  27. Risteys.OMOP.import_lab_test_loinc_concepts, using:

  28. Risteys.LabTestStats.import_dataset_metadata, using:

  29. Risteys.LabTestStats.import_stats_npeople, using:

  30. Risteys.LabTestStats.import_stats_median_n_measurements, using:

  31. Risteys.LabTestStats.import_stats_people_with_two_plus_records, using:

  32. Risteys.LabTestStats.import_stats_median_years_first_to_last, using:

  33. Risteys.LabTestStats.import_qc_tables, using:

  34. Risteys.LabTestStats.import_stats_distribution_lab_values, using:

  35. Risteys.LabTestStats.import_stats_distribution_year_of_birth, using:

  36. Risteys.LabTestStats.import_stats_distribution_age_first_measurement, using:

  37. Risteys.LabTestStats.import_stats_distribution_age_last_measurement, using:

  38. Risteys.LabTestStats.import_stats_distribution_age_start_of_registry, using:

  39. Risteys.LabTestStats.import_stats_distribution_duration_first_to_last_measurement, using:

  40. Risteys.LabTestStats.import_stats_distribution_n_measurement_over_years, using:

  41. Risteys.LabTestStats.import_stats_distribution_n_measurements_per_person, using:

  42. Risteys.LabTestStats.import_stats_distribution_value_range_per_person, using:
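Because the order matters, it can help to drive the `.exs` steps from a small manifest instead of typing each command by hand. The following is a minimal sketch, not part of Risteys: the function name `run_imports`, the `DRY_RUN` variable, and the tab-separated manifest format are all assumptions made for illustration. It covers only the steps that are `.exs` scripts run with `mix run`; the steps listed as `Risteys.*` functions are left out, since how they are invoked is not shown here.

```shell
#!/bin/sh
# Sketch: run import scripts in manifest order, stopping at the first failure.
# Manifest format (hypothetical, one step per line):
#   <script.exs><TAB><input file path(s)>
# Blank lines and lines starting with '#' are ignored.

run_imports() {
    manifest="$1"
    tab="$(printf '\t')"
    # Split each line at the first tab: script name, then its arguments.
    while IFS="$tab" read -r script args || [ -n "$script" ]; do
        case "$script" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        echo "mix run $script $args"
        if [ "${DRY_RUN:-0}" != "1" ]; then
            # $args is left unquoted on purpose so that several paths can be
            # passed; paths containing spaces or glob characters are not supported.
            # stdin is redirected so mix cannot consume the rest of the manifest.
            mix run "$script" $args < /dev/null || return 1
        fi
    done < "$manifest"
}
```

With a manifest written in the order given above, `DRY_RUN=1 run_imports imports.tsv` prints the commands without running them, and `run_imports imports.tsv` (from the risteys_elixir directory) runs them one after another. The file name `imports.tsv` is a placeholder.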

File list

  • ICD-10

  • FinnGen medcode

    • name: finngen_R9_medcode_ref__XXH64_708053b379a04020.tsv

    • source: FinnGen - Library Green

  • FinnGen endpoint definitions

    • name: finngen_R12_endpoint_core_noncore_1.0.added_omit2__XXH64_399efaa48ca282b6.csv

    • source: Merging of finngen_R12_endpoint_core_noncore_1.0.xlsx and OMIT column from Endpoints_Controls_FINNGEN_ENDPOINTS_DF12_Final_2023-05-17.xlsx - FinnGen clinical team - GitHub

  • FinnGen endpoint main tag

    • name: FINNGEN_ENDPOINTS_DF12_Final_2023-05-17.names_tagged_ordered__XXH64_8264f1235f3f7221.txt

    • source: FinnGen clinical team - GitHub

  • FinnGen endpoint tag list

    • name: TAGLIST_DF12__XXH64_2c6dae042382fea9.csv

    • source: FinnGen clinical team - GitHub

  • FinnGen endpoint selected core

    • name: finngen_correlation_clusters_DF8__XXH64_0d9f3a10306791f5.csv

    • source: FinnGen clinical team

  • FinnGen endpoint intermediate counts

    • name: finngen_endpoints_intermediate_counts_green_export_R12_v1__XXH64_85e199bb39d62337.txt

    • source: FinnGen registry team

  • Risteys ontology

    • name: finngen_ontology_2022-08-22__XXH64_2a8d4690fa4ae89a.json

    • source: Risteys pipeline

  • Risteys corrected endpoint description

    • name: corrected-endpoint-descriptions.airtable-export.2023-10-10__XXH64_b0bd5eb161441ba9.csv

    • source: Risteys Airtable

  • FinRegistry excluded endpoints

    • name: excluded_endpoints_FR_Risteys_R12__XXH64_508917188be68559.csv

    • source: Risteys script exclude_endpoints_finregistry.py

  • Risteys FinRegistry key figures, all individuals

    • name: key_figures_all_2022-10-10_with_EXALLC_EXMORE__XXH64_920b310de04e72e7.csv

    • source: Risteys pipeline

  • Risteys FinRegistry key figures, only index-persons

    • name: key_figures_index_2022-10-10_with_EXALLC_EXMORE__XXH64_c62d6a466a0512a1.csv

    • source: Risteys pipeline

  • Risteys FinnGen key figures

    • name: key_figures_all_2023-09-20__XXH64_93a5ba6f09958693.csv

    • source: Risteys pipeline

  • Risteys FinRegistry age distributions

    • name: distribution_age_2022-10-10_with_EXALLC_EXMORE__XXH64_edd7be5c03a84317.csv

    • source: Risteys pipeline

  • Risteys FinRegistry year distributions

    • name: distribution_year_2022-10-10_with_EXALLC_EXMORE__XXH64_a5ea390cd797b6e3.csv

    • source: Risteys pipeline

  • Risteys FinnGen age distributions

    • name: distribution_age_2023-09-20__XXH64_0ab1f53d7d3013f7.csv

    • source: Risteys pipeline

  • Risteys FinnGen year distributions

    • name: distribution_year_2023-09-20__XXH64_90ee66ed48dfb5fb.csv

    • source: Risteys pipeline

  • Risteys FinRegistry cumulative incidence

    • name: cumulative_incidence_2022-10-10_with_EXALLC_EXMORE__XXH64_c08ae173edf55e72.csv

    • source: Risteys pipeline

  • Risteys FinnGen cumulative incidence

    • name: all_cumulative_incidence__r12__2023-09-20__XXH64_f4909d1f5b2565ee.csv

    • source: Risteys pipeline

  • Risteys FinRegistry mortality baseline cumulative hazards

    • name: mortality_baseline_cumulative_hazard_2022-10-11_with_EXALLC_EXMORE__XXH64_0088608aa7e021bd.csv

    • source: Risteys pipeline

  • Risteys FinRegistry mortality parameters

    • name: mortality_params_2022-10-11_with_EXALLC_EXMORE__XXH64_8f4fdc15e1c061c1.csv

    • source: Risteys pipeline

  • Risteys FinRegistry mortality counts

    • name: mortality_counts_2022-10-11_with_EXALLC_EXMORE__XXH64_f7f9581772ec80c6.csv

    • source: Risteys pipeline

  • FinnGen phenotypic + genotypic correlations

    • name: corr_pheno-fg-r12.0_geno-fg-r12.0_full-join__2023-11-15__XXH64_86c9c7a833d663e5.csv.zst

    • source: FinnGen correlation pipeline for the phenotypic file, merged with genotypic correlation file from FinnGen analysis team

  • FinnGen coloc variants

    • name: r12.autoreport.compare.keep_cs.r2_0.8.pval_5e_8.variants__XXH64_854d4aeb62e9664a.csv

    • source: FinnGen analysis team

  • FinRegistry case overlaps

    • name: case_overlap_2022-12-31__XXH64_fb1ca5ba80e4a0ba.csv.zst

    • source: Risteys pipeline

  • FinRegistry survival analysis results

    • name: surv_priority_endpoints_2022-12-25__XXH64_b92220411f705ef2.csv

    • source: Risteys pipeline

  • FinnGen genetic correlations

    • name: finngen_R12_FIN.ldsc.summary__XXH64_40dc9830272f8976.tsv

    • source: FinnGen Green library

  • HAVANA gene list

    • name: havana__XXH64_085a38684d85191e.json

    • source: HAVANA through FinnGen

  • Upset Plots

    • name: upset_plots_R12__censor_below_5__no_finngenids__2023-10-24__XXH64_84ac03a5853d9be4.tar.zst

    • source: Harri S, FinnGen Phenotype team

  • CodeWAS Endpoints

    • name: codewas_endpoints_r11.filtered_nlog10p.green.2023-10-31.XXH64_3099f7b7f82bd251.jsonl.zst

    • source: Raw CodeWAS data from Javier G-T, FinnGen Phenotype Team; then applied filter_codewas_greendata.py

  • CodeWAS code list

    • name: medical_codes_fg_code_info_v3_fg_codes_info_v3.csv

    • source: Javier G-T, FinnGen Phenotype Team

  • LabWAS data

    • name: labwas.XXH64_45ee892713aa9568.jsonl

    • source: Javier G-T, FinnGen Phenotype team; then keeping only green data.

  • Kanta lab – List of OMOP Concept IDs

    • name: all_omop_ids.XXH64_3407a565a43950ce.jsonl

    • source: Risteys Kanta lab pipeline

  • OMOP LOINC – Concepts

    • name: CONCEPT.xsv_fmt.XXH64_34860affdef58dc2.csv

    • source: Downloaded LOINC CONCEPT.csv from OHDSI Athena, CODE (CDM V5) = LOINC, Latest update = 18-Sep-2023, then applied proper CSV formatting (with xsv fmt -d"\t") to fix bad escaping in original CONCEPT.csv.

  • OMOP LOINC – Concept relationships

    • name: CONCEPT_RELATIONSHIP.xsv_fmt.XXH64_e04d06ecdd63e0c1.csv

    • source: Downloaded LOINC CONCEPT_RELATIONSHIP.csv from OHDSI Athena, CODE (CDM V5) = LOINC, Latest update = 18-Sep-2023, then applied proper CSV formatting (with xsv fmt -d"\t") to fix bad escaping in original CONCEPT_RELATIONSHIP.csv.

  • Kanta lab – Metadata

    • name: n_people_alive_in_kanta_time.XXH64_fc8010f7a8ad4a08.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Stats N People by sex

    • name: count_by_sex.XXH64_55ab7400905bd063.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Median N measurements

    • name: median_n_measurements.XXH64_11badf7e9f82495c.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Percent people with 2+ records

    • name: percent_people_two_or_more_records.XXH64_972d77909204a603.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Median years from first to last record

    • name: median_duration_first_to_last_record.XXH64_39b36dabb238daea.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – QC tables stats

    • name: qc_tables.XXH64_4e688fbd2d242617.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – QC tables distribution values

    • name: qc_tables__distribution_measurement_value__stats.XXH64_effecc3338be8288.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – QC tables distribution binning

    • name: qc_tables__distribution_measurement_value__bins_definitions.XXH64_e53042b0179eab9f.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – QC tables test outcomes

    • name: qc_tables__test_outcome_counts.XXH64_c988a029dbd166a8.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution lab values / continuous / stats

    • name: measurement_continuous_value_harmonized_distribution__stats.XXH64_ad2abf42d92ae80d.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution lab values / continuous / binning

    • name: measurement_continuous_value_harmonized_distribution__bins_definitions.XXH64_dc446298aba8b55c.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution lab values / discrete

    • name: measurement_discrete_value_harmonized_distribution__stats.XXH64_87049cbce30f44f8.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution year of birth / stats

    • name: year_of_birth_distribution__stats.XXH64_ec4595969b4faf5c.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution year of birth / binning

    • name: year_of_birth_distribution__bins_definitions.XXH64_4aca92fb74464f61.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution age at first measurement

    • name: age_first_meas_distribution__stats.XXH64_f566d8fa7f7f211b.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution age at last measurement

    • name: age_last_meas_distribution__stats.XXH64_c94a7b62616cf8aa.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution age at registry start

    • name: age_registry_starts_distribution__stats.XXH64_vc8835448eff8ec36.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution age / binning

    • name: age_distributions__bins_definitions.XXH64_263a39b76cfaf733.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution duration from first to last record / stats

    • name: duration_first_to_last_distribution__stats.XXH64_6003a4f0e15f3431.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution duration from first to last record / binning

    • name: duration_first_to_last_distribution__bins_definitions.XXH64_96485130b0e75ae1.json

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution N measurements over the years / stats

    • name: n_measurements_over_years_distribution__stats.XXH64_dcc30754d0572a7c.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution N measurements over the years / binning

    • name: n_measurements_over_years_distribution__bins_definitions.XXH64_71c05267fbc20fca.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution N measurements by person / stats

    • name: n_measurements_per_person_distribution__stats.XXH64_f09eff1ce99c611a.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution N measurements by person / binning

    • name: n_records_per_person_distribution__bins_definitions.XXH64_83018a99366a1bfa.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution value range / stats

    • name: value_range_distributions__stats.XXH64_05ab452633715dbe.jsonl

    • source: Risteys Kanta lab pipeline

  • Kanta lab – Distribution value range / binning

    • name: value_range_distributions__bins_definitions.XXH64_7ce4b5f6945e9831.jsonl

    • source: Risteys Kanta lab pipeline
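Most file names in the list above embed an `XXH64_<16 hex digits>` tag. Assuming this tag is the XXH64 hash of the file contents (with the default seed), a file can be checked against its own name before import. The sketch below is illustrative only: the function names `expected_xxh64` and `verify_xxh64` are hypothetical, and it assumes the `xxh64sum` tool from the xxHash package is installed.

```shell
#!/bin/sh
# Print the 16-hex-digit XXH64 value embedded in a file name.
# Prints nothing when the name carries no well-formed tag.
expected_xxh64() {
    basename "$1" | sed -n 's/.*XXH64_\([0-9a-f]\{16\}\)\..*/\1/p'
}

# Compare the embedded value with the hash of the file contents.
# Returns 0 on match, 1 on mismatch, 2 when the name has no tag.
# Assumption: xxh64sum prints "<hash>  <file name>".
verify_xxh64() {
    expected="$(expected_xxh64 "$1")"
    [ -n "$expected" ] || { echo "no XXH64 tag in name: $1" >&2; return 2; }
    actual="$(xxh64sum "$1" | cut -d' ' -f1)"
    [ "$actual" = "$expected" ]
}
```

For example, `verify_xxh64 path/to/finngen_R9_medcode_ref__XXH64_708053b379a04020.tsv` would succeed only if the computed hash is `708053b379a04020`; the `path/to/` prefix is a placeholder. Files without a tag, such as `medical_codes_fg_code_info_v3_fg_codes_info_v3.csv`, are reported rather than checked.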