Releases: FINNGEN/kanta_lab_harmonisation_public
Kanta Harmonisation v2.0.0
Freeze for use in the first Kanta version
status n_codes n_records per_codes per_records
<chr> <int> <dbl> <dbl> <dbl>
1 SUCCESFUL 1538 149808107 0.0786 0.710
2 SUCCESFUL: no unit 2511 53572846 0.128 0.254
3 ERROR: Mapping: missing mapping 12635 2844183 0.646 0.0135
4 WARNING: Value: Values are significalty different 318 2514638 0.0163 0.0119
5 IGNORED: Mapping not found 102 1386982 0.00521 0.00657
6 ERROR: Value: missing units conversion 446 397440 0.0228 0.00188
7 WARNING: Value: KS test failed 797 187207 0.0407 0.000887
8 ERROR; Units: Units dont match quantity 163 174712 0.00833 0.000828
9 ERROR: Units: invalid source_unit_clean 468 99595 0.0239 0.000472
10 ERROR: Mapping: unknown abbreviation+unit 572 64835 0.0292 0.000307
11 ERROR; Mapping: Wrong mapping 11 19915 0.000562 0.0000944
Major changes:
- Accept all codes in the measurement domain, even if they are SNOMED codes
- Codes with status 'ERROR; Mapping: cannot map without unit, multiple targets' and more than 5000 events are now mapped to the most common unit
Minor changes:
- Added mappings made by MP with Claude AI
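The fallback for unit-less codes described under "Major changes" could look roughly like the sketch below. It assumes that "most common unit" means the unit with the most records among the unit-bearing variants of the same abbreviation; the function name `resolve_unitless_code` and the example counts are hypothetical and not taken from the repository.

```python
MIN_EVENTS = 5000  # threshold quoted in the release note


def resolve_unitless_code(n_events, unit_counts, min_events=MIN_EVENTS):
    """Pick a fallback unit for a lab code that was recorded without a unit.

    n_events    -- number of records of the unit-less code
    unit_counts -- dict: unit -> number of records of the same abbreviation
                   recorded with that unit (assumed input shape)
    Returns the most common unit, or None if the code is too rare or has
    no unit-bearing variants.
    """
    if n_events <= min_events or not unit_counts:
        return None
    # max() over the dict keys, ranked by their record counts
    return max(unit_counts, key=unit_counts.get)


# Illustrative counts only, not real FinnGen numbers
frequent = resolve_unitless_code(12000, {"e9/l": 90000, "%": 30000})
rare = resolve_unitless_code(300, {"e9/l": 90000, "%": 30000})
print(frequent, rare)
```

Codes at or below the threshold are left unresolved in this sketch, which would keep them in the 'cannot map without unit, multiple targets' status.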
Kanta Harmonisation v1.3.0
status n_codes n_records per_codes per_records
<chr> <int> <dbl> <dbl> <dbl>
1 SUCCESFUL 1514 144409820 0.0773 0.684
2 SUCCESFUL: no unit 2300 54226097 0.117 0.257
3 ERROR: Mapping: missing mapping 13325 6373496 0.680 0.0302
4 ERROR; Mapping: cannot map without unit, multiple targets 96 2614423 0.00490 0.0124
5 WARNING: Value: Values are significalty different 267 2197582 0.0136 0.0104
6 ERROR; Mapping: Wrong mapping 43 480170 0.00220 0.00227
7 WARNING: Value: KS test failed 794 176232 0.0405 0.000835
8 ERROR: Mapping: unknown abbreviation+unit 574 156473 0.0293 0.000741
9 ERROR; Units: Units dont match quantity 160 153011 0.00817 0.000725
10 ERROR: Value: missing units conversion 38 140172 0.00194 0.000664
11 ERROR: Units: invalid source_unit_clean 471 134057 0.0241 0.000635
12 ERROR; Mapping: Ambiguous mapping, same abbrebiation leukosyytit and Number Concentration maps to different conc… 2 12838 0.000102 0.0000608
Major changes:
- Added abbreviation+no-unit combinations from FinnGen that were missing in LABfi_ALL.usagi.csv
- Accept all LOINC classes as long as they are in the measurement domain
- Accept quantities 'Finding', 'Presence or identity' and 'Presense or threshold' to have NA units
- Accept 'Presense or threshold' estimate unit to be interchangeable with NA unit
Minor changes:
- Added a few new mappings for panels from Elisa
- Changed 'u/field' to 'hpf' in UNITSfi.usagi.csv
- Unit conversion to the top test with unit
- Added abnormality column distribution to viewer
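The relaxed unit check from the major changes (some quantities may have NA units) can be pictured with a minimal sketch. The function `unit_allowed` and the `valid_units` lookup are hypothetical names introduced here for illustration; only the three quantity names come from the release note, spelled as they appear there.

```python
# Quantities that are allowed to have no unit (from the release note;
# 'Presense' is kept as spelled in the note)
NA_UNIT_QUANTITIES = {"Finding", "Presence or identity", "Presense or threshold"}


def unit_allowed(quantity, unit, valid_units):
    """Return True if `unit` is acceptable for `quantity`.

    unit        -- a unit string, or None for an NA unit
    valid_units -- dict: quantity -> set of accepted units (assumed shape)
    """
    if unit is None:
        # An NA unit is only accepted for the listed quantities
        return quantity in NA_UNIT_QUANTITIES
    return unit in valid_units.get(quantity, set())


# Hypothetical lookup table for the example
valid_units = {"Mass Concentration": {"g/l", "mg/l"}}
print(unit_allowed("Finding", None, valid_units))             # True
print(unit_allowed("Mass Concentration", None, valid_units))  # False
print(unit_allowed("Mass Concentration", "g/l", valid_units)) # True
```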
Kanta Harmonisation v1.2.0
- Added new mappings from FinOMOP for the missing codes with no units
status n_codes n_records per_codes per_records
<chr> <int> <dbl> <dbl> <dbl>
1 SUCCESFUL 1444 137460674 0.0737 0.651
2 SUCCESFUL: no unit 2307 52079745 0.118 0.247
3 ERROR: Mapping: unknown abbreviation+unit 11015 7922607 0.562 0.0375
4 ERROR: Value: missing units conversion 438 5895930 0.0224 0.0279
5 ERROR; Mapping: cannot map without unit, multiple targets 101 4717159 0.00516 0.0223
6 WARNING: Value: Values are significalty different 205 1588191 0.0105 0.00752
7 ERROR: Mapping: missing mapping 2908 997520 0.148 0.00473
8 ERROR; Mapping: Wrong mapping 16 160598 0.000817 0.000761
9 ERROR: Units: invalid source_unit_clean 471 134057 0.0241 0.000635
10 ERROR; Units: Units dont match quantity 146 100459 0.00746 0.000476
11 WARNING: Value: KS test failed 533 17431 0.0272 0.0000826
Compared to previous release v1.1.0:
- The added mappings increase 'SUCCESFUL: no unit' (named 'SUCCESFUL: missing unit' in v1.1.0) from 1724 to 2307 codes
Kanta Harmonisation v1.1.0
Major changes:
- Added 2816 new mappings to LABfi_ALL.usagi.csv for lab codes with no unit. These were created from the existing mappings when all lab codes with different units mapped to the same conceptId
- Modified check_lab_usagi_file to automatically create mappings for lab codes with no units
status n_codes n_records per_codes per_records
<chr> <int> <dbl> <dbl> <dbl>
1 SUCCESFUL 1438 137439854 0.0734 0.651
2 SUCCESFUL: missing unit 1724 40861079 0.0880 0.194
3 ERROR: Mapping: unknown abbreviation+unit 11598 19141273 0.592 0.0907
4 ERROR: Value: missing units conversion 437 5895924 0.0223 0.0279
5 ERROR; Mapping: cannot map without unit, multiple targets 101 4717159 0.00516 0.0223
6 WARNING: Value: Values are significalty different 203 1585054 0.0104 0.00751
7 ERROR: Mapping: missing mapping 2908 997520 0.148 0.00473
8 ERROR; Mapping: Wrong mapping 16 160598 0.000817 0.000761
9 ERROR: Units: invalid source_unit_clean 471 134057 0.0241 0.000635
10 ERROR; Units: Units dont match quantity 146 100459 0.00746 0.000476
11 WARNING: Value: KS test failed 532 17315 0.0272 0.0000820
12 ERROR; Mapping: Ambiguous mapping, same abbrebiation leukosyytit and Number Concentration maps to … 2 12838 0.000102 0.0000608
13 ERROR; Mapping: Ambiguous mapping, same abbrebiation b-cd19 and Number Concentration maps to diffe… 2 6274 0.000102 0.0000297
14 ERROR; Mapping: Ambiguous mapping, same abbrebiation albumiini and Mass Concentration maps to diff… 2 3315 0.000102 0.0000157
15 ERROR; Mapping: Ambiguous mapping, same abbrebiation s-chpnaba and Arbitrary Concentration maps to… 2 1193 0.000102 0.00000565
16 ERROR; Mapping: Ambiguous mapping, same abbrebiation album and Mass Concentration maps to differen… 2 459 0.000102 0.00000217
Compared to previous release v1.0.0:
- Added status 'SUCCESFUL: missing unit' for codes with no unit; these are considered SUCCESFUL.
- There are still many 'ERROR: Mapping: unknown abbreviation+unit'. This is because no mapping exists for a similar code with a unit; these mappings have to be added
- 'ERROR; Mapping: cannot map without unit, multiple targets': some codes with no unit cannot be mapped because the options with units are not compatible, e.g. b-eos [] can be b-eos [e9/l] or b-eos [%], which map to different conceptIds
- 'ERROR; Mapping: Ambiguous mapping': when trying to create codes with no units we found 5 codes with different units that mapped to different conceptId
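The rule for creating no-unit mappings, and the b-eos case where it fails, can be sketched as follows. This is not the code of check_lab_usagi_file; the function `derive_unitless_mappings`, the input shape, the abbreviation b-hb and all concept IDs are placeholders chosen for illustration.

```python
from collections import defaultdict


def derive_unitless_mappings(mappings):
    """Derive mappings for unit-less lab codes from unit-bearing ones.

    mappings -- dict: (abbreviation, unit) -> concept_id, with "" as the
                unit of a code recorded without a unit (assumed shape)
    Returns (derived, multiple_targets):
      derived          -- new (abbreviation, "") -> concept_id entries
      multiple_targets -- abbreviation -> sorted list of competing concept_ids
    """
    targets = defaultdict(set)
    for (abbreviation, unit), concept_id in mappings.items():
        if unit:  # only unit-bearing codes act as references
            targets[abbreviation].add(concept_id)

    derived, multiple_targets = {}, {}
    for abbreviation, concept_ids in targets.items():
        if (abbreviation, "") in mappings:
            continue  # a no-unit mapping already exists
        if len(concept_ids) == 1:
            derived[(abbreviation, "")] = next(iter(concept_ids))
        else:
            multiple_targets[abbreviation] = sorted(concept_ids)
    return derived, multiple_targets


# Placeholder concept IDs, not real OMOP concepts
mappings = {
    ("b-eos", "e9/l"): 1001,
    ("b-eos", "%"): 1002,
    ("b-hb", "g/l"): 2001,
    ("b-hb", "g/dl"): 2001,
}
derived, multiple_targets = derive_unitless_mappings(mappings)
print(derived)           # {('b-hb', ''): 2001}
print(multiple_targets)  # {'b-eos': [1001, 1002]}
```

In this sketch b-eos cannot receive a no-unit mapping because its two unit variants point to different concepts, which corresponds to the 'multiple targets' status above.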
Kanta Harmonisation v1.0.0
Major changes:
- Updated to work with a summary of the real FinnGen counts
Minor changes:
- Sam fixes in the usagi file
status n_codes n_records per_codes per_records
<chr> <int> <dbl> <dbl> <dbl>
1 SUCCESFUL 1667 139474520 0.0851 0.661
2 ERROR: Mapping: unknown abbreviation+unit 13423 64719511 0.685 0.307
3 ERROR: Value: missing units conversion 39 3813288 0.00199 0.0181
4 WARNING: Value: Values are significalty different 248 1647758 0.0127 0.00781
5 ERROR: Mapping: missing mapping 2908 997520 0.148 0.00473
6 ERROR; Mapping: Wrong mapping 16 160598 0.000817 0.000761
7 ERROR: Units: invalid source_unit_clean 471 134057 0.0241 0.000635
8 ERROR; Units: Units dont match quantity 146 100459 0.00746 0.000476
9 WARNING: Value: KS test failed 666 26660 0.0340 0.000126
Compared to previous version v0.2.0:
- The SUCCESFUL mappings drop to 66% of records, because we now consider all the codes with a missing unit
- Most of the codes with a missing unit do not find a mapping in LABfi_ALL.usagi.csv and fall into 'ERROR: Mapping: unknown abbreviation+unit'
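The 66% figure can be checked against the v1.0.0 table. The sketch assumes that `per_records` is each status's `n_records` divided by the total over all statuses; the release notes do not state this, but the published numbers are consistent with it.

```python
# n_records per status, copied from the v1.0.0 status summary
n_records = {
    "SUCCESFUL": 139474520,
    "ERROR: Mapping: unknown abbreviation+unit": 64719511,
    "ERROR: Value: missing units conversion": 3813288,
    "WARNING: Value: Values are significalty different": 1647758,
    "ERROR: Mapping: missing mapping": 997520,
    "ERROR; Mapping: Wrong mapping": 160598,
    "ERROR: Units: invalid source_unit_clean": 134057,
    "ERROR; Units: Units dont match quantity": 100459,
    "WARNING: Value: KS test failed": 26660,
}

total = sum(n_records.values())

# Share of all records that falls into each status
per_records = {status: n / total for status, n in n_records.items()}

print(round(per_records["SUCCESFUL"], 3))  # 0.661
print(round(per_records["ERROR: Mapping: unknown abbreviation+unit"], 3))  # 0.307
```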
Kanta Harmonisation v0.2.0
Initial Harmonisation tables
Changes:
- Updates by Tarja: completed unmapped codes using mapped codes as reference
Status Summary:
# A tibble: 9 × 5
status n_codes n_records per_codes per_records
<chr> <int> <dbl> <dbl> <dbl>
1 SUCCESFUL 1241 68280571 0.224 0.953
2 WARNING: Value: Values are significalty di… 750 1306986 0.136 0.0182
3 WARNING: Value: KS test failed 131 772668 0.0237 0.0108
4 ERROR: Mapping: missing mapping 2399 532296 0.434 0.00743
5 ERROR; Units: Units dont match quantity 264 340699 0.0477 0.00476
6 ERROR: Units: invalid source_unit_clean 406 160850 0.0734 0.00225
7 ERROR: Value: missing units conversion 42 120206 0.00759 0.00168
8 ERROR: Mapping: unknown abbreviation+unit 286 67911 0.0517 0.000948
9 ERROR; Mapping: Wrong mapping 13 42868 0.00235 0.000599
Kanta Harmonisation v0.1.0
Initial Harmonisation tables
Combines:
- LABfi usagi files from FinOMOP
- Fixed UNITSfi usagi from FinOMOP
- Fixed maps by Tarja Laitines
Status Summary:
# A tibble: 9 × 5
status n_codes n_records per_codes per_records
<chr> <int> <dbl> <dbl> <dbl>
1 SUCCESFUL 1143 66137642 0.207 0.923
2 ERROR: Mapping: unknown abbreviation+unit 275 2064210 0.0497 0.0288
3 WARNING: Value: Values are significalty different 623 1257196 0.113 0.0176
4 WARNING: Value: KS test failed 106 772607 0.0192 0.0108
5 ERROR: Mapping: missing mapping 2643 682286 0.478 0.00953
6 ERROR; Units: Units dont match quantity 274 366567 0.0495 0.00512
7 ERROR: Units: invalid source_unit_clean 406 160850 0.0734 0.00225
8 ERROR: Value: missing units conversion 46 137426 0.00831 0.00192
9 ERROR; Mapping: Wrong mapping 17 46314 0.00307 0.000647