issue running package example script #4

rcorty · 2023-11-11T20:39:49Z

I ran the script below (shown after the traceback), copied almost exactly from the package example script, and was met with:

Error in `inner_join()`:
! Join columns in `y` must be present in the data.
✖ Problem with `vocabulary_id` and `code`.
Traceback:

1. createPhenotypes(ex$id.vocab.code.count, aggregate.fun = sum, 
 .     id.sex = ex$id.sex, vocabulary.map = phecodeX_map, rollup.map = phecodeX_rollup_map, 
 .     sex.restriction = phecodeX_sex)
2. mapCodesToPhecodes(id.vocab.code.index, make.distinct = map.codes.make.distinct, 
 .     vocabulary.map = vocabulary.map, rollup.map = rollup.map) %>% 
 .     transmute(id, code = phecode, index)
3. transmute(., id, code = phecode, index)
4. mapCodesToPhecodes(id.vocab.code.index, make.distinct = map.codes.make.distinct, 
 .     vocabulary.map = vocabulary.map, rollup.map = rollup.map)
5. withCallingHandlers(output <- inner_join(input, vocabulary.map, 
 .     by = c("vocabulary_id", "code")), warning = function(w) {
 .     if (grepl("coercing into character vector", w$message)) {
 .         invokeRestart("muffleWarning")
 .     }
 . })
6. inner_join(input, vocabulary.map, by = c("vocabulary_id", "code"))
7. inner_join.data.frame(input, vocabulary.map, by = c("vocabulary_id", 
 .     "code"))
8. join_mutate(x = x, y = y, by = by, type = "inner", suffix = suffix, 
 .     na_matches = na_matches, keep = keep, multiple = multiple, 
 .     unmatched = unmatched, relationship = relationship, user_env = caller_env())
9. join_cols(x_names = x_names, y_names = y_names, by = by, suffix = suffix, 
 .     keep = keep, error_call = error_call)
10. check_join_vars(by$y, y_names, by$condition, "y", error_call = error_call)
11. abort(bullets, call = error_call)
12. signal_abort(cnd, .file)

devtools::install_github("PheWAS/PheWAS")

library(PheWAS)

ex=generateExample()

phecodeX_labels = read.csv('https://github.com/PheWAS/PhecodeX/raw/main/phecodeX_R_labels.csv')
phecodeX_rollup_map = read.csv('https://github.com/PheWAS/PhecodeX/raw/main/phecodeX_R_rollup_map.csv')
phecodeX_map = read.csv('https://github.com/PheWAS/PhecodeX/raw/main/phecodeX_R_labels.csv')
phecodeX_sex = read.csv('https://github.com/PheWAS/PhecodeX/raw/main/phecodeX_R_sex.csv')

createPhenotypes(ex$id.vocab.code.count,
                 aggregate.fun = sum, 
                 id.sex = ex$id.sex,
                 vocabulary.map = phecodeX_map,
                 rollup.map = phecodeX_rollup_map,
                 sex.restriction = phecodeX_sex)

Environment information:

R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] PheWAS_0.99.6-1 ggplot2_3.4.4   tidyr_1.3.0     dplyr_1.1.3    

loaded via a namespace (and not attached):
 [1] jsonlite_1.8.7       splines_4.2.2        foreach_1.5.2       
 [4] metafor_4.4-0        ggrepel_0.9.4        numDeriv_2016.8-1.1 
 [7] pillar_1.9.0         backports_1.4.1      lattice_0.20-45     
[10] glue_1.6.2           uuid_1.1-0           digest_0.6.33       
[13] meta_6.5-0           minqa_1.2.6          colorspace_2.1-0    
[16] htmltools_0.5.7      Matrix_1.5-3         pkgconfig_2.0.3     
[19] broom_1.0.5          purrr_1.0.2          scales_1.2.1        
[22] metadat_1.2-0        lme4_1.1-35.1        tibble_3.2.1        
[25] mgcv_1.8-41          generics_0.1.3       DT_0.30             
[28] withr_2.5.2          repr_1.1.4           pan_1.9             
[31] nnet_7.3-18          formula.tools_1.7.1  cli_3.6.1           
[34] survival_3.4-0       magrittr_2.0.3       crayon_1.5.2        
[37] mitml_0.4-5          evaluate_0.23        mice_3.16.0         
[40] fansi_1.0.5          operator.tools_1.6.3 nlme_3.1-160        
[43] MASS_7.3-58.1        xml2_1.3.5           tools_4.2.2         
[46] lifecycle_1.0.4      munsell_0.5.0        glmnet_4.1-8        
[49] compiler_4.2.2       logistf_1.26.0       rlang_1.1.2         
[52] grid_4.2.2           nloptr_2.0.3         pbdZMQ_0.3-8        
[55] iterators_1.0.14     IRkernel_1.3.1       CompQuadForm_1.4.3  
[58] htmlwidgets_1.6.2    base64enc_0.1-3      boot_1.3-28.1       
[61] gtable_0.3.4         codetools_0.2-18     R6_2.5.1            
[64] zoo_1.8-12           fastmap_1.1.1        utf8_1.2.4          
[67] mathjaxr_1.6-0       jomo_2.7-6           shape_1.4.6         
[70] IRdisplay_1.1        Rcpp_1.0.11          vctrs_0.6.4         
[73] rpart_4.1.19         lmtest_0.9-40        tidyselect_1.2.0


emilyvansyoc · 2024-12-10T17:46:08Z

I had the same problem. Any updates?

rcorty · 2024-12-10T20:16:54Z

I think people are just rolling their own phewas scripts these days

RobertJCarroll · 2025-01-02T16:23:56Z

Apologies for missing this originally; I don't keep an eye on this repo. Two notes:

1. @adlewismbb is working on updating the PheWAS package for CRAN and making it easier to use PhecodeX and other maps.
2. You are loading the wrong file in this line; it loads the labels into the map variable:
phecodeX_map = read.csv('https://github.com/PheWAS/PhecodeX/raw/main/phecodeX_R_labels.csv')
It should be:
phecodeX_map = read.csv('https://github.com/PheWAS/PhecodeX/raw/main/phecodeX_R_map.csv')
It works as anticipated when that change is made.
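
A quick way to catch this kind of mix-up before calling createPhenotypes is to check the column names of whatever is passed as vocabulary.map. The sketch below is not part of PheWAS: `check_map_columns` is a hypothetical helper, and the required column names (`vocabulary_id`, `code`, `phecode`) are inferred from the join in the traceback and from the comments in this thread. Toy data frames stand in for the downloaded CSV files; their values and the `description` column are made up for illustration.

```r
# Hypothetical helper (not part of PheWAS): return the expected columns
# that are missing from a candidate vocabulary map.
check_map_columns <- function(map,
                              required = c("vocabulary_id", "code", "phecode")) {
  setdiff(required, names(map))
}

# Toy stand-ins for the downloaded files. Values and the `description`
# column are assumptions for illustration only.
toy_labels <- data.frame(phecode = "CV_401", description = "toy label",
                         stringsAsFactors = FALSE)
toy_map <- data.frame(vocabulary_id = "ICD10CM", code = "I10",
                      phecode = "CV_401", stringsAsFactors = FALSE)

missing_in_labels <- check_map_columns(toy_labels)  # join columns are absent
missing_in_map    <- check_map_columns(toy_map)     # nothing is missing

if (length(missing_in_labels) > 0) {
  message("Not a usable vocabulary map; missing: ",
          paste(missing_in_labels, collapse = ", "))
}
```

Running the same check on the real `phecodeX_map` right after `read.csv` should flag a labels file loaded by mistake before the `inner_join()` error appears.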

chrisodhams · 2025-01-02T16:41:36Z

Note that for the phecodeX_R_map_ICD_10_WHO.csv file, one needs to reorder and rename the columns for it to work, e.g.:

phecodex_mapicd10 <- read_csv("phecodeX_R_map_ICD_10_WHO.csv", col_names = TRUE) %>% select(code = ICD, vocabulary_id, phecode)
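
For anyone not using readr and dplyr, the same rename can be sketched in base R. The toy data frame below stands in for phecodeX_R_map_ICD_10_WHO.csv; only the column names (`ICD`, `vocabulary_id`, `phecode`) come from the line above, and the values are made up.

```r
# Toy stand-in for phecodeX_R_map_ICD_10_WHO.csv; values are illustrative only.
toy_who_map <- data.frame(phecode = "CV_401", ICD = "I10",
                          vocabulary_id = "ICD10",
                          stringsAsFactors = FALSE)

# Rename ICD -> code so a join on vocabulary_id and code can find both columns.
names(toy_who_map)[names(toy_who_map) == "ICD"] <- "code"

# Optional: arrange the columns in the same order as the select() call above.
toy_who_map <- toy_who_map[, c("code", "vocabulary_id", "phecode")]
```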

RobertJCarroll · 2025-01-02T17:21:16Z

Thanks for that note. Order doesn't matter in my tests, which reflects expected behavior, but the name change would be helpful. I've made a PR #5 and will ask about getting the outstanding ones merged.

RobertJCarroll closed this as completed Feb 4, 2025
