costumer

The goal of costumer is to provide the data, the functions, the analysis scripts, and the documentation (report), within the respective templates, for the paper Building Comprehensive Searches including PubMed and ClinicalTrials.gov Through a Machine Learning Approach for Systematic Reviews, Lanera et al. (2018).

Installation

You can install the development version from GitHub with the following procedure:

## If you do not have the `devtools` package installed, please install it
# install.packages("devtools")
devtools::install_github("UBESP-DCTV/costumer")

Folders' organization

R/ contains all the functions provided to implement the analyses
tests/ contains all the automated tests to run for CI
man/ contains the documents for each function or data provided (accessible in R by ?<name_of_the_object>)
data-raw/ contains all the scripts used to import and manage the data used in the analyses and in the (automated) tests of the package
data/ contains the data provided by the package. In particular, it contains:
- the customized caret models, which handle the cross-validation process correctly for textual data (especially for IDF reweighting): *_cvAble.rda;
- the customized functions for managing class imbalance: R[OU]S(3565|5050)_new.rda;
- sample data used in the automated tests: liu_*.rda.
inst/ contains a single folder, doc/, which contains:
- hutch_analyses_p1_v2.0.R, the script used to perform all the analyses reported in Lanera et al. (2018);
- AACT201603_comprehensive_data_dictionary.xlsx, the data dictionary for the original ClinicalTrials.gov data.
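The naming patterns listed for data/ can be used to sort the files into the three groups. The following is a minimal sketch, not part of the package: the function name classify_data_files and the group labels are hypothetical, and the regular expressions are only a direct reading of the patterns above.

```r
# Hypothetical helper (not provided by costumer): label each file name
# according to the naming patterns described for the data/ folder.
classify_data_files <- function(files) {
  # Group labels are illustrative; the patterns mirror the README.
  patterns <- c(
    caret_model   = "_cvAble\\.rda$",
    imbalance_fun = "^R[OU]S(3565|5050)_new\\.rda$",
    test_sample   = "^liu_.*\\.rda$"
  )
  groups <- rep(NA_character_, length(files))
  for (label in names(patterns)) {
    # Only label files that no earlier pattern has matched.
    hit <- is.na(groups) & grepl(patterns[[label]], files)
    groups[hit] <- label
  }
  # Files matching no pattern keep NA as their group.
  stats::setNames(groups, files)
}

# Usage, from the package root:
# classify_data_files(list.files("data"))
```

Files that match none of the patterns are returned with NA, which makes unexpected content in data/ easy to spot.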

Note: the main data used are too large to be included in an R package or in a GitHub repository. Here you can find a folder named non_git_nor_build_derived_data/ (2.86 GB) which contains:

171106-all_svm_3565/ folder with all the outputs of the last analyses:

CV-Plots/ folder which contains all the cross-validation plots representing the decision levels for the tuning parameter used in each model;

models/ folder which contains all the trained models;

hutch3.rda, which contains the hutch3 data frame holding every step of the analyses, i.e., the starting data, the preprocessed data, the DTM, the testing data, the models used, the plots provided, … everything!

*.txt log files.

raw_pubmed/ folder with the data used to train the models, which is needed to run the script data-raw/import_pubmed.R. Hence, if you would like to run that script yourself, you need to put this folder as it is into the data-raw/ folder.

raw_ctgov.zip zip file with the data used to test the models, i.e., the ClinicalTrials.gov snapshot used, which (when unzipped) is needed to run the script data-raw/import_ctgov.R. Hence, if you would like to run that script yourself, you need to unzip this file and put the output folder as it is (~841 MB) into the data-raw/ folder.

random4h28.xlsx file with the sample data used for the (automated) tests of the functions provided with the package, which is needed to run the script data-raw/import_liu.R. Hence, if you would like to run that script yourself, you need to put this file as it is into the data-raw/ folder.

summaries_*.rda the ready-to-use outputs of the scripts import_*.R, which are needed to run the script of the analyses. Hence, if you would like to run that script yourself, you need to put these files as they are into the data/ folder.

test_*.rda data which are the outputs of the script data-raw/ct_corpus_and_dtm.R and which are also needed (and are ready-to-use here) to run the script of the analyses. Hence, if you would like to run that script yourself, you need to put these files as they are into the data/ folder.
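The placement rules above can be sketched as a small helper. It is an illustration only, not a function provided by the package: the name place_external_data and its arguments src (the local copy of non_git_nor_build_derived_data/) and pkg (the package root) are hypothetical. It also assumes that raw_ctgov.zip unpacks into a single top-level folder, which the text above suggests but does not state.

```r
# Hypothetical helper (not provided by costumer): copy the separately
# downloaded files into the package tree, following the README rules.
place_external_data <- function(src, pkg = ".") {
  data_raw <- file.path(pkg, "data-raw")
  data_dir <- file.path(pkg, "data")
  dir.create(data_raw, recursive = TRUE, showWarnings = FALSE)
  dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

  # raw_pubmed/ goes into data-raw/ as it is.
  pubmed <- file.path(src, "raw_pubmed")
  if (dir.exists(pubmed)) file.copy(pubmed, data_raw, recursive = TRUE)

  # raw_ctgov.zip is unzipped into data-raw/ (assumes the archive
  # contains one top-level folder).
  ctgov <- file.path(src, "raw_ctgov.zip")
  if (file.exists(ctgov)) utils::unzip(ctgov, exdir = data_raw)

  # random4h28.xlsx goes into data-raw/ as it is.
  liu <- file.path(src, "random4h28.xlsx")
  if (file.exists(liu)) file.copy(liu, data_raw)

  # summaries_*.rda and test_*.rda go into data/.
  rda <- list.files(src, pattern = "^(summaries|test)_.*\\.rda$",
                    full.names = TRUE)
  if (length(rda) > 0) file.copy(rda, data_dir)

  # Return the resulting file listing invisibly for inspection.
  invisible(list.files(pkg, recursive = TRUE))
}

# Usage (paths are placeholders):
# place_external_data("path/to/non_git_nor_build_derived_data", pkg = ".")
```

The 171106-all_svm_3565/ folder is deliberately left where it is, since the text above does not ask for it to be moved into the package.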

Bug reports

If you encounter a bug, please file a reprex (minimal reproducible example) at <https://github.com/UBESP-DCTV/imthcm/issues>.

Reference

Lanera, Corrado, Clara Minto, Abhinav Sharma, Dario Gregori, Paola Berchialla, and Ileana Baldi. 2018. “Extending PubMed Searches to ClinicalTrials.gov Through a Machine Learning Approach for Systematic Reviews.” Journal of Clinical Epidemiology, no. 103:22–30. https://doi.org/10.1016/j.jclinepi.2018.06.015.
