Penalized Decomposition Using Residuals (PeDecURe)

PeDecURe provides feature extraction with built-in adjustment for nuisance variables. Our method identifies sources of variation in the data that are shared between features (e.g., measurements derived from neuroimaging scans) and an outcome of interest (e.g., diagnosis), while substantially reducing overlap with information about nuisance variables (e.g., age or sex).

In our paper (now published in Biostatistics), we introduce the intuition behind our method and illustrate that features extracted using PeDecURe are predictive of an outcome of interest, have low correlations with nuisance variables, and show promise for out-of-sample generalizability.

Installation

# install.packages("devtools")
devtools::install_github("smweinst/PeDecURe")

Example

Notation:

X : feature matrix $(n \times p)$
A : matrix of nuisance variables $(n \times q)$
Y : vector (length $n$) with outcome labels (e.g., disease group)
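
For readers who want to try the workflow without real data, the following is a minimal sketch that simulates objects with the shapes described in the notation above. The sample size, number of features, variable names (`age`, `sex`), and effect sizes are arbitrary assumptions for illustration only and do not come from the paper.

```r
# Toy data matching the notation above; all sizes and effects are assumptions.
set.seed(1)
n <- 100  # number of subjects
p <- 20   # number of features
q <- 2    # number of nuisance variables

# A: nuisance variables (n x q), e.g., age and sex
A <- cbind(age = rnorm(n, mean = 60, sd = 10),
           sex = rbinom(n, size = 1, prob = 0.5))

# Y: binary outcome labels (length n), e.g., disease group
Y <- rbinom(n, size = 1, prob = 0.5)

# X: features (n x p) that carry outcome signal, nuisance signal, and noise
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
X[, 1:5]  <- X[, 1:5]  + 1.0 * Y                    # outcome-related features
X[, 4:10] <- X[, 4:10] + 0.05 * (A[, "age"] - 60)   # age-related features
X[, 8:12] <- X[, 8:12] + 0.8 * A[, "sex"]           # sex-related features

dim(X); dim(A); length(Y)
```

Some features are deliberately affected by both the outcome and a nuisance variable, which is the situation PeDecURe is designed to address.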

Implementation using PeDecURe R package:

library(PeDecURe)

# get residuals:
resid.dat = get.resid(X,Y,A)
X.star = resid.dat$X.star
X.tilde = resid.dat$X.tilde

# tune lambda:
lambda.tune = pedecure.tune(X.orig = X,
                            X.max = X.star,
                            X.penalize = X.tilde,
                            lambdas = seq(0,10,by=0.1),
                            A = A,
                            Y = Y,
                            nPC = 3)
best.lambda = lambda.tune$lambda_tune

# run pedecure:
pedecure.out = pedecure(X = X.star,
                        X.penalize = X.tilde,
                        A = A,
                        Y = Y,
                        lambda = best.lambda,
                        nPC = 3)

# PC scores - these are our new features that can be used for an association study, predictive model, etc.
## note: X should be centered by column
PC.scores = X%*%pedecure.out$vectors

# Look at correlations between the first few PC scores and the nuisance variables (A1, A2) and the outcome (Y)
cor.scores = partial.cor(PC.scores, A, Y)
scores.partial.cor = cor.scores$partial$estimates
scores.marginal.cor = cor.scores$marginal$estimates
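
The distinction between the marginal and partial correlations returned above can be illustrated with a small base-R sketch. This is not the implementation used by `partial.cor`; it is the textbook construction of a partial correlation, obtained by correlating the residuals of two variables after each is regressed on a third. The simulated variables and the helper `partial_cor_resid` are hypothetical.

```r
set.seed(2)
n <- 200
age   <- rnorm(n)                       # a nuisance variable
Y     <- rbinom(n, 1, plogis(age))      # outcome that is associated with age
score <- 0.7 * Y + rnorm(n)             # a score that depends on Y only

# Marginal correlation: ignores all other variables
marginal <- cor(score, age)

# Partial correlation of x and y given z, via regression residuals
partial_cor_resid <- function(x, y, z) {
  rx <- resid(lm(x ~ z))  # part of x not explained by z
  ry <- resid(lm(y ~ z))  # part of y not explained by z
  cor(rx, ry)
}
partial <- partial_cor_resid(score, age, Y)

c(marginal = marginal, partial = partial)
```

Because the score is related to age only through the outcome in this simulation, the marginal correlation reflects that indirect path, while the partial correlation measures what remains after accounting for `Y`.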

Applying PeDecURe output in a new sample

To obtain PC scores in a new sample, multiply the new feature matrix by the PC loadings estimated above.

X.test : feature matrix in a different sample $(m \times p)$

PC.scores.test = X.test%*%pedecure.out$vectors
# note: pedecure.out$vectors was the output from applying PeDecURe in training sample above
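
The projection step can be sketched end to end in base R. Since the note in the Example section asks for column-centered features, this sketch centers the new sample with the column means of the training sample; that choice is a common convention and an assumption here, not something this README specifies. The matrix `V` is a random stand-in with orthonormal columns, used only so the sketch runs without the package; in practice it would be `pedecure.out$vectors`.

```r
set.seed(3)
n <- 50; m <- 30; p <- 10; nPC <- 3

X.train <- matrix(rnorm(n * p), n, p)   # training features (n x p)
X.test  <- matrix(rnorm(m * p), m, p)   # new-sample features (m x p)

# Stand-in loadings (p x nPC) with orthonormal columns; hypothetical,
# replace with pedecure.out$vectors from the training sample.
V <- qr.Q(qr(matrix(rnorm(p * nPC), p, nPC)))

# Center both samples using the TRAINING column means (assumed convention)
train.means <- colMeans(X.train)
X.train.c <- sweep(X.train, 2, train.means)
X.test.c  <- sweep(X.test, 2, train.means)

# PC scores: one row per subject, one column per component
PC.scores.train <- X.train.c %*% V
PC.scores.test  <- X.test.c %*% V

dim(PC.scores.test)
```

Reusing the training means keeps the two sets of scores on the same scale, so a model fit to the training scores can be applied to the test scores without refitting.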

If $A$ and $Y$ are observed in the new sample, we can also look at their correlations with the PC scores in the test sample:

A.test : matrix of nuisance variables in the new sample $(m \times q)$ (if observed)
Y.test : vector (length $m$) with outcome labels (if observed)

cor.scores.test = partial.cor(PC.scores.test, A.test, Y.test)
scores.partial.cor.test = cor.scores.test$partial$estimates
scores.marginal.cor.test = cor.scores.test$marginal$estimates
