Add rowData to summed SCE output #20

Merged (5 commits) on Dec 17, 2024
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -6,4 +6,5 @@
^\.lintr$
^\.pre-commit-config.yaml$
^data-raw$
+^dependencies.R$
^LICENSE\.md$
6 changes: 5 additions & 1 deletion .pre-commit-config.yaml
@@ -16,7 +16,7 @@ repos:
exclude: '\.Rd'

- repo: https://github.com/crate-ci/typos
-rev: v1.28.2
+rev: v1.28.3
hooks:
- id: typos
exclude: '\.nb\.html'
@@ -41,3 +41,7 @@ repos:
- id: no-debug-statement
- id: deps-in-desc
exclude: 'docker/.*|renv/.*|data-raw/.*'

+ci:
+autofix_prs: true
+autoupdate_schedule: quarterly
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -24,7 +24,6 @@ Suggests:
scran,
Seurat,
splatter,
-scuttle,
Matrix,
SeuratObject
Config/testthat/edition: 3
@@ -37,6 +36,7 @@ Imports:
pdfCluster,
purrr,
S4Vectors,
+scuttle,
SingleCellExperiment,
SummarizedExperiment,
tibble,
44 changes: 31 additions & 13 deletions R/sum-duplicate-genes.R
@@ -8,6 +8,11 @@
#' substantial sequence identity, which could make separate quantification of
#' the two genes less reliable.
#'
+#' The rowData for the summed SingleCellExperiment object is updated to reflect
+#' the new set of gene names. In each case, the first row for any duplicated id
+#' is retained. This may mean that for gene symbols that correspond to multiple
+#' Ensembl ids, the first Ensembl id is retained and the others are dropped.
+#'
#' If requested, the log-normalized expression values are recalculated,
#' otherwise that matrix is left blank.
#'
@@ -49,9 +54,7 @@ sum_duplicate_genes <- function(sce, normalize = TRUE, recalculate_reduced_dims
if (normalize) {
stopifnot(
"Package `scran` must be installed if `normalize = TRUE` is set." =
-requireNamespace("scran", quietly = TRUE),
-"Package `scuttle` must be installed if `normalize = TRUE` is set." =
-requireNamespace("scuttle", quietly = TRUE)
+requireNamespace("scran", quietly = TRUE)
)
}
stopifnot(
@@ -65,14 +68,20 @@ sum_duplicate_genes <- function(sce, normalize = TRUE, recalculate_reduced_dims
}

# calculate the reduced matrices
-counts <- rowsum(counts(sce), rownames(sce)) |> as("sparseMatrix")
+unique_rows <- unique(rownames(sce)) # new row names
+counts <- rowsum(counts(sce), rownames(sce))[unique_rows, ] |> # keep order, mostly
+as("sparseMatrix")
if ("spliced" %in% assayNames(sce)) {
-spliced <- rowsum(assay(sce, "spliced"), rownames(sce)) |> as("sparseMatrix")
+spliced_names <- rownames(assay(sce, "spliced"))
+spliced <- rowsum(assay(sce, "spliced"), spliced_names)[unique(spliced_names), ] |>
+as("sparseMatrix")
assays <- list(counts = counts, spliced = spliced)
} else {
assays <- list(counts = counts)
}

+# regenerate rowData, using first row for each duplicate
+row_data <- rowData(sce)[unique_rows, ]

if (recalculate_reduced_dims) {
reduced_dims <- list()
@@ -81,26 +90,35 @@ sum_duplicate_genes <- function(sce, normalize = TRUE, recalculate_reduced_dims
}



# Build the new SingleCellExperiment object
summed_sce <- SingleCellExperiment(
assays = assays,
+rowData = row_data,
colData = colData(sce),
metadata = metadata(sce),
# if we are not recalculating reduced dimensions, copy over previous (likely similar)
reducedDims = reduced_dims,
altExps = altExps(sce)
)
+# remove and replace existing Feature stats
+rowData(summed_sce)$mean <- NULL
+rowData(summed_sce)$detected <- NULL
+summed_sce <- scuttle::addPerFeatureQCMetrics(summed_sce)

# Add normalized values if requested
if (normalize) {
-try({
-# try to cluster similar cells
-# clustering may fail if < 100 cells in dataset
-suppressWarnings({
-qclust <- scran::quickCluster(summed_sce)
-summed_sce <- scran::computeSumFactors(summed_sce, clusters = qclust)
-})
-})
+try(
+{
+# try to cluster similar cells
+# clustering may fail if < 100 cells in dataset
+suppressWarnings({
+qclust <- scran::quickCluster(summed_sce)
+summed_sce <- scran::computeSumFactors(summed_sce, clusters = qclust)
+})
+},
+silent = TRUE
+)
summed_sce <- scuttle::logNormCounts(summed_sce)
}

3 changes: 3 additions & 0 deletions dependencies.R
@@ -0,0 +1,3 @@
+# development dependencies for renv
+
+library("devtools")
5 changes: 5 additions & 0 deletions man/sum_duplicate_genes.Rd
