All cells filtered out when using bam as input #146

Open
Josephinedh opened this issue Nov 18, 2024 · 2 comments

Comments

@Josephinedh

Operating System

Windows 10

Other Linux

Red Hat Enterprise Linux 8.10 (Ootpa)

Workflow Version

v2.3.0

Workflow Execution

EPI2ME Desktop (Local)

Other workflow execution

No response

EPI2ME Version

No response

CLI command run

nextflow run epi2me-labs/wf-single-cell \
    --expected_cells 15000 \
    --bam tagged.bam \
    -r v2.3.0 \
    --kit '3prime:v3' \
    --ref_genome_dir refdata-gex-GRCh38-2020-A \
    -profile singularity \
    --sample GBMsample

Workflow Execution - CLI Execution Profile

None

What happened?

I wanted to run the workflow with a BAM file as input.

I have two sequencing runs from different libraries of the same cDNA that I would like to combine into a single count matrix. My plan was to run each one separately, merge the resulting BAM files, and then rerun the workflow from the merged BAM.

As a test, I used the tagged.bam from one of the runs as input. However, the workflow filters out almost all cells, and I end up with the error: "ValueError: All cells would be removed, try altering filter thresholds."
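
For readers unfamiliar with this kind of error, here is a minimal sketch of how per-cell threshold filtering can end up with nothing left. It is not the workflow's actual code: the function name `filter_cells`, the genes-by-cells array layout, and the exact filtering logic are assumptions for illustration. Only the threshold values (`--min_features 200`, `--max_mito 20`, `--mito_prefixes MT-`) and the error message are taken from the log in this issue.

```python
import numpy as np


def filter_cells(counts, gene_names, min_features=200, max_mito=20.0,
                 mito_prefix="MT-"):
    """Return a boolean mask of cells passing both thresholds.

    counts: genes x cells array of UMI counts (hypothetical layout).
    """
    counts = np.asarray(counts)
    # Number of genes with at least one count in each cell.
    n_features = (counts > 0).sum(axis=0)
    # Percentage of each cell's counts coming from mitochondrial genes.
    is_mito = np.array([g.startswith(mito_prefix) for g in gene_names])
    totals = counts.sum(axis=0)
    mito_pct = 100.0 * counts[is_mito].sum(axis=0) / np.maximum(totals, 1)
    keep = (n_features >= min_features) & (mito_pct <= max_mito)
    if not keep.any():
        # Same message as in the traceback; the real check may differ.
        raise ValueError(
            "All cells would be removed, try altering filter thresholds.")
    return keep
```

If most reads are discarded upstream, every cell has very few detected genes, so no cell reaches `min_features` and the mask is empty.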

Relevant log output

executor >  local (363)
[e2/ea4e22] process > checkBamHeaders (1)            [100%] 1 of 1 ✔
[94/12e7a8] process > fastcat (1)                    [100%] 1 of 1 ✔
[75/23d6aa] process > parse_kit_metadata (1)         [100%] 1 of 1, cached: 1 ✔
[b2/4492ee] process > pipeline:getVersions           [100%] 1 of 1, cached: 1 ✔
[d7/cafcdc] process > pipeline:getParams             [100%] 1 of 1, cached: 1 ✔
[4b/abc05e] process > pipeline:preprocess:call_pa... [100%] 1 of 1, cached: 1 ✔
[83/74c63b] process > pipeline:preprocess:build_m... [100%] 1 of 1, cached: 1 ✔
[25/5b2d00] process > pipeline:preprocess:call_ad... [100%] 109 of 109 ✔
[e0/fc5053] process > pipeline:preprocess:summari... [100%] 1 of 1 ✔
[d7/c03493] process > pipeline:process_bams:split... [100%] 1 of 1, cached: 1 ✔
[4e/c48f79] process > pipeline:process_bams:gener... [100%] 1 of 1 ✔
[4f/795c75] process > pipeline:process_bams:assig... [100%] 109 of 109 ✔
[f2/962e2d] process > pipeline:process_bams:cat_t... [100%] 1 of 1 ✔
[04/6fa8f2] process > pipeline:process_bams:merge... [100%] 1 of 1 ✔
[63/a17ea5] process > pipeline:process_bams:strin... [100%] 40 of 40 ✔
[9f/21f024] process > pipeline:process_bams:align... [100%] 40 of 40 ✔
[91/b27634] process > pipeline:process_bams:assig... [100%] 26 of 26 ✔
[1b/243a46] process > pipeline:process_bams:creat... [100%] 26 of 26 ✔
[8a/904fb2] process > pipeline:process_bams:proce... [  0%] 0 of 2
[e9/ebb75e] process > pipeline:process_bams:merge... [100%] 1 of 1 ✔
[a8/38092e] process > pipeline:process_bams:combi... [100%] 1 of 1 ✔
[c3/b38426] process > pipeline:process_bams:tag_b... [100%] 1 of 1 ✔
[ea/565b39] process > pipeline:process_bams:umi_g... [100%] 1 of 1 ✔
[f2/8d9d71] process > pipeline:process_bams:pack_... [  0%] 0 of 1
[-        ] process > pipeline:prepare_report_data   -
[-        ] process > pipeline:makeReport            -

executor >  local (363)
[e2/ea4e22] process > checkBamHeaders (1)            [100%] 1 of 1 ✔
[94/12e7a8] process > fastcat (1)                    [100%] 1 of 1 ✔
[75/23d6aa] process > parse_kit_metadata (1)         [100%] 1 of 1, cached: 1 ✔
[b2/4492ee] process > pipeline:getVersions           [100%] 1 of 1, cached: 1 ✔
[d7/cafcdc] process > pipeline:getParams             [100%] 1 of 1, cached: 1 ✔
[4b/abc05e] process > pipeline:preprocess:call_pa... [100%] 1 of 1, cached: 1 ✔
[83/74c63b] process > pipeline:preprocess:build_m... [100%] 1 of 1, cached: 1 ✔
[25/5b2d00] process > pipeline:preprocess:call_ad... [100%] 109 of 109 ✔
[e0/fc5053] process > pipeline:preprocess:summari... [100%] 1 of 1 ✔
[d7/c03493] process > pipeline:process_bams:split... [100%] 1 of 1, cached: 1 ✔
[4e/c48f79] process > pipeline:process_bams:gener... [100%] 1 of 1 ✔
[4f/795c75] process > pipeline:process_bams:assig... [100%] 109 of 109 ✔
[f2/962e2d] process > pipeline:process_bams:cat_t... [100%] 1 of 1 ✔
[04/6fa8f2] process > pipeline:process_bams:merge... [100%] 1 of 1 ✔
[63/a17ea5] process > pipeline:process_bams:strin... [100%] 40 of 40 ✔
[9f/21f024] process > pipeline:process_bams:align... [100%] 40 of 40 ✔
[91/b27634] process > pipeline:process_bams:assig... [100%] 26 of 26 ✔
[1b/243a46] process > pipeline:process_bams:creat... [100%] 26 of 26 ✔
[8a/904fb2] process > pipeline:process_bams:proce... [  0%] 0 of 2
[e9/ebb75e] process > pipeline:process_bams:merge... [100%] 1 of 1 ✔
[a8/38092e] process > pipeline:process_bams:combi... [100%] 1 of 1 ✔
[c3/b38426] process > pipeline:process_bams:tag_b... [100%] 1 of 1 ✔
[ea/565b39] process > pipeline:process_bams:umi_g... [100%] 1 of 1 ✔
[f2/8d9d71] process > pipeline:process_bams:pack_... [100%] 1 of 1 ✔
[-        ] process > pipeline:prepare_report_data   -
[-        ] process > pipeline:makeReport            -

ERROR ~ Error executing process > 'pipeline:process_bams:process_matrix (2)'

Caused by:
  Process `pipeline:process_bams:process_matrix (2)` terminated with an error exit status (1)

Command executed:

  export NUMBA_NUM_THREADS=1
  workflow-glue process_matrix         inputs/matrix*.hdf         --feature gene         --raw gene_raw_feature_bc_matrix         --processed gene_processed_feature_bc_matrix         --per_cell_mito gene.expression.mito-per-cell.tsv         --per_cell_expr gene.expression.mean-per-cell.tsv         --umap_tsv gene.expression.umap.tsv         --enable_filtering         --min_features 200         --min_cells 3         --max_mito 20         --mito_prefixes MT-         --norm_count 10000         --enable_umap         --replicates 3

Command exit status:
  1

Command output:
  (empty)

Command error:
  [17:35:36 - workflow_glue] Bootstrapping CLI.
  /home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1063: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
    @numba.jit()
  /home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1071: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
    @numba.jit()
  /home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1086: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
    @numba.jit()
  /home/epi2melabs/conda/lib/python3.8/site-packages/umap/umap_.py:660: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
    @numba.jit()
  [17:35:44 - workflow_glue] Starting entrypoint.
  [17:35:44 - workflow_glue.AggreMatri] Constructing count matrices
  [17:35:44 - workflow_glue.AggreMatri] Removing unknown features.
  [17:35:44 - workflow_glue.AggreMatri] Writing raw counts to file.
  [17:35:44 - workflow_glue.AggreMatri] Filtering, normalizing and log-transforming matrix.
  Traceback (most recent call last):
    File "/home/vbj167/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow-glue", line 7, in <module>
      cli()
    File "/home/vbj167/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow_glue/__init__.py", line 82, in cli
      args.func(args)
    File "/home/vbj167/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow_glue/process_matrix.py", line 121, in main
      matrix
    File "/home/vbj167/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow_glue/expression_matrix.py", line 248, in remove_cells_and_features
      self._remove_elements(feat_mask=feat_mask, cell_mask=cell_mask)
    File "/home/vbj167/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow_glue/expression_matrix.py", line 336, in _remove_elements
      raise ValueError(
  ValueError: All cells would be removed, try altering filter thresholds.

Work dir:
  /maps/datasets/weischenfeldt_lab-AUDIT/gbm/work/8a/904fb2bc72d2734b134ffb8518b92b

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details
WARN: Killing running tasks (1)

executor >  local (363)
[e2/ea4e22] process > checkBamHeaders (1)            [100%] 1 of 1 ✔
[94/12e7a8] process > fastcat (1)                    [100%] 1 of 1 ✔
[75/23d6aa] process > parse_kit_metadata (1)         [100%] 1 of 1, cached: 1 ✔
[b2/4492ee] process > pipeline:getVersions           [100%] 1 of 1, cached: 1 ✔
[d7/cafcdc] process > pipeline:getParams             [100%] 1 of 1, cached: 1 ✔
[4b/abc05e] process > pipeline:preprocess:call_pa... [100%] 1 of 1, cached: 1 ✔
[83/74c63b] process > pipeline:preprocess:build_m... [100%] 1 of 1, cached: 1 ✔
[25/5b2d00] process > pipeline:preprocess:call_ad... [100%] 109 of 109 ✔
[e0/fc5053] process > pipeline:preprocess:summari... [100%] 1 of 1 ✔
[d7/c03493] process > pipeline:process_bams:split... [100%] 1 of 1, cached: 1 ✔
[4e/c48f79] process > pipeline:process_bams:gener... [100%] 1 of 1 ✔
[4f/795c75] process > pipeline:process_bams:assig... [100%] 109 of 109 ✔
[f2/962e2d] process > pipeline:process_bams:cat_t... [100%] 1 of 1 ✔
[04/6fa8f2] process > pipeline:process_bams:merge... [100%] 1 of 1 ✔
[63/a17ea5] process > pipeline:process_bams:strin... [100%] 40 of 40 ✔
[9f/21f024] process > pipeline:process_bams:align... [100%] 40 of 40 ✔
[91/b27634] process > pipeline:process_bams:assig... [100%] 26 of 26 ✔
[1b/243a46] process > pipeline:process_bams:creat... [100%] 26 of 26 ✔
[8a/904fb2] process > pipeline:process_bams:proce... [ 50%] 1 of 2, failed: 1
[e9/ebb75e] process > pipeline:process_bams:merge... [100%] 1 of 1 ✔
[a8/38092e] process > pipeline:process_bams:combi... [100%] 1 of 1 ✔
[c3/b38426] process > pipeline:process_bams:tag_b... [100%] 1 of 1 ✔
[ea/565b39] process > pipeline:process_bams:umi_g... [100%] 1 of 1 ✔
[f2/8d9d71] process > pipeline:process_bams:pack_... [100%] 1 of 1 ✔
[-        ] process > pipeline:prepare_report_data   -
[-        ] process > pipeline:makeReport            -

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

No response

@nrhorner
Contributor

Hi @Josephinedh

The BAM from the first run will have been stripped of adapter sequences and so the vast majority of reads will be discarded.
Have you considered merging the count matrices from each run?
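
As a rough illustration of that suggestion, the sketch below sums two count matrices with pandas. It assumes each run's matrix has already been loaded as a genes-by-barcodes `DataFrame` (loading is not shown) and that identical barcodes in the two runs refer to the same cell, which seems plausible only because both libraries come from the same cDNA. The helper name `merge_count_matrices` is hypothetical and not part of the workflow.

```python
import pandas as pd


def merge_count_matrices(run_a, run_b):
    """Sum two genes x barcodes count matrices.

    Genes or barcodes present in only one run are kept; missing
    entries are treated as zero counts.
    """
    # Align on the union of genes (rows) and barcodes (columns).
    merged = run_a.add(run_b, fill_value=0)
    # Positions absent from both inputs are still NaN after add().
    return merged.fillna(0).astype(int)


run_a = pd.DataFrame({"AAAC": [1, 0], "AAAG": [2, 3]},
                     index=["ACTB", "GAPDH"])
run_b = pd.DataFrame({"AAAC": [4], "AAAT": [5]}, index=["ACTB"])
print(merge_count_matrices(run_a, run_b))
```

Note that summing matrices that were already filtered per run does not recover cells that were dropped in one run only, so this works best on unfiltered counts.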

@Josephinedh
Author

Hi @nrhorner

Thanks for your reply, that makes sense. I wanted to avoid merging the count matrices, as that would disrupt the barcode calling, especially since only the filtered count matrices are output.

For now, I ran the pipeline with both FASTQs as input, which works.
It would still be very nice if the pipeline could output the raw count matrices, so that these could be merged directly and ambient RNA could be removed.
