Faster create_pseudobulks function. #5

ghuls · 2023-12-13T15:49:46Z

Faster create_pseudobulks function, by assuming that many consecutive positions will have the same value. Also, stretches of zeros are not stored at all.

Based on:
https://github.com/aertslab/single_cell_toolkit/blob/c15038ddf1322fd4957396c16bb7782ad2e6629e/fragments_to_bw.py#L263C1-L372C82
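To illustrate the idea described above, here is a minimal sketch, not the code from this PR or from fragments_to_bw.py. It collapses a per-position coverage list into (start, end, value) runs of equal consecutive values and skips runs of zeros entirely. The function name `nonzero_runs` and the half-open interval convention are assumptions made for this example only.

```python
from itertools import groupby


def nonzero_runs(coverage):
    """Collapse per-position values into (start, end, value) runs.

    Hypothetical helper for illustration: intervals are half-open
    [start, end), and runs whose value is zero are not stored.
    """
    runs = []
    position = 0
    # groupby yields one group per stretch of equal consecutive values.
    for value, group in groupby(coverage):
        length = sum(1 for _ in group)
        if value != 0:
            runs.append((position, position + length, value))
        position += length
    return runs


coverage = [0, 0, 3, 3, 3, 0, 1, 1, 0]
print(nonzero_runs(coverage))  # [(2, 5, 3), (6, 8, 1)]
```

In this toy input, nine positions reduce to two stored runs. The longer the stretches of identical values, the fewer entries need to be written, which is where a speedup of this kind would be expected to come from.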


jmschrei · 2023-12-13T21:15:41Z

Thanks. I'll take a look soon. How much faster is it? Can you add in the documentation a reference to the code in single_cell_toolkit?

ghuls · 2023-12-14T09:44:49Z

It depends on how many ranges of the same value you have in a row.
For fragments_to_bw it was at least 10 times faster, if I remember correctly.

Do you have a pseudobulk bw file I can test with (as your distribution of values might be different from what I tested with in fragments_to_bw)?

jmschrei · 2023-12-15T16:45:28Z

Thanks for looking into it. Any of the fragments files under scATAC_clusters.zip should be good for testing: https://zenodo.org/records/8313962
