New release plan for Sunbeam 3.0 #307

kylebittinger · 2022-02-15T16:41:11Z

We have a new software developer for the PennCHOP Microbiome Program, Charlie Bushman (@Ulthran), which presents a new opportunity for us to devote some serious attention to Sunbeam. I'd like to renew our push for a release of Sunbeam 3.0.

Current/past lead developers @louiejtaylor @eclarke @ressy @zhaoc1 -- if you have thoughts on features, fixes, and changes that should be included in this release, please let us know so we can have this stuff on our radar. If you have no opinion, that's cool too.

I'd also like to flag members of the CHOP Microbiome Center, @ctanes @scottdaniel @vitu1 @WeimingWHu so they can contribute their thoughts.

Hope all of you are doing well. Feel free to reach out by email if needed.

zhaoc1 · 2022-02-15T17:18:08Z

That's good news! Good luck with Sunbeam 3.0!

ressy · 2022-02-17T21:19:20Z

Glad to hear it! No particular opinions from me-- just here reiterating Chunyu's message. It'll be good for the package to get some TLC. I keep hoping I could hop in and help clear out some issues but just never get the chance.

zhaoc1 · 2022-02-17T21:51:53Z

I recommend GT-Pro, one of the recent ultra-fast metagenotyping tools, published in Nature Biotechnology (github repo).

I think it serves the goal of Sunbeam as a metagenomic sequencing pipeline very well, and it would expand the capacity of Sunbeam to strain-level analysis.

louiejtaylor · 2022-02-23T16:54:51Z

Thanks for the ping and sorry for the slow response! I remember back before I left we had a list of things we wanted to do to wrap up version 3.0. Fortunately, we decided to make (or base these on) issues in the repo, so hopefully they should all be clear about what needs to be done. These may be of debatable importance half a year later, but here are the ones that are still outstanding:

High-priority (feature-related) issues:

- QC summary doesn't include information on komplexity-filtered reads #252
- flagstats in mapping #273
- Any way to remove temporary files for all_decontam? #275
- Making a release for v3.0 and transitioning that to the stable branch. The existing releases can be a model for this.

A decent summary of the already-completed differences between 2.1.0 (stable) and 3.0 (dev) can be found in the changelog. There's a ton of good stuff already done, like automated extension installing and config updating, new tool versions, and more flexibility for the user in configuring the pipeline. For previous releases, we also made sure the automated tests passed and ran through the other outstanding issues as well. I echo Jesse's sentiment--I'd love to jump in and help but don't have the bandwidth.

There were a few other potential improvements we were thinking about that might be nice for future versions but aren't necessarily required for 3.0:

- Updating to Snakemake 5.8 #263: this would be nice so we can take advantage of improvements in Snakemake, but Snakemake's change to accept an arbitrary number of config files instead of just one wreaked some havoc with our argument parsing, if I remember correctly.
- Making Sunbeam installable via conda would be really nice, but doesn't seem trivial!

Hope this is helpful--best of luck!

ressy · 2022-02-23T19:57:03Z

Thanks a lot for pulling together that summary, Louis.

I came across this paper today, "Sustained software development, not number of citations or journal choice, is indicative of accurate bioinformatic software."

In addition we suggest that further efforts be made to encourage continual updates to software tools. To paraphrase some of the suggestions of Siepel (2019), these efforts may include more secure positions for developers, institutional promotion criteria include software maintenance, lower publication barriers for significant software updates, encourage further funding for software maintenance and improvement—not just new tools [55]. If these issues were recognised by research managers, funders and reviewers, then perhaps the future bioinformatic software tool landscape will be much improved.

I'll second that!

levlitichev · 2022-04-29T22:09:00Z

Exciting to hear that a new release of Sunbeam is planned! I've been working a lot with Sunbeam lately (thank you!) and have a few suggestions:

1. I find that there is a lot of overhead (e.g. searching for extensions, checking file paths) that makes Sunbeam pretty slow. For me, Sunbeam takes ~30 seconds to reach "Building DAG of jobs...", which makes quick troubleshooting difficult. It would be nice to speed this up, but I unfortunately don't have any concrete suggestions because I don't know which steps are slow.
2. I think cutadapt shouldn't, by default, throw out reads that have had adapters removed (see issue "Why remove trimmed reads?" #288).
3. I use Sunbeam in conjunction with an LSF Snakemake profile on HPC. I had to modify the rule for Kraken in order to give that one job a lot more memory than the other jobs. A per-rule memory setting could be a nice parameter to add to the config file.
4. Regarding "Any way to remove temporary files for all_decontam?" #275 above, I actually WANT to keep the host reads. I am mapping host reads to genotyped mice in order to ensure there were no sample mix-ups. It's not ready for primetime, but I've made this functionality into a Sunbeam extension. In brief, it would be nice to have an option to keep the host read BAM file.
5. I don't use the assembly, annotation, or mapping modules, so it would be nice not to have those as core parts of Sunbeam. I saw that there are a few issues already about separating out other components of Sunbeam, like taxonomic classification. I think modularization is generally the right idea.
6. The extension I use all the time is sbx_gene_clusters, for functional classification of my taxonomic reads directly. I just made a PR with some small suggestions, but I think building out this extension and making it more prominent throughout the documentation could be worthwhile.

Happy to chat more about this, and apologies if I'm jumping into this conversation without the right context. For points 2, 3, and 4 above, it'd be easy to add my local changes as a PR.
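
To make the per-rule memory suggestion concrete, here is a minimal sketch of how a config-driven lookup might work. Everything named here is an assumption for illustration: the `resources`/`mem_mb` config layout, the rule names, and the `mem_mb_for` helper are hypothetical and are not Sunbeam's actual config schema or API.

```python
# Hypothetical sketch: look up a per-rule memory request from a config dict,
# falling back to a shared default. The config layout and rule names are
# assumptions for illustration, not Sunbeam's real schema.

DEFAULT_MEM_MB = 8000  # assumed default for rules without an override


def mem_mb_for(rule_name, config, default=DEFAULT_MEM_MB):
    """Return the memory (in MB) requested for a rule, or the default."""
    overrides = config.get("resources", {}).get("mem_mb", {})
    return int(overrides.get(rule_name, default))


# Example config, as it might look after being loaded from YAML.
config = {"resources": {"mem_mb": {"kraken": 64000}}}

print(mem_mb_for("kraken", config))    # override applies to this rule
print(mem_mb_for("cutadapt", config))  # no override, so the default is used
```

In a Snakefile, a value like this could be passed to a rule's `resources:` directive (for example as `mem_mb`), which a cluster profile can then read when submitting that job.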

Thanks again for a nice tool, and I'm excited to hear that it will be getting some TLC!

Ulthran · 2022-05-03T15:24:54Z

Hi @levlitichev, thanks for the feedback! User input on what to fix and what to add is super useful. At the moment I'm working mostly on upgrading dependencies, separating each functional unit (eventually to get to the point where your point 5 is possible), and adding a few features that have been asked for a lot. I'd love to talk more with you about your suggestions in the near future though.

Thanks,
Charlie

Ulthran · 2023-08-16T17:14:00Z

Hi again @levlitichev, I think a lot of what you mentioned has now been integrated into Sunbeam as of the latest v4.0.0 release. I'm going to close this issue, but if there are any parts of it that you want to reopen, or if you have new suggestions, please open a new issue. I would love to hear your thoughts on where Sunbeam is now and where it should go.

Thanks,
Charlie

levlitichev · 2023-08-17T13:41:06Z

Awesome, thanks for the update! I'll update to the current release when I next have to use Sunbeam, and I'll let you know how it goes. Thanks again for your efforts!

Ulthran self-assigned this Dec 13, 2022

Ulthran closed this as completed Aug 16, 2023
