SemiBin doesn't complete binning. Seems to be stuck #150

Open
HitMonk opened this issue Jan 24, 2024 · 4 comments

@HitMonk
HitMonk commented Jan 24, 2024

Hello,
I have been trying to run SemiBin2 on an assembled set of contigs. However, after the training model is generated, no further progress seems to be made. I left the program running for 2 days and still do not see any new files being created, and the processors do not seem to be doing anything. I also ran it again with the generated model, and the same thing happens. The one thing I can think of that might be the issue is memory: as the program starts the binning step, memory consumption increases until all the memory is used up, and then the run appears to be stuck. Could this be the problem? Can someone give me a hint about what I could look for or change to fix it?
Command for the run:
SemiBin2 single_easy_bin --input-fasta Step1_Assembly/bowtie_index/final.contigs.fa --input-bam Step2_binning/inputs_Semibin/contig.mapped.sorted.bam --output Step2_binning/SemiBin_output

Log for the SemiBin run:

[2024-01-22 09:22:59,172] INFO: Setting number of CPUs to 40
[2024-01-22 09:22:59,174] INFO: Binning for short_read
[2024-01-22 09:22:59,175] INFO: SemiBin will run in self supervised mode
[2024-01-22 09:25:47,437] INFO: Binning for short_read
[2024-01-22 09:25:47,439] INFO: SemiBin will run in self supervised mode
[2024-01-22 09:26:05,682] INFO: Did not detect GPU, using CPU.
[2024-01-22 09:26:56,651] INFO: Generating training data...
[2024-01-22 09:27:03,431] INFO: Calculating coverage for every sample.
[2024-01-22 10:27:06,763] INFO: Processed: Step2_binning/inputs_Semibin/contig.mapped.sorted.bam
[2024-01-22 10:36:54,882] INFO: Start training from one sample.
[2024-01-22 10:37:13,932] INFO: Training model...
[2024-01-22 10:37:18,480] INFO: Generate training data of 0:
[2024-01-22 11:33:14,668] INFO: Training finished.
[2024-01-22 11:33:15,915] INFO: Start binning.

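
To see where the time went before the run stalled, the timestamps in the log can be turned into per-stage durations. The following is a minimal sketch that only assumes the `[YYYY-MM-DD HH:MM:SS,mmm] LEVEL: message` layout visible above; the function names are made up for illustration and are not part of SemiBin.

```python
from datetime import datetime

# Excerpt of the log lines quoted in this issue.
LOG = """\
[2024-01-22 09:27:03,431] INFO: Calculating coverage for every sample.
[2024-01-22 10:27:06,763] INFO: Processed: Step2_binning/inputs_Semibin/contig.mapped.sorted.bam
[2024-01-22 10:36:54,882] INFO: Start training from one sample.
[2024-01-22 10:37:13,932] INFO: Training model...
[2024-01-22 10:37:18,480] INFO: Generate training data of 0:
[2024-01-22 11:33:14,668] INFO: Training finished.
[2024-01-22 11:33:15,915] INFO: Start binning.
"""


def parse_line(line):
    """Split '[timestamp] LEVEL: message' into (datetime, 'LEVEL: message')."""
    stamp, _, rest = line.partition("] ")
    when = datetime.strptime(stamp.lstrip("["), "%Y-%m-%d %H:%M:%S,%f")
    return when, rest


def stage_durations(log_text):
    """Return (message, seconds until the next log line) for every line but the last."""
    entries = [parse_line(l) for l in log_text.splitlines() if l.startswith("[")]
    return [
        (message, (next_when - when).total_seconds())
        for (when, message), (next_when, _) in zip(entries, entries[1:])
    ]


for message, seconds in stage_durations(LOG):
    print(f"{seconds / 60:7.1f} min  {message}")
```

The last line ("Start binning.") has no successor, so it gets no duration; comparing its timestamp with the current time shows how long the binning step has been running without output.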

Output files generated:

contig.mapped.sorted.bam_0_data_cov.csv
data.csv
data_split.csv
model.h5
SemiBinRun.log

[Image: processes not using the CPUs while memory is maxed out.]

@psj1997
Collaborator

psj1997 commented Jan 30, 2024

Hi,

Can you check how many contigs were used in binning (the size of data.csv)? Thanks!
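
One way to get that number without loading the whole file into memory is to stream over data.csv and count its rows. This is only a sketch: it assumes the file has a single header line followed by one row per contig, which has not been confirmed in this thread, and `count_contigs` is a hypothetical helper name.

```python
import csv


def count_contigs(csv_path):
    """Count data rows in a CSV file, assuming the first line is a header."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the assumed header line
        return sum(1 for _ in reader)


# Example call; the path is an assumption based on the --output directory above:
# print(count_contigs("Step2_binning/SemiBin_output/data.csv"))
```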

@HitMonk
Author

HitMonk commented Jan 31, 2024

Hello, it seems the size is 375 MB.


I have a total of 6,792,371 assembled contigs that I'm trying to bin.
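
With this many contigs, it may help to know how many of them are long and how many are short fragments. The sketch below streams a FASTA file (plain or gzipped) and counts contigs at or above a few length cutoffs. The cutoffs and function names are illustrative assumptions only; they are not SemiBin defaults or part of its API.

```python
import gzip


def contig_lengths(fasta_path):
    """Yield the sequence length of each record in a FASTA file (plain or .gz)."""
    opener = gzip.open if str(fasta_path).endswith(".gz") else open
    length = None  # None until the first header line is seen
    with opener(fasta_path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                if length is not None:
                    yield length
                length = 0
            elif length is not None:
                length += len(line.strip())
    if length is not None:
        yield length


def count_at_least(fasta_path, thresholds=(1000, 2500, 5000)):
    """Return (total contigs, {threshold: contigs with length >= threshold})."""
    counts = {t: 0 for t in thresholds}
    total = 0
    for n in contig_lengths(fasta_path):
        total += 1
        for t in thresholds:
            if n >= t:
                counts[t] += 1
    return total, counts


# Example call, using the assembly path from the command above:
# print(count_at_least("Step1_Assembly/bowtie_index/final.contigs.fa"))
```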

@Tsingsjeen

Was this solved? I am also waiting for the final binning step, which has already taken 12 hours. MaxBin2 produced roughly 50 bacterial bins for my data, so how many hours should I expect this to take?

@apcamargo

I am also experiencing this issue with single_easy_bin in version 2.1.0. It seems that this only affects large assemblies, as the runs for small assemblies finished without an issue, while the large assemblies could not be processed.

The contigs in our assemblies are ≥1 kb and we mapped the reads with strobealign and sorted them. Everything we are doing is standard, except that we are running SemiBin2 through Apptainer.

apptainer pull semibin.sif docker://quay.io/biocontainers/semibin:2.1.0--pyhdfd78af_0

apptainer exec semibin.sif SemiBin2 single_easy_bin \
    -i binning_assemblies/${SAMPLE}.fna.gz \
    -b mapping_binning/${SAMPLE}/*.bam \
    -o semibin2_output/${SAMPLE}
