I am running SemiBin2 single_easy_bin, and the runs finish prematurely without creating output_bins. There is no error message. The inputs are large metagenome assemblies built from PacBio HiFi long reads. Some runs finished successfully, but most of them did not create output_bins. Even though I set -t 30, it mostly runs on a single thread. I did not see any memory issue on our server.
This is the command line I used:
$ nohup singularity exec https://depot.galaxyproject.org/singularity/semibin:2.1.0--pyhdfd78af_0 SemiBin2 single_easy_bin -i /home/sung.shin/pool_2_metaMDBG/contigs.fasta -b pool2metaMDBG_aln.sort.bam -o pool2_metaMDBG_sb2_binning/ --self-supervised --sequencing-type=long_reads -t 30 --random-seed 123 2>semibin.log &
Log messages I got:
$ cat semibin.log
nohup: ignoring input and appending output to 'nohup.out'
2025-01-03 11:22:08 arsnecla0ap2.marc.usda.gov SemiBin[1485536] INFO Binning for long_read
2025-01-03 11:22:12 arsnecla0ap2.marc.usda.gov SemiBin[1485536] INFO Did not detect GPU, using CPU.
2025-01-03 11:22:38 arsnecla0ap2.marc.usda.gov SemiBin[1485536] INFO Generating training data...
2025-01-03 15:17:52 arsnecla0ap2.marc.usda.gov SemiBin[1485536] INFO Calculating coverage for every sample.
2025-01-03 16:44:37 arsnecla0ap2.marc.usda.gov SemiBin[1485536] INFO Processed: pool2metaMDBG_aln.sort.bam
2025-01-03 16:52:53 arsnecla0ap2.marc.usda.gov SemiBin[1485536] INFO Start training from a single sample.
2025-01-03 16:53:42 arsnecla0ap2.marc.usda.gov SemiBin[1485536] INFO Training model...
100%|██████████| 15/15 [53:05<00:00, 212.34s/it]
2025-01-03 17:46:48 arsnecla0ap2.marc.usda.gov SemiBin[1485536] INFO Training finished.
2025-01-03 17:46:49 arsnecla0ap2.marc.usda.gov SemiBin[1485536] INFO Start binning.
2025-01-03 17:47:59 arsnecla0ap2.marc.usda.gov SemiBin[1485536] INFO Running naive ORF finder
[sung.shin@arsnecla0ap2 pool2_dastool]$ cd pool2_metaMDBG_sb2_binning/
$ cat SemiBinRun.log
[2025-01-03 11:22:08,400] INFO: Binning for long_read
[2025-01-03 11:22:12,494] INFO: Did not detect GPU, using CPU.
[2025-01-03 11:22:38,866] INFO: Generating training data...
[2025-01-03 15:17:52,924] INFO: Calculating coverage for every sample.
[2025-01-03 16:44:37,329] INFO: Processed: pool2metaMDBG_aln.sort.bam
[2025-01-03 16:52:53,075] INFO: Start training from a single sample.
[2025-01-03 16:53:42,830] INFO: Training model...
[2025-01-03 17:46:48,649] INFO: Training finished.
[2025-01-03 17:46:49,343] INFO: Start binning.
[2025-01-03 17:47:59,090] INFO: Running naive ORF finder
Resulting files:
$ ls
data.csv data_split.csv markers.hmmout model.h5 pool2metaMDBG_aln.sort.bam_0_data_cov.csv SemiBinRun.log
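To make the stall point easier to see, the last stage reached and the time spent in each stage can be read off SemiBinRun.log with a small script. This is only a minimal sketch, assuming the bracketed timestamp format shown in the log above; the function names `parse_log` and `stage_durations` are hypothetical helpers, not part of SemiBin2.

```python
import re
from datetime import datetime

# Matches lines such as:
# [2025-01-03 17:46:49,343] INFO: Start binning.
# (format assumed from the SemiBinRun.log excerpt above)
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] (\w+): (.*)$"
)


def parse_log(text):
    """Return a list of (timestamp, level, message) for each matching line."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            entries.append((stamp, match.group(2), match.group(3)))
    return entries


def stage_durations(entries):
    """Pair each message with the seconds elapsed until the next message."""
    return [
        (current[2], (following[0] - current[0]).total_seconds())
        for current, following in zip(entries, entries[1:])
    ]


if __name__ == "__main__":
    import sys

    # Usage: python log_summary.py pool2_metaMDBG_sb2_binning/SemiBinRun.log
    with open(sys.argv[1]) as handle:
        log_entries = parse_log(handle.read())
    for message, seconds in stage_durations(log_entries):
        print(f"{seconds:10.1f} s  {message}")
    if log_entries:
        print("last stage reached:", log_entries[-1][2])
```

In the log above, the last message is "Running naive ORF finder", so the run stopped somewhere after that stage started.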
Could you give me some advice?
Can this issue be fixed by running the binning step separately when a run has finished without output_bins?
I found "Advanced single-sample binning workflows" on https://semibin.readthedocs.io/en/latest/usage/.
What is the difference between bin_short and bin_long?
$ SemiBin2 bin_short -i S1.fa --environment human_gut --data S1_output/data.csv -o S1_output
$ SemiBin2 bin_long -i S1.fa --environment human_gut --data S1_output/data.csv -o S1_output