A question about running GetOrganelle #236
Mingyu0110 asked this question in Q&A
I used GetOrganelle to assemble a chloroplast genome. I ran the following command:
get_organelle_from_reads.py -1 ../data/V1_1.fq.gz -2 ../data/V1_2.fq.gz -o V1 -R 30 -k 21,45,65,85,105,127 -t 20 -F embplant_pt
For some samples, the run did not produce a complete chloroplast assembly.
![image](https://user-images.githubusercontent.com/125346494/218671752-591dced1-e415-4019-ac58-93d070e9b698.png)
Could you please tell me how to adjust the parameters to obtain a complete chloroplast assembly? The log file is as follows:
GetOrganelle v1.7.7.0
get_organelle_from_reads.py assembles organelle genomes from genome skimming data.
Find updates in https://github.com/Kinggerm/GetOrganelle and see README.md for more information.
Python 3.9.9 | packaged by conda-forge | (main, Dec 20 2021, 02:41:03) [GCC 9.4.0]
PLATFORM: Linux lzu-MZ72-HB0-00 5.4.0-132-generic #148~18.04.1-Ubuntu SMP Mon Oct 24 20:41:14 UTC 2022 x86_64 x86_64
PYTHON LIBS: GetOrganelleLib 1.7.7.0; numpy 1.22.3; sympy 1.11.1; scipy 1.9.3
DEPENDENCIES: Bowtie2 2.4.5; SPAdes 3.15.5; Blast 2.11.0
GETORG_PATH=/home/limy/.GetOrganelle
SEED DB: embplant_pt 0.0.1; embplant_mt 0.0.1
LABEL DB: embplant_pt 0.0.1; embplant_mt 0.0.1
WORKING DIR: /mnt/sdb/limy/workspace/1.variant_calling/chloroplast3
//mnt/sdb/limy/bin/miniconda3/bin/get_organelle_from_reads.py -1 ../data/V1_1.fq.gz -2 ../data/V1_2.fq.gz -o V1 -R 30 -k 21,45,65,85,105,127 -t 20 -F embplant_pt
2023-02-14 14:27:03,272 - INFO: Pre-reading fastq ...
2023-02-14 14:27:03,272 - INFO: Estimating reads to use ... (to use all reads, set '--reduce-reads-for-coverage inf --max-reads inf')
2023-02-14 14:27:03,345 - INFO: Tasting 100000+100000 reads ...
2023-02-14 14:27:13,691 - INFO: Tasting 500000+500000 reads ...
2023-02-14 14:27:30,695 - INFO: Estimating reads to use finished.
2023-02-14 14:27:30,695 - INFO: Unzipping reads file: ../data/V1_1.fq.gz (4843326106 bytes)
2023-02-14 14:27:59,958 - INFO: Unzipping reads file: ../data/V1_2.fq.gz (5175943520 bytes)
2023-02-14 14:28:31,297 - INFO: Counting read qualities ...
2023-02-14 14:28:31,447 - INFO: Identified quality encoding format = Sanger
2023-02-14 14:28:31,448 - INFO: Phred offset = 33
2023-02-14 14:28:31,450 - INFO: Trimming bases with qualities (0.00%): 33..33 !
2023-02-14 14:28:31,499 - INFO: Mean error rate = 0.0026
2023-02-14 14:28:31,501 - INFO: Counting read lengths ...
2023-02-14 14:29:16,774 - INFO: Mean = 150.0 bp, maximum = 150 bp.
2023-02-14 14:29:16,774 - INFO: Reads used = 15000000+15000000
2023-02-14 14:29:16,774 - INFO: Pre-reading fastq finished.
2023-02-14 14:29:16,774 - INFO: Making seed reads ...
2023-02-14 14:29:16,775 - INFO: Seed bowtie2 index existed!
2023-02-14 14:29:16,775 - INFO: Mapping reads to seed bowtie2 index ...
2023-02-14 14:34:57,359 - INFO: Mapping finished.
2023-02-14 14:34:57,361 - INFO: Seed reads made: V1/seed/embplant_pt.initial.fq (65705782 bytes)
2023-02-14 14:34:57,361 - INFO: Making seed reads finished.
2023-02-14 14:34:57,361 - INFO: Checking seed reads and parameters ...
2023-02-14 14:34:57,361 - INFO: The automatically-estimated parameter(s) do not ensure the best choice(s).
2023-02-14 14:34:57,361 - INFO: If the result graph is not a circular organelle genome,
2023-02-14 14:34:57,361 - INFO: you could adjust the value(s) of '-w'/'-R' for another new run.
2023-02-14 14:35:06,259 - INFO: Pre-assembling mapped reads ...
2023-02-14 14:35:16,171 - INFO: Pre-assembling mapped reads finished.
2023-02-14 14:35:16,172 - INFO: Estimated embplant_pt-hitting base-coverage = 290.50
2023-02-14 14:35:16,442 - INFO: Estimated word size(s): 112
2023-02-14 14:35:16,443 - INFO: Setting '-w 112'
2023-02-14 14:35:16,443 - INFO: Setting '--max-extending-len inf'
2023-02-14 14:35:16,892 - INFO: Checking seed reads and parameters finished.
2023-02-14 14:35:16,893 - INFO: Making read index ...
2023-02-14 14:38:28,854 - INFO: 26054075 candidates in all 30000000 reads
2023-02-14 14:38:28,854 - INFO: Pre-grouping reads ...
2023-02-14 14:38:28,854 - INFO: Setting '--pre-w 112'
2023-02-14 14:38:31,625 - INFO: 200000/3396434 used/duplicated
2023-02-14 14:38:46,121 - INFO: 4275 groups made.
2023-02-14 14:38:49,639 - INFO: Making read index finished.
2023-02-14 14:38:49,639 - INFO: Extending ...
2023-02-14 14:38:49,639 - INFO: Adding initial words ...
2023-02-14 14:38:54,844 - INFO: AW 1600880
2023-02-14 14:41:13,063 - INFO: Round 1: 26054075/26054075 AI 206553 AW 1970418
2023-02-14 14:43:28,111 - INFO: Round 2: 26054075/26054075 AI 215489 AW 2019166
2023-02-14 14:45:43,049 - INFO: Round 3: 26054075/26054075 AI 216536 AW 2029506
2023-02-14 14:47:56,812 - INFO: Round 4: 26054075/26054075 AI 216561 AW 2030180
2023-02-14 14:50:10,902 - INFO: Round 5: 26054075/26054075 AI 216581 AW 2030482
2023-02-14 14:52:25,998 - INFO: Round 6: 26054075/26054075 AI 216581 AW 2030482
2023-02-14 14:52:25,998 - INFO: No more reads found and terminated ...
2023-02-14 14:52:58,777 - INFO: Extending finished.
2023-02-14 14:52:59,681 - INFO: Separating extended fastq file ...
2023-02-14 14:53:00,424 - INFO: Setting '-k 21,45,65,85,105,127'
2023-02-14 14:53:00,424 - INFO: Assembling using SPAdes ...
2023-02-14 14:53:00,476 - INFO: spades.py -t 20 --phred-offset 33 -1 V1/extended_1_paired.fq -2 V1/extended_2_paired.fq --s1 V1/extended_1_unpaired.fq --s2 V1/extended_2_unpaired.fq -k 21,45,65,85,105,127 -o V1/extended_spades
2023-02-14 14:53:51,072 - INFO: Insert size = 355.059, deviation = 73.8438, left quantile = 264, right quantile = 450
2023-02-14 14:53:51,073 - INFO: Assembling finished.
2023-02-14 14:54:00,154 - INFO: Slimming V1/extended_spades/K127/assembly_graph.fastg finished!
2023-02-14 14:54:00,154 - INFO: Slimming assembly graphs finished.
2023-02-14 14:54:00,154 - INFO: Extracting embplant_pt from the assemblies ...
2023-02-14 14:54:00,155 - INFO: Disentangling V1/extended_spades/K127/assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg as a circular genome ...
2023-02-14 14:54:00,503 - INFO: Disentangling failed: 'Incomplete/Complicated graph: please check around EDGE_333!'
2023-02-14 14:54:00,503 - INFO: Scaffolding disconnected contigs using SPAdes scaffolds ...
2023-02-14 14:54:00,503 - WARNING: Assembly based on scaffolding may not be as accurate as the ones directly exported from the assembly graph.
2023-02-14 14:54:00,504 - INFO: Disentangling V1/extended_spades/K127/assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg as a circular genome ...
2023-02-14 14:54:00,526 - WARNING: -200-bp gap/overlap between 369 and 351 indicated while conflicting connections existed!
2023-02-14 14:54:00,527 - WARNING: -100-bp gap/overlap between 361 and 4566 indicated while conflicting connections existed!
2023-02-14 14:54:00,783 - INFO: Disentangling failed: 'Incomplete/Complicated graph: please check around EDGE_333!'
2023-02-14 14:54:00,783 - INFO: Disentangling V1/extended_spades/K127/assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg as a/an embplant_pt-insufficient graph ...
2023-02-14 14:54:01,655 - INFO: Vertex_263 #copy = 2
2023-02-14 14:54:01,656 - INFO: Vertex_297 #copy = 1
2023-02-14 14:54:01,656 - INFO: Vertex_327 #copy = 1
2023-02-14 14:54:01,656 - INFO: Vertex_333 #copy = 1
2023-02-14 14:54:01,656 - INFO: Vertex_61 #copy = 1
2023-02-14 14:54:01,656 - INFO: Vertex_291 #copy = 1
2023-02-14 14:54:01,656 - INFO: Vertex_321_4442_253 #copy = 1
2023-02-14 14:54:01,656 - INFO: Vertex_4556 #copy = 1
2023-02-14 14:54:01,656 - INFO: Vertex_4664 #copy = 2
2023-02-14 14:54:01,656 - INFO: Vertex_315 #copy = 1
2023-02-14 14:54:01,656 - INFO: Vertex_367 #copy = 1
2023-02-14 14:54:01,656 - INFO: Vertex_4228 #copy = 1
2023-02-14 14:54:01,656 - INFO: Vertex_4780 #copy = 2
2023-02-14 14:54:01,656 - INFO: Vertex_5022 #copy = 1
2023-02-14 14:54:01,656 - INFO: Vertex_5026 #copy = 2
2023-02-14 14:54:01,656 - INFO: Vertex_7 #copy = 1
2023-02-14 14:54:01,656 - INFO: Average embplant_pt kmer-coverage = 56.3
2023-02-14 14:54:01,656 - INFO: Average embplant_pt base-coverage = 351.7
2023-02-14 14:54:01,656 - INFO: Writing output ...
2023-02-14 14:54:01,712 - WARNING: More than one structure (gene order) produced ...
2023-02-14 14:54:01,713 - WARNING: Please check the final result to confirm whether they are simply different in SSC direction (two flip-flop configurations)!
2023-02-14 14:54:01,714 - INFO: Writing PATH1 of embplant_pt scaffold(s) to V1/embplant_pt.K127.scaffolds.graph1.1.path_sequence.fasta
2023-02-14 14:54:01,716 - INFO: Writing PATH2 of embplant_pt scaffold(s) to V1/embplant_pt.K127.scaffolds.graph1.2.path_sequence.fasta
2023-02-14 14:54:01,716 - INFO: Writing GRAPH to V1/embplant_pt.K127.contigs.graph1.selected_graph.gfa
2023-02-14 14:54:01,716 - INFO: Result status of embplant_pt: 4 scaffold(s)
2023-02-14 14:54:01,872 - INFO: Writing output finished.
2023-02-14 14:54:01,872 - INFO: Please ...
2023-02-14 14:54:01,873 - INFO: load the graph file 'assembly_graph.fastg.extend-embplant_pt-embplant_mt.fastg' in K127
2023-02-14 14:54:01,873 - INFO: load the CSV file 'assembly_graph.fastg.extend-embplant_pt-embplant_mt.csv' in K127
2023-02-14 14:54:01,873 - INFO: visualize and confirm the incomplete result in Bandage.
2023-02-14 14:54:01,873 - INFO: If the result is nearly complete,
2023-02-14 14:54:01,873 - INFO: you can also adjust the arguments according to https://github.com/Kinggerm/GetOrganelle/wiki/FAQ#what-should-i-do-with-incomplete-resultbroken-assembly-graph
2023-02-14 14:54:01,873 - INFO: If you have questions for us, please provide us with the get_org.log.txt file and the post-slimming graph in the format you like!
2023-02-14 14:54:01,873 - INFO: Extracting embplant_pt from the assemblies finished.
Total cost 1623.70 s
Thank you!
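
When several samples need triaging, the key diagnostics in a log like the one above can be pulled out with a short script. The following is only a minimal sketch: the function name `summarize_log` and the regular expressions are assumptions written against the message wording in this particular log (GetOrganelle v1.7.7.0), and other versions may phrase these lines differently.

```python
import re


def summarize_log(text):
    """Collect a few diagnostics from the text of a GetOrganelle log.

    The patterns mirror the wording of the log shown above; they are
    assumptions, not part of any GetOrganelle API.
    """
    summary = {
        "word_size": None,      # value from the "Setting '-w N'" line
        "base_coverage": None,  # "Average <type> base-coverage = X"
        "failures": [],         # distinct "Disentangling failed" messages
        "result_status": None,  # e.g. "4 scaffold(s)"
    }
    for line in text.splitlines():
        m = re.search(r"Setting '-w (\d+)'", line)
        if m:
            summary["word_size"] = int(m.group(1))
        m = re.search(r"Average \S+ base-coverage = ([\d.]+)", line)
        if m:
            summary["base_coverage"] = float(m.group(1))
        m = re.search(r"Disentangling failed: '(.+)'", line)
        if m and m.group(1) not in summary["failures"]:
            summary["failures"].append(m.group(1))
        m = re.search(r"Result status of \S+: (.+)", line)
        if m:
            summary["result_status"] = m.group(1).strip()
    return summary


# A few lines copied from the log above, used as a self-contained demo.
SAMPLE = """\
2023-02-14 14:35:16,443 - INFO: Setting '-w 112'
2023-02-14 14:38:28,854 - INFO: Setting '--pre-w 112'
2023-02-14 14:54:00,503 - INFO: Disentangling failed: 'Incomplete/Complicated graph: please check around EDGE_333!'
2023-02-14 14:54:00,783 - INFO: Disentangling failed: 'Incomplete/Complicated graph: please check around EDGE_333!'
2023-02-14 14:54:01,656 - INFO: Average embplant_pt base-coverage = 351.7
2023-02-14 14:54:01,716 - INFO: Result status of embplant_pt: 4 scaffold(s)
"""

if __name__ == "__main__":
    for key, value in summarize_log(SAMPLE).items():
        print(f"{key}: {value}")
```

To run it on a real log, pass the contents of the `get_org.log.txt` file that the log mentions; its location inside the output directory (for example `V1/`) is an assumption here. Samples whose `failures` list is non-empty are the ones to inspect in Bandage, and, as the log itself notes, `-w` and `-R` can be adjusted for a new run.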