Feature/prokka in annotation #171

Merged
merged 16 commits into from
Jun 7, 2022
nkleinbo

Prodigal is replaced by Prokka in the annotation module. Prokka is no longer used in magAttributes, since bins are now annotated in the annotation module as well. The main workflow wPipeline is changed accordingly.

The workflow wCMSeqWorkflowFile in the magAttributes module is also changed to use Prokka, but there seems to be a problem with the CMSeq image. See Issue #170.

Issue #164 has been resolved in this branch as well.

The Prokka version in use is a custom image based on an older Prokka release with a fix to support partial genes. We should port these changes to the current Prokka version, include that in our custom image, and then evaluate whether to replace Prokka with Bakta, which supports partial genes without further modifications to the source code.

nkleinbo requested a review from bosterholz on April 29, 2022, 09:45

bosterholz left a comment

Great work, thanks!
The Prokka mode transfer seems to be a little bit complicated.
Aside from that everything looks fine.

@@ -282,7 +231,7 @@ process pKEGGFromDiamond {
pattern: "{**.tsv}"

// UID mapping does not work for some reason. Every time a database directory is created while running docker,
- // the permissions are set to root. This leads to crashes later on.
+ // the permissions are set to root. This leasds to crashes later on.

New typo: "leasds" should be "leads".

* - "meta" for smaller contigs / metagenomes
* - "param" sets prodigalMode to whatever is defined as params.steps.annotation.prokka.prodigalMode
* Other input is a tuple consisting of sample, binID, path to fasta file and the domain (for gene prediction)
* Set locusTag? A unique tag that we can replace

Should we open an issue for this?
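
The mode handling described in the doc comment above could be sketched in shell roughly as follows. This is only an illustration under stated assumptions: `resolve_prodigal_mode` is a hypothetical helper that does not exist in the pipeline, and `single` is just an assumed example value for `params.steps.annotation.prokka.prodigalMode`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper, not taken from the pipeline: map the mode passed to the
# process onto the value that is finally handed to the gene prediction step.
#   $1: requested mode ("meta", "param", or any other value)
#   $2: value configured as params.steps.annotation.prokka.prodigalMode
resolve_prodigal_mode() {
    local requested="$1"
    local from_params="$2"
    case "$requested" in
        param) printf '%s\n' "$from_params" ;;  # defer to the configured value
        *)     printf '%s\n' "$requested" ;;    # "meta" and other values pass through unchanged
    esac
}

resolve_prodigal_mode meta single    # prints: meta
resolve_prodigal_mode param single   # prints: single
```

Keeping the decision in one place like this might make the mode transfer easier to follow, since only the `param` case needs to look at the configuration.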

time '5h'

publishDir params.output, saveAs: { filename -> getOutput("${sample}",params.runid ,"prokka", filename) }, \
pattern: "{**.gff.gz,**.fna.gz,**.faa.gz,**.sqn.gz,**.txt,**.tsv,**.fsa.gz,**.ffn.gz,**.gbk.gz,**.tbl.gz}"

Is this pattern necessary? You need all outputs, and you are forgetting the command files.

for f in out/* ; do suffix=$(echo "${f##*.}"); mv $f ${BIN_PREFIX}.${suffix}; done
sed -i -e "2,$ s/^/!{sample}\t${BIN_ID}\t/" -e "1,1 s/^/SAMPLE\tBIN_ID\t/g" *.tsv
mv *.tsv !{sample}_prokka_${BIN_ID}.tsv
gzip --best *gff *.faa *.fna *.ffn *.fsa *.gbk *.sqn *tbl

Should we use pigz because it ships with the toolkit?
There could also be a little speed boost if files get bigger.
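
A minimal sketch of what that could look like, assuming pigz is on the PATH inside the container. `compress_outputs` and the `THREADS` variable are hypothetical names used only for illustration, and the fallback to gzip is an added safety net, not something the pipeline does today.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: compress the given files with pigz when it is
# installed, otherwise fall back to plain gzip.
compress_outputs() {
    if command -v pigz >/dev/null 2>&1; then
        # -p sets the number of compression threads; THREADS is an assumed
        # variable that defaults to 1 when unset.
        pigz --best -p "${THREADS:-1}" "$@"
    else
        gzip --best "$@"
    fi
}
```

In the process script this would replace the `gzip --best` line, for example `compress_outputs *gff *.faa *.fna *.ffn *.fsa *.gbk *.sqn *tbl`. Both tools write standard `.gz` files, so downstream steps should not need to change.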

bosterholz merged commit 90ae6af into dev on Jun 7, 2022
pbelmann pushed a commit that referenced this pull request Jun 29, 2022
* feat(annotation): replaced prodigal with prokka in annotation module

* feat(annotation): removed prokka, moved to annotation module

* feat(annotation): prokka moved from magAttributes to annotation and replaces prodigal

* feat(annotation): prokka moved from magAttributes to annotation and replaces prodigal

* feat(annotation): removed duplicate prokka container and set to own container

* feat(annotation): prokka moved from magAttributes to annotation module and replaces prodigal, annotation module now gets gtdb results for prokka call

* feat(annotation): replaced prokka 1.12 with prokka 1.14.6

* feat(annotation): added tsv output which is available again in prokka 1.14.6

* feat(annotation): Re-added documentation for _wAnnotation 

... that somehow got lost during a merge.

* feat(annotation): Removed test output

Leftover from testing.

* fix(annotation): Removed containerOptions in pProkka

Not needed anymore since prokka version now includes the partial genes option.

* fix(annotation): correct indention

* fix(annotation): _wAnnotation missed parameter in wAnnotateFile

* fix(annotation): input.fasta should be removed after prokka execution

* fix(annotation): typo + review changes

Co-authored-by: bosterholz <[email protected]>