
Annocript1.1beta

Pre-release
@frankMusacchia frankMusacchia released this 12 Aug 13:01
· 1 commit to master since this release

1.1 Release Candidate

  • Corrected a behavior of Annocript. Previously it was possible to use a different database within the same session, which caused problems. Annocript now detects which database was used earlier and uses it for all subsequent analyses.
  • Fixed a problem with pathway descriptions that sometimes end with a dot. These dots were showing up in the final plots.
  • Added a check of the FASTA file that alerts the user whenever a sequence of length zero is found.
  • Changed the source of the allowed IUPAC characters. The list is now taken from the NCBI website: http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml. Moreover, Annocript now lets the user choose whether to continue when a problem is found.
  • Reduced computation time by adding the option not to produce the GFF output.
  • Added the columns OSNameSP and OSNameUf/OSNameTR. Each result now reports its own organism, while the GO terms are assigned from the best result between SwissProt and TrEMBL/UniRef. In addition, two plots are now generated, one for each of the closest-organisms tables.
  • Added some statistics to the HTML page: the number of results in the SwissProt, UniRef/TrEMBL, CDD and SILVA databases.
  • Modified the pathway columns. When several pathways are assigned to a transcript, it is now clear which of them lacks a second or third level. If a level is not present, a '-' is printed so that each entry in the list of levels can be traced back to its exact pathway. Previously, if two pathways were assigned to the same transcript, the levels were mixed and there was no way to link a level to its pathway (e.g. pwl1_1]---[pwl1_2; - ]---[ pwl2_2 means that the first associated pathway does not contain a 2nd-level description). To keep the statistics accurate, I also modified the method Annocript uses to count pathway occurrences.
  • Modified the columns of GO terms for domains. Previously, all the GO terms assigned to a given domain were separated with ';' and the groups with ']---['. However, this separation could lead to redundant GO terms in the same column. Now, for each transcript, the GO columns contain all the GO terms separated by ']---[', with no distinction among domains and no redundancy. The latter is important for statistics computed from the table.
  • Replaced a system command (cut), used during database generation, with Perl code. This command caused huge memory usage on some systems, which led to the process being killed.
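
The FASTA checks described in the notes above (zero-length sequences and characters outside the allowed IUPAC set) can be illustrated with a small sketch. This is not Annocript's own code, which is written in Perl. The function names `read_fasta` and `check_fasta` are hypothetical, and the alphabet below is assumed to be the standard IUPAC nucleotide codes, which may differ from the exact list Annocript accepts.

```python
# Assumption: standard IUPAC nucleotide codes; Annocript's accepted list may differ.
IUPAC_NUCLEOTIDES = set("ACGTURYKMSWBDHVN")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, even if it had no sequence.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def check_fasta(lines, allowed=IUPAC_NUCLEOTIDES):
    """Return a list of (header, problem) tuples; an empty list means no problem."""
    problems = []
    for header, sequence in read_fasta(lines):
        if not sequence:
            problems.append((header, "zero-length sequence"))
            continue
        invalid = sorted(set(sequence.upper()) - allowed)
        if invalid:
            problems.append((header, "invalid characters: " + "".join(invalid)))
    return problems


if __name__ == "__main__":
    example = [">t1", "ACGTN", ">t2", ">t3", "ACXZ"]
    for header, problem in check_fasta(example):
        print(f"{header}: {problem}")
```

Returning the problems rather than stopping at the first one leaves the decision to continue with the caller, in the spirit of the behavior described above.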
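
To show how the '-' placeholder keeps pathway levels aligned, here is a minimal Python sketch that reads the level columns of one transcript and regroups them per pathway. It assumes, based on the example in the notes, that each level has its own column, that entries within a column are separated by ']---[', and that the n-th entry of every column belongs to the n-th pathway. The function name `align_pathway_levels` is hypothetical and not part of Annocript.

```python
SEPARATOR = "]---["  # separator between pathways inside one column
MISSING = "-"        # placeholder printed when a pathway lacks that level


def split_column(value):
    """Split one level column into its per-pathway entries."""
    return [entry.strip() for entry in value.split(SEPARATOR)]


def align_pathway_levels(level_columns):
    """Regroup level columns into one list of levels per pathway.

    level_columns: list of strings, one per level (level 1 first).
    Returns a list with one entry per pathway; missing levels become None.
    """
    per_level = [split_column(column) for column in level_columns]
    if len({len(entries) for entries in per_level}) > 1:
        raise ValueError("level columns list different numbers of pathways")
    pathways = []
    for levels in zip(*per_level):
        pathways.append([None if level == MISSING else level for level in levels])
    return pathways


if __name__ == "__main__":
    # Level-1 and level-2 columns taken from the example in the notes.
    columns = ["pwl1_1]---[pwl1_2", "- ]---[ pwl2_2"]
    for number, levels in enumerate(align_pathway_levels(columns), start=1):
        print(f"pathway {number}: {levels}")
```

Because every column has the same number of entries, the position of an entry is enough to link a level back to its pathway, which is what the placeholder makes possible.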