Get-Common-Genes-in-Multi-Species

This Python script finds the genes of one species that are common to multiple other species. It was tested on the output of the "blastp" program.

Here our species is referred to as "main_species.fa" and the other five species are referred to as species1.fa, species2.fa, species3.fa, species4.fa and species5.fa.

STEPS

Run "blastp" with your species as the query against each of the other five species. You can follow the steps described in "https://github.com/sgr308/blast_reciprocal" to get the results, or you can perform the following simple blastp run for each of the five species.

blastp -subject species1.fa -query main_species.fa -outfmt 6 -out blastresults_1.txt -num_threads 15 -max_target_seqs 1
blastp -subject species2.fa -query main_species.fa -outfmt 6 -out blastresults_2.txt -num_threads 15 -max_target_seqs 1
blastp -subject species3.fa -query main_species.fa -outfmt 6 -out blastresults_3.txt -num_threads 15 -max_target_seqs 1
blastp -subject species4.fa -query main_species.fa -outfmt 6 -out blastresults_4.txt -num_threads 15 -max_target_seqs 1
blastp -subject species5.fa -query main_species.fa -outfmt 6 -out blastresults_5.txt -num_threads 15 -max_target_seqs 1
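
The five commands above differ only in the species number, so they can also be generated in a loop. The following is a minimal sketch, assuming blastp is on your PATH and the FASTA files are in the working directory; the helper names blastp_command and run_all are hypothetical and are not part of this repository.

```python
import subprocess


def blastp_command(n, query="main_species.fa", threads=15):
    """Return the blastp argument list for species number n (1-5).

    The flags are copied from the commands in this README.
    """
    return [
        "blastp",
        "-subject", f"species{n}.fa",
        "-query", query,
        "-outfmt", "6",
        "-out", f"blastresults_{n}.txt",
        "-num_threads", str(threads),
        "-max_target_seqs", "1",
    ]


def run_all(species_count=5):
    """Run blastp once per species and stop at the first failure."""
    for n in range(1, species_count + 1):
        subprocess.run(blastp_command(n), check=True)
```

Calling run_all() would produce blastresults_1.txt to blastresults_5.txt, the same files as the manual commands.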

Get the gene ids of our species (the query ids in column 1) from each blastp result file.

awk '{print $1}' blastresults_1.txt > sp1.txt
awk '{print $1}' blastresults_2.txt > sp2.txt
awk '{print $1}' blastresults_3.txt > sp3.txt
awk '{print $1}' blastresults_4.txt > sp4.txt
awk '{print $1}' blastresults_5.txt > sp5.txt
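
If awk is not available, the same extraction can be done in Python. This is a minimal sketch of an equivalent; the function name extract_query_ids is hypothetical and is not part of this repository.

```python
def extract_query_ids(blast_path, out_path):
    """Write column 1 (the query id) of a tabular blastp file to out_path.

    Blank lines are skipped here, whereas awk '{print $1}' would print
    an empty line for them.
    """
    ids = []
    with open(blast_path) as handle:
        for line in handle:
            fields = line.split()
            if fields:
                ids.append(fields[0])
    with open(out_path, "w") as out:
        out.writelines(f"{gene_id}\n" for gene_id in ids)
    return ids
```

For example, extract_query_ids("blastresults_1.txt", "sp1.txt") would write the same ids as the first awk command.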

Merge all files.

cat sp1.txt sp2.txt sp3.txt sp4.txt sp5.txt > all_id.txt

Remove duplicates. The same gene id can appear several times, once per result file and once per reported alignment.

awk '!T[$1]++' all_id.txt > Gene_id.txt
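
The merge and duplicate-removal steps can also be combined in Python. The sketch below keeps the first occurrence of each id in its original order, which is what awk '!T[$1]++' does; the function name merge_unique_ids is hypothetical.

```python
def merge_unique_ids(id_files, out_path):
    """Concatenate id files and keep only the first occurrence of each id."""
    seen = set()
    unique = []
    for path in id_files:
        with open(path) as handle:
            for line in handle:
                fields = line.split()
                if not fields:
                    continue  # ignore blank lines
                gene_id = fields[0]
                if gene_id not in seen:
                    seen.add(gene_id)
                    unique.append(gene_id)
    with open(out_path, "w") as out:
        out.writelines(f"{gene_id}\n" for gene_id in unique)
    return unique
```

For example, merge_unique_ids(["sp1.txt", "sp2.txt", "sp3.txt", "sp4.txt", "sp5.txt"], "Gene_id.txt") would replace the cat and awk steps.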

Edit the python script "common_genes_in_multi_species.py" and enter the blastp output filenames for all five species, i.e. set "blastresults_1.txt" on line 8 and do the same for the other four filenames. Also enter "Gene_id.txt" on line 63. Save the script after making all changes.
Run "common_genes_in_multi_species.py" to get "result_common_genes_in_multi_species.txt" as the final output. It contains the gene ids of our species that have hits in all five other species.
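
The script itself is not reproduced in this README, so the following is only a sketch of the underlying idea and not the actual code of "common_genes_in_multi_species.py". It assumes tabular blastp files with the query id in column 1; the function names query_ids and write_common_genes are hypothetical.

```python
def query_ids(blast_path):
    """Return the set of query ids (column 1) found in one blastp file."""
    with open(blast_path) as handle:
        return {line.split()[0] for line in handle if line.strip()}


def write_common_genes(gene_id_file, blast_files, out_path):
    """Keep the ids from gene_id_file that have a hit in every blastp file."""
    hit_sets = [query_ids(path) for path in blast_files]
    with open(gene_id_file) as handle:
        candidates = [line.strip() for line in handle if line.strip()]
    # A gene is common only if it appears in the hits of all species.
    common = [g for g in candidates if all(g in hits for hits in hit_sets)]
    with open(out_path, "w") as out:
        out.writelines(f"{g}\n" for g in common)
    return common
```

The output keeps the order of "Gene_id.txt", so the result is reproducible between runs.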
