A simple script to map SNP locations to known OMIM disease loci.
To map SNP locations to diseases, the user must first download several files from online repositories:
- A chr_rpts file from ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/chr_rpts/
- An OmimVarLocusIdSNP.bcp file from ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/database/organism_data/OmimVarLocusIdSNP.bcp.gz
- A morbidmap file from http://www.omim.org/downloads. Downloading this file requires registration, which is why it cannot be included in the repo.
SNPs are then passed to the snp2Disease function as a two-column array. The first column stores the chromosome number as per dbSNP syntax (1-24), and the second column contains the base-pair position of the SNP on that chromosome.
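As a rough sketch of the input format described above, the snippet below builds a two-column array and checks it before it is handed to snp2Disease. The coordinates are made up for illustration, `validate_snps` is a hypothetical helper that is not part of this repo, and the requirement that positions be positive integers is an assumption.

```python
# Hypothetical SNP coordinates, for illustration only (not real variants).
# Column 1: chromosome number (1-24). Column 2: base-pair position.
snps = [
    [1, 1234567],
    [7, 7654321],
    [23, 1000000],
]


def validate_snps(rows):
    """Hypothetical helper: check that each row is a (chromosome, position) pair.

    Assumes chromosomes are integers in 1-24 and positions are positive
    integers; adjust if the script expects something different.
    """
    for row in rows:
        if len(row) != 2:
            raise ValueError("expected two columns, got %r" % (row,))
        chrom, pos = int(row[0]), int(row[1])
        if not 1 <= chrom <= 24:
            raise ValueError("chromosome out of range 1-24: %r" % (row,))
        if pos < 1:
            raise ValueError("position must be positive: %r" % (row,))
    return True


validate_snps(snps)
```

Checking the array up front makes malformed rows fail with a clear message rather than silently ending up unmatched.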
The function outputs a dictionary with three keys:
- 'matchedSNPs' : A dictionary that maps each chromosome number to a dictionary of nucleotide locations, each of which maps to a disease description
- 'noOmimSNPs' : A dictionary that maps each chromosome number to a list of locations with no associated OMIM diseases
- 'noRsSNPs' : A dictionary that maps each chromosome number to a list of locations with no associated rs identifiers
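To make the nesting of the three keys concrete, here is a minimal sketch of what a result might look like and how it could be traversed. The values are invented, the use of integer keys is an assumption (the script may use strings), and `flatten_matches` is a hypothetical helper rather than part of this repo.

```python
# Hypothetical result with the three documented keys; all values are made up.
result = {
    "matchedSNPs": {1: {1234567: "Hypothetical disease description"}},
    "noOmimSNPs": {7: [7654321]},
    "noRsSNPs": {23: [1000000]},
}


def flatten_matches(result):
    """Hypothetical helper: turn the nested 'matchedSNPs' mapping into
    a sorted list of (chromosome, position, disease) tuples."""
    rows = []
    for chrom, by_position in sorted(result["matchedSNPs"].items()):
        for position, disease in sorted(by_position.items()):
            rows.append((chrom, position, disease))
    return rows


for chrom, position, disease in flatten_matches(result):
    print(chrom, position, disease)

# The unmatched SNPs are plain lists of positions per chromosome.
unmatched_count = sum(len(v) for v in result["noOmimSNPs"].values())
unmatched_count += sum(len(v) for v in result["noRsSNPs"].values())
```

A flat list of tuples is often easier to write out to a table or CSV file than the nested dictionaries.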