From Genome Analysis Wiki
LiftOver has three use cases:
(1) [[#Lift genome positions | Convert genome positions from one genome assembly to another]]
In most scenarios, we have known genome positions in NCBI build 36 (UCSC hg18) and want to lift them over to NCBI build 37 (UCSC hg19).
(2) [[#Lift dbSNP rs numbers | Convert dbSNP rs numbers from one build to another]]
This type of data is commonly seen in Merlin/PLINK format.
With the above in mind, we can combine these two tables to obtain the mapping between older rs numbers and newer rs numbers.
We have developed a script (for internal use), named [http://genome.sph.umich.edu/wiki/LiftRsNumber.py liftRsNumber.py], for lifting rs numbers between builds.
This script requires RsMergeArch.bcp.gz and SNPHistory.bcp.gz, which can be found in [[#Resources | Resources]].
3115860 unchanged 12124819 lifted 2229002 lifted 1130683
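The idea of combining the two tables can be sketched in Python as follows. This is only an illustrative sketch, not the liftRsNumber.py script itself. It assumes that the first two tab-separated columns of RsMergeArch are the retired rs number and the rs number it was merged into, and that the first column of SNPHistory is an rs number that has been withdrawn; please check both assumptions against the schemas listed in [[#Resources | Resources]]. The function names and the "deleted" status are hypothetical.

```python
def build_merge_map(merge_rows):
    """Map each retired rs number to the rs number it was merged into.

    Assumption: column 1 is the retired rs number, column 2 is its target.
    """
    merge_map = {}
    for row in merge_rows:
        if not row.strip():
            continue  # skip blank lines
        fields = row.rstrip("\n").split("\t")
        merge_map[int(fields[0])] = int(fields[1])
    return merge_map


def build_deleted_set(history_rows):
    """Collect withdrawn rs numbers (assumption: column 1 is the rs number)."""
    return {int(row.split("\t")[0]) for row in history_rows if row.strip()}


def lift_rs(rs, merge_map, deleted):
    """Return (status, new_rs); status is 'unchanged', 'lifted' or 'deleted'."""
    current = rs
    seen = set()
    # Follow merge chains (a -> b -> c) and stop if a cycle is detected.
    while current in merge_map and current not in seen:
        seen.add(current)
        current = merge_map[current]
    if current in deleted:
        return ("deleted", None)
    if current == rs:
        return ("unchanged", rs)
    return ("lifted", current)


if __name__ == "__main__":
    # Made-up rs numbers, for illustration only.
    merge_map = build_merge_map(["30\t20\t130\n", "20\t10\t131\n"])
    deleted = build_deleted_set(["99\t2009-01-01\n"])
    for rs in (10, 30, 99):
        print(rs, *lift_rs(rs, merge_map, deleted))
```

Following the chain matters because an rs number may be merged more than once across dbSNP builds, so a single lookup could return an rs number that is itself retired.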
Similar to the human reference build, dbSNP also has different versions. Depending on your needs, you may want to convert rs numbers from the old dbSNP version to the new dbSNP version. These steps are described in [[#Lift dbSNP rs numbers | Lift dbSNP rs numbers]].
==== Method 2 ====
(2) Look up SNP positions from rs numbers
dbSNP provides a file, [[#Resources | b132_SNPChrPosOnRef_37_1.bcp.gz]], which contains each rs number together with its chromosome and position.
Use this file along with the new rs numbers obtained in the first step.
In practice, some rs numbers do not exist in build 132, or are not suitable for use: for example, they do not reside on the human reference, or they map to multiple locations. These cases are marked in the chromosome column by values such as "AltOnly", "Multi", "NotOn", "PAR" and "Un". We can drop them in the liftover procedure.
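The lookup and filtering described above can be sketched as follows. This is a minimal sketch that assumes the first three tab-separated columns of the SNPChrPosOnRef file are the rs number, the chromosome and the position; please verify this against the schema in [[#Resources | Resources]]. The function names are hypothetical.

```python
# Chromosome labels that mark SNPs we drop, as listed in the text above.
EXCLUDED_CHROMS = {"AltOnly", "Multi", "NotOn", "PAR", "Un"}


def load_positions(lines):
    """Build {rs_number: (chromosome, position)} from SNPChrPosOnRef rows.

    Assumption: columns 1-3 are rs number, chromosome, position.
    Rows with an excluded chromosome label or an empty position are skipped.
    """
    positions = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        snp_id, chrom, pos = fields[0], fields[1], fields[2]
        if chrom in EXCLUDED_CHROMS or not pos:
            continue
        positions[int(snp_id)] = (chrom, int(pos))
    return positions


def lookup(rs_numbers, positions):
    """Split rs numbers into those with a usable position and those dropped."""
    found, dropped = {}, []
    for rs in rs_numbers:
        if rs in positions:
            found[rs] = positions[rs]
        else:
            dropped.append(rs)
    return found, dropped


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    rows = ["10\t1\t12345\t0\n", "20\tMulti\t\t0\n", "30\tUn\t500\t0\n"]
    found, dropped = lookup([10, 20, 40], load_positions(rows))
    print(found, dropped)
```

To read the real file, the rows could be streamed with gzip.open(path, "rt") from the Python standard library instead of the in-memory list used here.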
==== Method 3 ====
The NCBI dbSNP team has provided a [[#Resources | provisional map]] for converting the genome positions of a large set of dbSNP entries from NCBI build 36 to NCBI build 37.
In the second step, we obtained unlifted genome positions, so we can use this table to try to convert those unlifted dbSNP entries.
After this step, some SNPs still cannot be lifted, mostly because they are located on non-reference chromosomes.
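The fallback step can be sketched as follows. This sketch assumes the provisional map has already been parsed into a dictionary keyed by build 36 (chromosome, position) pairs with build 37 pairs as values. The column layout of the Remap file is not described here, so the parsing is deliberately left out, and the function name is hypothetical.

```python
def apply_provisional_map(unlifted, remap):
    """Try to lift remaining SNPs using an already-parsed provisional map.

    unlifted: {rs_number: (chrom, pos)} in build 36 coordinates.
    remap:    {(chrom, pos) in build 36: (chrom, pos) in build 37}
              (assumed to be parsed from the provisional map beforehand).
    Returns (lifted, still_unlifted).
    """
    lifted, still_unlifted = {}, []
    for rs, old_pos in unlifted.items():
        new_pos = remap.get(old_pos)
        if new_pos is None:
            still_unlifted.append(rs)  # e.g. SNPs on non-reference chromosomes
        else:
            lifted[rs] = new_pos
    return lifted, still_unlifted


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    unlifted = {10: ("1", 1000), 20: ("2", 2000)}
    remap = {("1", 1000): ("1", 1500)}
    print(apply_provisional_map(unlifted, remap))
```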
== Resources ==
* liftRsNumber.py [[liftRsNumber.py]] and its internal location: /net/
* liftMap.py [[liftMap.py]]
* NCBI provisional map [ftp://ftp.ncbi.nih.gov:/snp/organisms/human_9606/misc/exchange/Remap_36_3_37_1.txt.gz file] and [ftp://ftp.ncbi.nih.gov:/snp/organisms/human_9606/misc/exchange/Remap_36_3_37_1.info info]
* NCBI RsMergeArch [ftp://ftp.ncbi.nih.gov:/snp/organisms/human_9606/database/organism_data/RsMergeArch.bcp.gz file] and [http://www.ncbi.nlm.nih.gov/SNP/snp_db_table_description.cgi?t=RsMergeArch schema]
* NCBI SNPHistory [ftp://ftp.ncbi.nih.gov:/snp/organisms/human_9606/database/organism_data/SNPHistory.bcp.gz file] and [http://www.ncbi.nlm.nih.gov/SNP/snp_db_table_description.cgi?t=SNPHistory schema]
* NCBI SNPChrPosOnRef build 132 [ftp://ftp.ncbi.nih.gov:/snp/organisms/human_9606/database/b132_archive/organism_data/b132_SNPChrPosOnRef_37_1.bcp.gz file] and [http://www.ncbi.nlm.nih.gov/SNP/snp_db_table_description.cgi?t=SNPChrPosOnRef schema]
* How UCSC dbSNP differs from NCBI dbSNP [http://genomewiki.ucsc.edu/index.php/DbSNP_Track_Notes UCSC dbSNP track note]
* The dbSNP mapping process [http://www.ncbi.nlm.nih.gov/books/NBK44455/ link]
* NCBI dbSNP release 132 [ftp://ftp.ncbi.nih.gov:/snp/organisms/human_9606/VCF/v4.0/ByChromosomeNoGeno/00-All.vcf.gz 00-All.vcf.gz]
* UCSC dbSNP release 132 [http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/snp132.txt.gz snp132.txt.gz]
== Acknowledge ==
Please contact [mailto:email@example.com Xiaowei Zhan].