Changes

From Genome Analysis Wiki

Minimac: 1000 Genomes Imputation Cookbook

3 bytes added, 11:17, 10 August 2011
Your Own Data
Within each file, markers should be stored in order of chromosome position. Alleles should be reported on the forward strand and can be encoded as 'A', 'C', 'G' or 'T' (there is no need to use numeric identifiers for each allele).
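The two format requirements above can be checked before running imputation. The following is a minimal sketch, not part of Minimac: the function name <code>check_markers</code> and the tuple layout (name, position, allele 1, allele 2) are assumptions made for illustration, and the markers are assumed to have already been read from one chromosome's file.

```python
# Hypothetical helper: flags markers that are out of position order
# or that use allele codes other than A, C, G, T.
VALID_ALLELES = {"A", "C", "G", "T"}

def check_markers(markers):
    """markers: list of (name, position, allele1, allele2) for one chromosome.

    Returns a list of human-readable problem descriptions (empty if none).
    """
    problems = []
    previous_position = None
    for name, position, allele1, allele2 in markers:
        # Alleles should be letter codes, not numeric identifiers.
        for allele in (allele1, allele2):
            if allele not in VALID_ALLELES:
                problems.append(f"{name}: unexpected allele code {allele!r}")
        # Positions should never decrease from one marker to the next.
        if previous_position is not None and position < previous_position:
            problems.append(f"{name}: position {position} is out of order")
        previous_position = position
    return problems
```

Note that a check like this cannot tell whether alleles are on the forward strand; that requires comparison against the reference panel or the array annotation.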
The latest reference panel generated by the 1000 Genomes project uses NCBI Build 37 (hg19). Make sure that your data is on Build 37 (or Minimac may ignore genotyped markers whose names have changed in Build 37). If you are converting your data from an earlier genome build to Build 37, two resources are likely to be useful: the [ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/database/organism_data/RsMergeArch.bcp.gz dbSNP merge table] ([http://www.ncbi.nlm.nih.gov/SNP/snp_db_table_description.cgi?t=RsMergeArch table description on the NCBI website]), which logs rs# changes between dbSNP builds, and the UCSC online [http://genome.ucsc.edu/cgi-bin/hgLiftOver liftOver tool], which converts genome positions between genome builds. We have also documented a general procedure for converting genome positions and rs numbers between builds in [[LiftOver]].
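As an illustration of how a merge table can be used to bring marker names up to date, here is a minimal sketch. It is not the procedure documented in [[LiftOver]]. The function names are hypothetical, and the column positions (old rs number in the first column, current rs number in the seventh, both stored without the "rs" prefix) are assumptions that should be checked against the NCBI table description before use.

```python
# Hypothetical helpers for renaming markers with a merge table.
# ASSUMPTION: each row is a tab-split line of the merge table, with the
# retired rs number in column `old_col` and the current one in `current_col`,
# both stored as bare numbers without the "rs" prefix.

def build_merge_map(rows, old_col=0, current_col=6):
    """Return a dict mapping retired rs names to their replacement names."""
    merge_map = {}
    for fields in rows:
        merge_map["rs" + fields[old_col]] = "rs" + fields[current_col]
    return merge_map

def update_rs_name(name, merge_map):
    """Follow merges until a name with no further replacement is reached."""
    seen = set()
    # The `seen` set guards against an endless loop if the map is circular.
    while name in merge_map and name not in seen:
        seen.add(name)
        name = merge_map[name]
    return name
```

Names that do not appear in the merge table are returned unchanged, so the function can be applied to every marker in a data set.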
If you are planning to use imputation with the Metabochip, you may find it convenient to have a list of SNPs whose order differs between NCBI genome builds 36 and 37: [http://www.sph.umich.edu/csg/cfuchsb/metab_order_changed.txt List of Metabochip SNPs Whose Order Changes With Build]