Changes

From Genome Analysis Wiki
Jump to: navigation, search

Minimac: 1000 Genomes Imputation Cookbook

No change in size, 11:35, 26 July 2013
Your Own Data
Within each file, markers should be stored by chromosome position. Alleles should be stored in the forward strand and can be encoded as 'A', 'C', 'G' or 'T' (there is no need to use numeric identifiers for each allele).
The latest reference panel generated by the 1000 Genomes project uses NCBI Build 37 (HG 19). Make sure that your data is on Build 37 (or Minimac may ignore genotyped markers whose names have changed in build Build 37). If you are trying to convert your data from an earlier genome build to build Build 37, you'll probably find the [ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/database/organism_data/RsMergeArch.bcp.gz dbSNP merge table] ([http://www.ncbi.nlm.nih.gov/SNP/snp_db_table_description.cgi?t=RsMergeArch table description on the NCBI website]), which logs rs# changes between dbSNP builds, and the UCSC online [http://genome.ucsc.edu/cgi-bin/hgLiftOver liftOver tool], which converts genome positions between different genome builds, to be quite useful. We have also documented ([[LiftOver | Link]]) a general procedure to convert genome positions and rs number between builds.
If you are planning to use imputation with the MetaboChip, you might find a list of SNPs whose order varies between NCBI genome build Build 36 and 37 convenient. Here it is: [http://www.sph.umich.edu/csg/cfuchsb/metab_order_changed.txt List of Metabochip SNPs Whose Order Changes With Build]
Note: For the most recent reference panel in VCF format by default GWAS SNPs are expected to be in the chr:pos format e.g. 1:1000; otherwise, for GWAS SNPs in the rs format you have to set the --rs flag
369
edits

Navigation menu