Changes

From Genome Analysis Wiki

Minimac

14 bytes removed, 03:51, 12 October 2010
Getting Started
Beta-version available upon request (cfuchsb@umich.edu or goncalo@umich.edu)
== Getting Started ==
Using minimac for genotype imputation involves two steps. First, you will have to estimate haplotypes for your entire sample -- this will be the more computationally demanding step. Once that is done, you will be ready to quickly impute missing genotypes using the reference panel of your choice.
=== Estimating Haplotypes for Your Sample ===
For the haplotyping step, we currently recommend using [[MaCH]] with the --phase command line option. As input, [[MaCH]] will need [[Merlin]] format pedigree and data files. All markers should be ordered according to their physical position, and alleles should be labeled on the forward strand.
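The marker-ordering requirement above can be checked before phasing. The following is only a minimal sketch in Python, not part of MaCH or minimac: the function name <code>out_of_order_markers</code> and the example marker names and positions are hypothetical, and it assumes you already have each marker's physical position available in file order.

```python
def out_of_order_markers(markers):
    """Return names of markers whose position is lower than the preceding marker's.

    `markers` is a list of (name, position) tuples, listed in the same
    order as the markers appear in the data file.
    """
    problems = []
    previous_position = None
    for name, position in markers:
        # A marker is flagged when it sits before its predecessor on the chromosome.
        if previous_position is not None and position < previous_position:
            problems.append(name)
        previous_position = position
    return problems


# Hypothetical markers: rs3 is listed after rs2 but has a smaller position.
example = [("rs1", 1000), ("rs2", 2500), ("rs3", 1800), ("rs4", 3000)]
print(out_of_order_markers(example))
```

An empty list means the markers are already in non-decreasing physical order; any name returned marks a spot where the file should be re-sorted before running the haplotyping step.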
==== Your Own Data ====
To get started, you will need to store your data in [[Merlin]] format pedigree and data files, one per chromosome. For details of the Merlin file format, see the [http://www.sph.umich.edu/csg/abecasis/Merlin/tour/input_files.html Merlin Tutorial].
If figuring out position and strand for each marker seems like hard work, don't despair. For you, this should be the hardest bit of the entire process! For the computer, the fun is about to start.
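Before moving on, it may help to confirm that the pedigree and data files agree with each other. The sketch below is an illustration under stated assumptions, not a tool distributed with Merlin or MaCH: it assumes a data file that lists only markers (lines of the form <code>M name</code>) and a pedigree file with five leading columns (family, individual, father, mother, sex) followed by two whitespace-separated allele columns per marker. Merlin also accepts other layouts, such as genotypes written as <code>A/G</code> in a single column, which this check does not handle. The helper names and example lines are hypothetical.

```python
def count_markers(dat_lines):
    """Count marker entries (lines starting with 'M') in a Merlin data file."""
    count = 0
    for line in dat_lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "M":
            count += 1
    return count


def rows_with_wrong_width(ped_lines, n_markers):
    """Return 1-based line numbers of pedigree rows with an unexpected column count.

    Assumes 5 leading columns plus two allele columns per marker.
    """
    expected = 5 + 2 * n_markers
    bad = []
    for line_number, line in enumerate(ped_lines, start=1):
        fields = line.split()
        # Blank lines are skipped; any other row must match the expected width.
        if fields and len(fields) != expected:
            bad.append(line_number)
    return bad


# Hypothetical file contents: the second pedigree row is missing one allele.
dat_lines = ["M rs1", "M rs2"]
ped_lines = [
    "FAM1 IND1 0 0 1 A G C C",
    "FAM2 IND2 0 0 2 A A C",
]
n_markers = count_markers(dat_lines)
print(n_markers, rows_with_wrong_width(ped_lines, n_markers))
```

In practice the two lists would be read from the per-chromosome files, for example with <code>open(path).read().splitlines()</code>.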
==== Running MaCH ====
A typical MaCH command line to estimate phased haplotypes might look like this:
|}
=== Imputation into Phased Haplotypes ===
Imputing genotypes using '''minimac''' is a straightforward process: after selecting a set of reference haplotypes, plugging in the target haplotypes from the previous step, and setting the number of rounds to use for model parameter estimation, imputation should proceed rapidly.
==== Running Minimac ====
A typical minimac command line might look like this:
|}
==== Reference Haplotypes ====
Reference haplotypes generated by the 1000 Genomes Project and formatted so that they are ready for analysis are available from the [http://www.sph.umich.edu/csg/abecasis/MACH/download/1000G-2010-06.html MaCH download page]. The most recent set of haplotypes was generated in June 2010 by combining genotype calls generated at the Broad, Sanger and the University of Michigan. In our hands, this June 2010 release is substantially better than previous 1000 Genomes Project genotype call sets.
