Changes

From Genome Analysis Wiki
14 bytes removed, 03:51, 12 October 2010
Line 24: Line 24:  
Beta-version available upon request (cfuchsb@umich.edu or goncalo@umich.edu)
   −
== Getting Started ==
+
= Getting Started =
    
Using minimac for genotype imputation involves two steps. First, you will have to estimate haplotypes for your entire sample; this is the more computationally demanding step. Once that is done, you will be ready to quickly impute missing genotypes using the reference panel of your choice.
   −
=== Estimating Haplotypes for Your Sample ===
+
== Estimating Haplotypes for Your Sample ==
    
For the haplotyping step, we currently recommend using [[MaCH]] with the --phase command line option. As input, [[MaCH]] needs [[Merlin]] format pedigree and data files. All markers should be ordered according to their physical position, and alleles should be labeled on the forward strand.
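The ordering requirement above can be checked before phasing. The following is a minimal sketch, not part of MaCH or Merlin: the function name <code>find_out_of_order</code> and the marker names and positions are hypothetical, and it assumes you already have each marker's base-pair position in memory.

```python
from typing import List, Tuple


def find_out_of_order(markers: List[Tuple[str, int]]) -> List[str]:
    """Return names of markers whose position is lower than the preceding marker's."""
    out_of_order = []
    # Compare each marker with the one immediately before it.
    for (_, prev_pos), (name, pos) in zip(markers, markers[1:]):
        if pos < prev_pos:
            out_of_order.append(name)
    return out_of_order


# Hypothetical marker names and base-pair positions for one chromosome.
markers = [("rs1", 1000), ("rs2", 2500), ("rs3", 1800), ("rs4", 4000)]
print(find_out_of_order(markers))  # prints ['rs3']
```

An empty result suggests the markers are already in physical order; any names returned point to places where the input files may need re-sorting.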
   −
==== Your Own Data ====
+
=== Your Own Data ===
    
To get started, you will need to store your data in [[Merlin]] format pedigree and data files, one per chromosome. For details of the Merlin file format, see the [http://www.sph.umich.edu/csg/abecasis/Merlin/tour/input_files.html Merlin Tutorial].
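As a rough illustration of the "one per chromosome" layout, the sketch below groups markers by chromosome and builds the marker lines for each data file. It assumes the Merlin convention that a data file lists one item per line and that markers are written as <code>M</code> followed by the marker name; the function name and the example markers are hypothetical, and writing the matching pedigree files is left out.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def dat_lines_by_chromosome(
    markers: List[Tuple[str, str, int]]
) -> Dict[str, List[str]]:
    """Group (name, chromosome, position) markers by chromosome.

    Within each chromosome, markers are sorted by position and turned
    into 'M <name>' lines, one per marker.
    """
    grouped = defaultdict(list)
    for name, chrom, pos in markers:
        grouped[chrom].append((pos, name))
    return {
        chrom: [f"M {name}" for _, name in sorted(entries)]
        for chrom, entries in grouped.items()
    }


# Hypothetical markers: (name, chromosome, base-pair position).
example = [("rs10", "1", 500), ("rs11", "2", 300), ("rs12", "1", 200)]
for chrom, lines in sorted(dat_lines_by_chromosome(example).items()):
    print(chrom, lines)
```

Each list could then be written to its own data file, with the genotype columns of the corresponding pedigree file arranged in the same marker order.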
Line 42: Line 42:  
If figuring out position and strand for each marker seems like hard work, don't despair. For you, this should be the hardest bit of the entire process! For the computer, the fun is about to start.
   −
==== Running MaCH ====
+
=== Running MaCH ===
    
A typical MaCH command line to estimate phased haplotypes might look like this:
Line 77: Line 77:  
|}
 
|}
   −
=== Imputation into Phased Haplotypes ===
+
== Imputation into Phased Haplotypes ==
    
Imputing genotypes using '''minimac''' is a straightforward process: after selecting a set of reference haplotypes, plugging in the target haplotypes from the previous step, and setting the number of rounds to use for model parameter estimation, imputation should proceed rapidly.
   −
==== Running Minimac ====
+
=== Running Minimac ===
    
A typical minimac command line might look like this:
Line 116: Line 116:  
|}
 
|}
   −
==== Reference Haplotypes ====
+
=== Reference Haplotypes ===
    
Reference haplotypes generated by the 1000 Genomes Project and formatted so that they are ready for analysis are available from the [http://www.sph.umich.edu/csg/abecasis/MACH/download/1000G-2010-06.html MaCH download page]. The most recent set of haplotypes was generated in June 2010 by combining genotype calls generated at the Broad, Sanger and the University of Michigan. In our hands, this June 2010 release is substantially better than previous 1000 Genomes Project genotype call sets.
