Changes

From Genome Analysis Wiki
463 bytes added ,  12:34, 25 January 2017
Line 1: Line 1: −
Before reading this tutorial, you might find it useful to spend a few minutes reading through the main [[Minimac]] documentation.  
+
Before reading this tutorial, you might find it useful to spend a few minutes reading through the main [[Minimac]] and [[Minimac2]] documentation.  
    
== Getting Started ==
 
== Getting Started ==
   −
Download [http://www.sph.umich.edu/csg/abecasis/MaCH/download/ MaCH] and [http://genome.sph.umich.edu/wiki/Minimac#Download Minimac]. Furthermore, example data used in this tutorial can be found [http://www.sph.umich.edu/csg/cfuchsb/minimac_example.tgz here]
+
Download [http://csg.sph.umich.edu/abecasis/MaCH/download/ MaCH] and [http://genome.sph.umich.edu/wiki/Minimac#Download Minimac] or [http://genome.sph.umich.edu/wiki/Minimac2#Download Minimac2]. Furthermore, example data used in this tutorial can be found [http://csg.sph.umich.edu/cfuchsb/minimac2_example.tgz here].
   −
== Minimac Imputation ==
+
== Minimac and Minimac2 Imputation ==
   −
[[Minimac]] relies on a two step approach. First, the samples that are to be analyzed must be phased into a series of estimated haplotypes. Second, imputation is carried out directly into these phased haplotypes. As newer reference panels become available, only the second step must be repeated.
+
[[Minimac]] and [[Minimac2]] rely on a two-step approach. First, the samples that are to be analyzed must be phased into a series of estimated haplotypes. Second, imputation is carried out directly into these phased haplotypes. As newer reference panels become available, only the second step must be repeated.
    
=== Pre-phasing - MaCH ===
 
=== Pre-phasing - MaCH ===
Line 15: Line 15:  
  ./mach1 -d sample.dat -p sample.ped --rounds 20 --states 50 --phase --interim 5 --sample 5  --prefix sample.pp | tee mach.log
 
  ./mach1 -d sample.dat -p sample.ped --rounds 20 --states 50 --phase --interim 5 --sample 5  --prefix sample.pp | tee mach.log
   −
This will request that MaCH estimate haplotypes for your sample, using 20 iterations of its Markov sampler and conditioning each update on up to 50 haplotypes. A summary description of these parameters follows (but for a more complete description, you should go to the [http://www.sph.umich.edu/csg/abecasis/MaCH/ MaCH website]):
+
This will request that MaCH estimate haplotypes for your sample, using 20 iterations of its Markov sampler and conditioning each update on up to 50 haplotypes. A summary description of these parameters follows (but for a more complete description, you should go to the [http://csg.sph.umich.edu/abecasis/MaCH/ MaCH website]):
    
{| class="wikitable" border="1" cellpadding="2"
 
{| class="wikitable" border="1" cellpadding="2"
Line 23: Line 23:  
|-  
 
|-  
 
|style=white-space:nowrap|<code>-d sample.dat</code>
 
|style=white-space:nowrap|<code>-d sample.dat</code>
| Data file in [http://www.sph.umich.edu/csg/abecasis/Merlin/tour/input_files.html Merlin format]. Markers should be listed according to their order along the chromosome.
+
| Data file in [http://csg.sph.umich.edu/abecasis/Merlin/tour/input_files.html Merlin format]. Markers should be listed according to their order along the chromosome.
 
|-  
 
|-  
 
| <code>-p sample.ped</code>
 
| <code>-p sample.ped</code>
| Pedigree file in [http://www.sph.umich.edu/csg/abecasis/Merlin/tour/input_files.html Merlin format]. Alleles should be labeled on the forward strand.
+
| Pedigree file in [http://csg.sph.umich.edu/abecasis/Merlin/tour/input_files.html Merlin format]. Alleles should be labeled on the forward strand.
 
|-
 
|-
 
| <code>--states 200</code>
 
| <code>--states 200</code>
Line 47: Line 47:  
|}
 
|}
   −
=== Imputation into Phased Haplotypes - minimac ===
+
=== Imputation into Phased Haplotypes - minimac(2) ===
   −
Imputing genotypes using '''minimac''' is a straightforward process: after selecting a set of reference haplotypes, plugging-in the target haplotypes from the previous step and setting the number of rounds to use for estimating model parameters (which describe the length and conservation of haplotype stretches shared between the reference panel and your study samples), imputation should proceed rapidly. Because marker names can change between dbSNP versions, it is usually a good idea to include ''aliases'' file that provides mappings between earlier marker names and the current preferred name for each polymorphism.
+
Imputing genotypes using '''minimac(2)''' is a straightforward process: after selecting a set of reference haplotypes, plugging in the target haplotypes from the previous step, and setting the number of rounds to use for estimating model parameters (which describe the length and conservation of haplotype stretches shared between the reference panel and your study samples), imputation should proceed rapidly.
   −
A typical minimac command line, where the string $chr should be replaced with an appropriate chromosome number, might look like this:
+
Minimac needs a file listing the variants in your sample. If your directory already includes a "sample.snps" file, no worries. If it doesn't, you can generate one using "sample.dat" as input with the following command:
 +
 
 +
  cut -f 2 -d " " sample.dat > sample.snps
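The <code>cut</code> command above assumes that every line of "sample.dat" is a marker line and that fields are separated by single spaces. As a hedged variation, the sketch below keeps only rows whose first field is the Merlin marker code <code>M</code> and tolerates tabs or repeated spaces. The function name <code>make_snps_list</code> is hypothetical and not part of MaCH or minimac.

```shell
# Hypothetical helper (not part of MaCH or minimac): write the marker names
# from a Merlin-format data file to a list file, one name per line.
# Assumption: marker rows carry the type code "M" in the first field and the
# marker name in the second; rows with other type codes are skipped.
make_snps_list() {
    dat_file=$1
    snps_file=$2
    # awk splits on any run of spaces or tabs by default
    awk '$1 == "M" { print $2 }' "$dat_file" > "$snps_file"
}

# Example call, using the file names from this tutorial:
#   make_snps_list sample.dat sample.snps
```

If "sample.dat" contains only marker lines separated by single spaces, this should produce the same list as the <code>cut</code> command.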
 +
 
 +
The minimac command line would look like this:
    
  ./minimac --refHaps hapmap.hap --refSnps hapmap.snps --haps sample.pp.gz --snps sample.snps --prefix sample.imp | tee minimac.log
 
  ./minimac --refHaps hapmap.hap --refSnps hapmap.snps --haps sample.pp.gz --snps sample.snps --prefix sample.imp | tee minimac.log
   −
A detailed description of all minimac options is available [[Minimac Command Reference|elsewhere]]. Here is a brief description of the above parameters:
+
or
 +
 
 +
  ./minimac2 --refHaps hapmap.hap --refSnps hapmap.snps --haps sample.pp.gz --snps sample.snps --prefix sample2.imp | tee minimac2.log
 +
 
 +
 
 +
A detailed description of all minimac(2) options is available [[Minimac Command Reference|elsewhere]]. Here is a brief description of the above parameters:
    
{| class="wikitable" border="1" cellpadding="2"
 
{| class="wikitable" border="1" cellpadding="2"
Line 62: Line 71:  
! Description
 
! Description
 
|-  
 
|-  
| <code>--refHaps ref.hap.gz </code>  
+
| <code>--refHaps hapmap.hap </code>  
| Reference haplotypes (e.g. from [http://www.sph.umich.edu/csg/abecasis/MACH/download/1000G-2010-06.html MaCH download page])
+
| Reference haplotypes (e.g. from HapMap or the 1000 Genomes Project).
 +
|-
 +
| <code>--refSnps hapmap.snps </code>
 +
| List of sites in the reference haplotypes; needed unless the reference haplotypes are in VCF format.
 
|-
 
|-
 
| <code>--vcfReference </code>  
 
| <code>--vcfReference </code>  
Line 87: Line 99:       −
You can speed-up things by running minimac in parallel by launching the [http://genome.sph.umich.edu/wiki/Minimac#Multiprocessor_Version minimac-omp] version. On our cluster 4 cpus per minimac is optimal (--cpus 4).
+
You can speed things up by running minimac in parallel with the [http://genome.sph.umich.edu/wiki/Minimac2#Multiprocessor_Version minimac2-omp] version. On our cluster, 4 CPUs per minimac(2) run is optimal (--cpus 4).
    
  ./minimac-omp --cpus 4 --refHaps hapmap.hap --refSnps hapmap.snps --haps sample.pp.gz --snps sample.snps --prefix sample.imp | tee minimac-omp.log
 
  ./minimac-omp --cpus 4 --refHaps hapmap.hap --refSnps hapmap.snps --haps sample.pp.gz --snps sample.snps --prefix sample.imp | tee minimac-omp.log
    +
or
 +
 +
  ./minimac2-omp --cpus 4 --refHaps hapmap.hap --refSnps hapmap.snps --haps sample.pp.gz --snps sample.snps --prefix sample2.imp | tee minimac2-omp.log
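
Because the commands above depend on four input files, it may help to confirm that each one exists and is non-empty before starting a run. The following is a minimal sketch only; <code>check_inputs</code> is a hypothetical helper, not a feature of minimac, and the file names are the ones used in this tutorial.

```shell
# Hypothetical helper (not part of minimac): return non-zero if any of the
# listed files is missing or empty, printing each problem to stderr.
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call, guarding the parallel run shown above:
#   check_inputs hapmap.hap hapmap.snps sample.pp.gz sample.snps &&
#       ./minimac2-omp --cpus 4 --refHaps hapmap.hap --refSnps hapmap.snps \
#           --haps sample.pp.gz --snps sample.snps --prefix sample2.imp
```

The check only looks at file presence and size; it does not validate file contents or formats.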
    
== Imputation quality evaluation ==
 
== Imputation quality evaluation ==
Line 104: Line 119:  
= Reference =
 
= Reference =
   −
If you use minimac, please cite:  
+
If you use minimac or minimac2, please cite:  
    
Howie B, Fuchsberger C, Stephens M, Marchini J, and Abecasis GR.
 
Howie B, Fuchsberger C, Stephens M, Marchini J, and Abecasis GR.
