Line 2: |
Line 2: |
| * [[MaCH]] (concurrent phasing approach). | | * [[MaCH]] (concurrent phasing approach). |
| OR | | OR |
− | * [[Minimac]] (pre-phasing approach). | + | * [[Minimac]] (pre-phasing / 2-step approach). |
| | | |
| | | |
Line 32: |
Line 32: |
| | | |
| Reference haplotypes generated by the 1000 Genomes Project and formatted so that they are ready for analysis are available from the [http://www.sph.umich.edu/csg/abecasis/MaCH/download/1000G-2010-08.html MaCH download page]. In our hands, this August 2010 release is substantially better than previous 1000 Genomes Project genotype call sets. | | Reference haplotypes generated by the 1000 Genomes Project and formatted so that they are ready for analysis are available from the [http://www.sph.umich.edu/csg/abecasis/MaCH/download/1000G-2010-08.html MaCH download page]. In our hands, this August 2010 release is substantially better than previous 1000 Genomes Project genotype call sets. |
| + | |
| | | |
| == MaCH Imputation == | | == MaCH Imputation == |
− |
| |
| | | |
| === Estimating Model Parameters === | | === Estimating Model Parameters === |
Line 119: |
Line 119: |
| | | |
| | | |
− | == minimac Imputation == | + | == Pre-phasing / 2-Step Imputation == |
| | | |
| === Pre-phasing - MaCH === | | === Pre-phasing - MaCH === |
| | | |
− | A typical MaCH command line to estimate phased haplotypes might look like this: | + | Pre-phasing / 2-Step imputation starts with the pre-phasing of your genotypes using MaCH. A typical MaCH command line to estimate phased haplotypes might look like this: |
| | | |
− | mach1 -d sample.dat -p sample.ped --rounds 20 --states 200 --phase --interim 5 --sample 5 --compact | + | mach1 -d sample.dat -p sample.ped --rounds 20 --states 200 --phase --interim 5 --sample 5 |
| | | |
− | This will request that MaCH estimate haplotypes for your sample, using 20 iterations of its Markov sampler and conditioning each update on up to 200 haplotypes. A summary description of these parameters follows (but for a more complete description, you should go to the MaCH website): | + | This will request that MaCH estimate haplotypes for your sample, using 20 iterations of its Markov sampler and conditioning each update on up to 200 haplotypes. |
| + | A summary description of these parameters follows (but for a more complete description, you should go to the MaCH website): |
| | | |
| {| class="wikitable" border="1" cellpadding="2" | | {| class="wikitable" border="1" cellpadding="2" |
Line 141: |
Line 142: |
| |- | | |- |
| | <code>--states 200</code> | | | <code>--states 200</code> |
− | | Number of haplotypes to consider during each update. Increasing this value will typically lead to better haplotypes, but can dramatically increase computing time and memory use. A value of 100 - 400 is typical. | + | | Number of haplotypes to consider during each update. Increasing this value will typically lead to better haplotypes, but can dramatically increase computing time and memory use. A value of 200 - 400 is typical. |
| |- | | |- |
| | <code>--rounds 20</code> | | | <code>--rounds 20</code> |
− | | Iterations of the Markov sampler to use for haplotyping. Typically, using 20 - 100 rounds should give good results. To obtain better results, it is usually better to increase the <code>--states</code> parameter. | + | | Iterations of the Markov sampler to use for haplotyping. Typically, using 20-30 rounds should give good results. To obtain better results, it is usually better to increase the <code>--states</code> parameter. |
| |- | | |- |
| | <code>--interim 5</code> | | | <code>--interim 5</code> |
Line 156: |
Line 157: |
| |- | | |- |
| | <code>--compact</code> | | | <code>--compact</code> |
− | | Reduce memory use at the cost of approximately doubling runtime. This option is recommended for most GWAS scale datasets and computing platforms. | + | | Reduce memory use at the cost of approximately doubling runtime. |
| |} | | |} |
| | | |
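The pre-phasing command described above can be wrapped in a small helper so that <code>--states</code> and <code>--rounds</code> are easy to vary between runs. The following is only a sketch: the function name <code>mach_prephase_cmd</code> and its argument order are hypothetical, the file names are placeholders, and it uses only flags that appear in the parameter table. It prints the command rather than running it, so the result can be checked first.

```shell
#!/bin/sh
# Hypothetical helper (not part of MaCH): prints a pre-phasing command line.
# Arguments: DAT PED [STATES] [ROUNDS] [compact]
mach_prephase_cmd() {
    dat=$1                 # data file, e.g. sample.dat
    ped=$2                 # pedigree file, e.g. sample.ped
    states=${3:-200}       # haplotypes considered per update (default 200)
    rounds=${4:-20}        # Markov sampler iterations (default 20)
    extra=""
    # Optional fifth argument "compact" trades runtime for lower memory use
    if [ "${5:-}" = "compact" ]; then
        extra=" --compact"
    fi
    echo "mach1 -d $dat -p $ped --rounds $rounds --states $states --phase --interim 5 --sample 5$extra"
}

# Print the command with the values used in the example above
mach_prephase_cmd sample.dat sample.ped 200 20
```

To run the printed command, pipe the output to <code>sh</code> on a machine where <code>mach1</code> is installed; pass <code>compact</code> as the fifth argument if memory is limited.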