From Genome Analysis Wiki
642 bytes added, 21:10, 28 November 2010
Line 1: |
Line 1: |
| + | == How to speed up? == |
| + | |
| + | === minimac === |
| + | See [http://genome.sph.umich.edu/wiki/Minimac minimac] for details. |
| + | |
| + | === Divide and Conquer === |
| + | See [http://genome.sph.umich.edu/wiki/Mach_DAC MaCH Divide and Conquer] for details. |
| + | |
| + | === 2-step imputation === |
| + | See [http://genome.sph.umich.edu/wiki/MaCH_FAQ#Why_and_how_to_perform_a_2-step_imputation.3F 2-step imputation] for details. |
| + | |
| == Why and how to perform a 2-step imputation? == | | == Why and how to perform a 2-step imputation? == |
| | | |
Line 14: |
Line 25: |
| # step 2: | | # step 2: |
| mach1 -d sample.dat -p sample.ped -s chr20.snps -h chr20.hap --compact --greedy --autoFlip --errorMap par_infer.erate --crossoverMap par_infer.rec --mle --mldetails > mach.imp.log | | mach1 -d sample.dat -p sample.ped -s chr20.snps -h chr20.hap --compact --greedy --autoFlip --errorMap par_infer.erate --crossoverMap par_infer.rec --mle --mldetails > mach.imp.log |
| + | |
| + | In step 1, one can use --greedy in combination with --states XX in MaCH versions 16.b and above. We have found that using 1/3 of the reference haplotypes, which takes about 1/9 of the computational time (consistent with run time growing roughly quadratically in the number of states), results in almost no power loss for the current HapMap and 1000G reference panels. |
| | | |
| In step 2, each individual is imputed independently, so the imputation for each chromosome can be split into as many as n jobs (n = sample size) and run in parallel. | | In step 2, each individual is imputed independently, so the imputation for each chromosome can be split into as many as n jobs (n = sample size) and run in parallel. |
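
The per-individual split in step 2 could be sketched as below. This is a minimal dry-run sketch, not part of the MaCH distribution: it writes one pedigree file per individual and prints one step 2 command per file without running anything. The function name `make_step2_jobs`, the `jobs` directory and the `ind<i>` file names are hypothetical. The `--prefix` option is assumed here to set the output prefix so that parallel jobs do not overwrite each other's results; check the option list of your MaCH version before relying on it. All other file names and flags are copied from the step 2 command above.

```shell
#!/bin/sh
# Sketch only: split a pedigree file into one file per individual and
# print (not execute) one step 2 mach1 command per individual.
# Usage: make_step2_jobs PED_FILE OUT_DIR
make_step2_jobs() {
    ped=$1
    outdir=$2
    mkdir -p "$outdir" || return 1

    i=0
    while IFS= read -r line; do
        # Skip empty lines; each remaining line is one individual.
        [ -n "$line" ] || continue
        i=$((i + 1))
        printf '%s\n' "$line" > "$outdir/ind$i.ped"

        # Same options as the step 2 command, with a per-individual
        # pedigree file. --prefix is an assumption (see the note above).
        echo "mach1 -d sample.dat -p $outdir/ind$i.ped -s chr20.snps -h chr20.hap --compact --greedy --autoFlip --errorMap par_infer.erate --crossoverMap par_infer.rec --mle --mldetails --prefix $outdir/ind$i > $outdir/ind$i.log"
    done < "$ped"
}
```

Calling `make_step2_jobs sample.ped jobs > step2_jobs.txt` would produce one command per line, which can then be handed to whatever scheduler or job runner is available. Grouping several individuals per file instead of one trades parallelism for fewer, longer jobs.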