The 1000 Genomes pilot project genotypes use NCBI Build 36.
=== Step 1: Pre-Phasing ===

For the pre-phasing step we recommend [[MaCH]] with the --phase command line option. As input, [[MaCH]] needs a [[Merlin]] format pedigree file and data file. All markers must be ordered according to their physical position.

==== Usage ====

 mach1 -d sample.dat -p sample.ped --rounds 20 --states 200 --phase --interim 1 --sample 1 --compact

==== Parameters ====

; --rounds R : how many iterations of the Markov sampler should be run.
; --states ST : use a random subset of ST haplotypes as reference. We recommend values between 200 and 500. More states result in more accurate haplotypes, but are computationally more expensive.
; --interim I : output a set of best-guess haplotypes every I rounds by building a consensus from all previous Markov iterations. These haplotypes can be used for imputation.
; --sample SA : output a set of haplotypes every SA rounds based on random sampling from the last Markov iteration. These intermediate results can be combined and used as input for the imputation process.
; --phase : enables [[MaCH]] phasing mode.
; --compact : dramatically reduces the amount of memory needed, but doubles execution time.
=== Step 2: Imputation ===

