From Genome Analysis Wiki
399 bytes added, 03:32, 12 October 2010
Line 42:

==== Parameters ====
− -p sample.ped
− pedigree file in [http://www.sph.umich.edu/csg/abecasis/Merlin/tour/input_files.html Merlin] format

− -d sample.dat
− dat file in [http://www.sph.umich.edu/csg/abecasis/Merlin/tour/input_files.html Merlin] format
−
− --rounds R
− how many iterations of the Markov sampler should be run.
−
− --states ST
− use a random subset of ST haplotypes as reference. We recommend values between 200 - 500. More states result in more accurate haplotypes, but are computational more expensive and require more memory.
−
− --interim I
− output a set of best-guess haplotypes every I rounds by building consensus from all previous Markov iterations. These intermediate haplotypes can be used for imputation.
−
− --sample SA
− output a set of haplotypes every SA rounds based on random sampling from the last Markov iteration. These intermediate results can be combined and used as input for the imputation process.
−
− --phase
− enables [[MaCH]] phasing mode.
−
− --compact
− reduces the amount of memory needed dramatically, but doubles execution time.
+ {| class="wikitable" border="1" cellpadding="2"
+ |- bgcolor="lightgray"
+ ! Parameter
+ ! Description
+ |-
+ | <code>-d sample.dat</code>
+ | Data file in [http://www.sph.umich.edu/csg/abecasis/Merlin/tour/input_files.html Merlin format]. It is important that markers should be listed according to their order along the chromosome.
+ |-
+ | <code>-p sample.ped</code>
+ | Pedigree file in [http://www.sph.umich.edu/csg/abecasis/Merlin/tour/input_files.html Merlin format]. It is important that alleles should be labeled on the forward strand.
+ |-
+ | <code>--states 200</code>
+ | Number of haplotypes to consider during each update. Increasing this value will typically lead to better haplotypes, but can also dramatically increase computing time and memory requirements. A value of 100 - 400 is typical.
+ |-
+ | <code>--rounds 50</code>
+ | Iterations of the Markov sampler to use for haplotyping. Typically, using 20 - 100 rounds should give good results. To obtain better results, it is usually better to increase the <code>--states</code> parameter.
+ |-
+ | <code>--interim 5</code>
+ | Request that intermediate results should be saved to disk periodically.
+ |-
+ | <code>--phase</code>
+ | Tell [[MaCH]] to estimate phased haplotypes for each individual.
+ |-
+ | <code>--compact</code>
+ | Tell [[MaCH]] to reduce memory use at the cost of approximately doubling runtime. This option is recommended for most GWAS scale datasets and computing platforms.
+ |}
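The options in the new parameter table can be combined into a single phasing invocation. The following is a minimal Python sketch that only assembles the argument list; it does not run anything. The executable name <code>mach1</code> and the helper name <code>build_phasing_command</code> are assumptions made for illustration and do not appear in this revision. The default values are the example values shown in the table.

```python
import shlex


def build_phasing_command(dat_file, ped_file, states=200, rounds=50,
                          interim=5, compact=True, executable="mach1"):
    """Assemble an argument list from the options listed in the table.

    `executable` is an assumed binary name; adjust it to match the
    local installation.
    """
    if states <= 0 or rounds <= 0:
        raise ValueError("states and rounds must be positive integers")

    args = [
        executable,
        "-d", dat_file,            # data file, markers in chromosome order
        "-p", ped_file,            # pedigree file, alleles on forward strand
        "--states", str(states),   # haplotypes considered per update
        "--rounds", str(rounds),   # iterations of the Markov sampler
    ]
    if interim:
        # Save intermediate results to disk periodically.
        args += ["--interim", str(interim)]
    args.append("--phase")         # estimate phased haplotypes
    if compact:
        # Lower memory use at the cost of roughly doubled runtime.
        args.append("--compact")
    return args


if __name__ == "__main__":
    # Print the command line instead of executing it.
    print(shlex.join(build_phasing_command("sample.dat", "sample.ped")))
```

Returning a list rather than a single string keeps file names with spaces intact if the list is later handed to <code>subprocess.run</code>.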

=== Step 2: Imputation ===