MaCH: machX

From Genome Analysis Wiki
Revision as of 08:29, 29 November 2010 by Ylwtx (talk | contribs)

This page documents how to perform X chromosome (non-pseudo-autosomal part) imputation using MaCH [1] and minimac [2].

Getting Started

Your Own Data

To get started, you will need to store your data in Merlin-format pedigree and data files, one pair per chromosome. For details of the Merlin file format, see the Merlin tutorial [3].

Within each file, markers should be ordered by chromosome position. Alleles should be given on the forward strand and can be encoded as 'A', 'C', 'G' or 'T' (there is no need to use numeric identifiers for each allele).
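
The allele coding described above can be checked before phasing. The following is a minimal sketch, not part of MaCH or Merlin: the function name invalid_alleles is hypothetical, and it assumes a pedigree file with five leading columns (family, individual, father, mother, sex) followed only by marker genotypes written as two whitespace-separated alleles, with missing alleles coded as 0. It cannot tell whether alleles are on the forward strand; that still has to be verified against the reference.

```python
VALID_ALLELES = {"A", "C", "G", "T"}
MISSING_ALLELES = {"0"}  # assumption: missing alleles are coded as 0


def invalid_alleles(ped_lines, n_header_cols=5):
    """Return (family, individual, marker_index, allele) for every allele
    that is neither A/C/G/T nor a missing code.

    marker_index is 0-based and counts markers in file order.
    """
    problems = []
    for line in ped_lines:
        fields = line.split()
        if len(fields) <= n_header_cols:
            continue  # skip blank or genotype-free lines
        family, individual = fields[0], fields[1]
        alleles = fields[n_header_cols:]
        for i, allele in enumerate(alleles):
            if allele not in VALID_ALLELES and allele not in MISSING_ALLELES:
                # two allele columns per marker, hence i // 2
                problems.append((family, individual, i // 2, allele))
    return problems
```

Passing an open file handle (for example, the handle for sample.ped) as ped_lines works as well, since the function only iterates over lines.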

Note that hemizygous genotypes in males are coded as homozygotes.
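
Because male genotypes on the non-pseudo-autosomal X are expected to look homozygous, any male genotype with two different alleles points to a coding or genotyping problem. The sketch below lists such genotypes. The function name heterozygous_males is hypothetical, and the layout assumptions are the same as for a plain Merlin pedigree file with markers only: sex in the fifth column with 1 denoting male, and two whitespace-separated alleles per marker.

```python
def heterozygous_males(ped_lines, n_header_cols=5, male_code="1"):
    """Return (family, individual, marker_index) for every male genotype
    whose two alleles differ.

    A genotype with one missing and one observed allele is also reported,
    because its two allele columns are not identical.
    """
    hits = []
    for line in ped_lines:
        fields = line.split()
        if len(fields) <= n_header_cols or fields[4] != male_code:
            continue  # skip blank lines and non-males
        alleles = fields[n_header_cols:]
        for m in range(len(alleles) // 2):
            first, second = alleles[2 * m], alleles[2 * m + 1]
            if first != second:
                hits.append((fields[0], fields[1], m))
    return hits
```

An empty result means every male genotype in the file is already coded as a homozygote (or is fully missing).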

Reference Haplotypes

You can download the reference haplotypes from the MaCH download page [4].

Two-Step Imputation

Phase Your Own Data

If there are no missing genotypes in males, you only need to phase the females. Make sure that all alleles are on the forward strand before phasing.

  mach1 -d sample.dat -p sample.ped --states 200 -r 20 --phase -o sample.phased > sample.phased.log
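
The rule for who needs phasing can be sketched as a small pre-processing step. This is an illustration only: the helper names males_have_missing and lines_to_phase are hypothetical, and the code assumes sex in the fifth column (1 for male), marker genotypes only after the five leading columns, and missing alleles coded as 0.

```python
MISSING_ALLELES = {"0"}  # assumption: missing alleles are coded as 0
MALE_CODE = "1"          # assumption: sex column uses 1 for male


def males_have_missing(ped_lines, n_header_cols=5):
    """Return True if any male has at least one missing allele."""
    for line in ped_lines:
        fields = line.split()
        if len(fields) > n_header_cols and fields[4] == MALE_CODE:
            if any(a in MISSING_ALLELES for a in fields[n_header_cols:]):
                return True
    return False


def lines_to_phase(ped_lines, n_header_cols=5):
    """Select the pedigree lines that need phasing.

    If males have missing genotypes, everyone is returned; otherwise
    only the non-male lines are returned.
    """
    rows = [line for line in ped_lines if len(line.split()) > n_header_cols]
    if males_have_missing(rows, n_header_cols):
        return rows
    return [line for line in rows if line.split()[4] != MALE_CODE]
```

The selected lines could then be written to the pedigree file that is passed to mach1 with -p.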


Imputation will then be performed on the phased haplotypes using minimac [5].

 minimac --refHaps ref.hap.gz --refSnps ref.snps --haps sample.phased.gz --snps sample.snps --rounds 5 --states 200 --prefix sample.imputed > sample.imputed.log
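
For scripted runs, the two steps can be wrapped in a short driver. The sketch below only reuses the options that appear in the commands on this page; the function names are hypothetical, and the file names passed in are whatever your own run uses. Each command is built as an argument list, and run_logged writes standard output to a log file, mirroring the redirection in the commands above.

```python
import subprocess


def mach_phase_command(dat, ped, prefix, states=200, rounds=20):
    """Build the mach1 phasing command as an argument list."""
    return ["mach1", "-d", dat, "-p", ped, "--states", str(states),
            "-r", str(rounds), "--phase", "-o", prefix]


def minimac_command(ref_haps, ref_snps, haps, snps, prefix,
                    rounds=5, states=200):
    """Build the minimac imputation command as an argument list."""
    return ["minimac", "--refHaps", ref_haps, "--refSnps", ref_snps,
            "--haps", haps, "--snps", snps, "--rounds", str(rounds),
            "--states", str(states), "--prefix", prefix]


def run_logged(argv, log_path):
    """Run a command, sending its standard output to log_path.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    with open(log_path, "w") as log:
        subprocess.run(argv, stdout=log, check=True)
```

For example, run_logged(mach_phase_command("sample.dat", "sample.ped", "sample.phased"), "sample.phased.log") would reproduce the phasing step, provided mach1 is on the PATH.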


Should I phase/impute males and females together or separately?

When there are missing genotypes among males, the males need to be phased as well. Phasing them together with or separately from females does not seem to affect imputation quality.

Imputing males together with or separately from females doesn't seem to affect imputation quality either.

Questions and Comments?

Email Yun Li.