Your Own Data
Within each file, markers should be stored in order of chromosome position. Alleles should be given on the forward strand and can be encoded as 'A', 'C', 'G' or 'T' (there is no need to use numeric identifiers for each allele).
Note that hemizygous genotypes in males are coded as homozygotes: a male carrying a single copy of allele A is recorded with two copies of A.
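The coding convention above can be illustrated with a minimal Python sketch. It is not part of MaCH or minimac; the function names `code_male_genotype` and `recode_males` are hypothetical, and the sketch assumes each male genotype starts out as a single allele letter.

```python
# Hypothetical helper, not part of MaCH or minimac.
VALID_ALLELES = {"A", "C", "G", "T"}

def code_male_genotype(allele):
    """Return a hemizygous male allele as a homozygous pair of alleles."""
    if allele not in VALID_ALLELES:
        raise ValueError(f"unexpected allele code: {allele!r}")
    # A male carries one copy; the file stores it as two identical copies.
    return (allele, allele)

def recode_males(alleles):
    """Recode a list of single male alleles, one per marker."""
    return [code_male_genotype(a) for a in alleles]

if __name__ == "__main__":
    print(recode_males(["A", "G", "T"]))
```

How the pair is written to the pedigree file (for example, with a space or a slash between the two alleles) depends on the input format you use and is left out of the sketch.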
You can download the reference haplotypes from the MaCH download page.
Phase Your Own Data
If there are no missing genotypes in males, you will only need to phase the females. Make sure that all alleles are stored on the forward strand before phasing.
mach1 -d sample.dat -p sample.ped --states 200 -r 20 --phase -o sample.phased > sample.phased.log
Imputation is then performed on the phased haplotypes using minimac.
minimac --refHaps ref.hap.gz --refSnps ref.snps --haps sample.phased.gz --snps sample.snps --rounds 5 --states 200 --prefix sample.imputed > sample.imputed.log
Should I phase/impute males and females together or separately?
When there are missing genotypes among males, the males need to be phased as well. Phasing them together with or separately from the females doesn't seem to affect imputation quality.
Imputing males together with or separately from females doesn't seem to affect imputation quality either.
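Whether males need phasing at all comes down to whether any male has a missing genotype. The check could be sketched in Python as below. This is an illustration under stated assumptions, not a MaCH utility: the function name `males_need_phasing` is hypothetical, the pedigree lines are assumed to be whitespace-separated with five leading columns (family, individual, father, mother, sex), males are assumed to be coded `1` or `M`, and missing alleles are assumed to be coded `0`, `N` or `.`. Adjust these to match your own files.

```python
# Assumed codes; adjust to match your own pedigree files.
MISSING_CODES = {"0", "N", "."}
MALE_CODES = {"1", "M", "m"}

def males_need_phasing(ped_lines):
    """Return True if any male has at least one missing allele."""
    for line in ped_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        sex = fields[4]
        # Accept both "A A" and "A/A" style genotype columns.
        alleles = [a for token in fields[5:] for a in token.split("/")]
        if sex in MALE_CODES and any(a in MISSING_CODES for a in alleles):
            return True
    return False

if __name__ == "__main__":
    example = [
        "F1 M1 0 0 1 A A G G",   # male, complete genotypes
        "F2 W1 0 0 2 A C 0 0",   # female, one missing genotype
    ]
    print(males_need_phasing(example))
```

If the function returns False, only the females need to go through the phasing step.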
Questions and Comments?
Email Yun Li.