Difference between revisions of "MaCH: machX"

From Genome Analysis Wiki
(9 intermediate revisions by 3 users not shown)

Latest revision as of 12:02, 2 February 2017

This page documents how to impute genotypes on the non-pseudo-autosomal part of the X chromosome using MaCH [http://csg.sph.umich.edu/csg/yli/mach] and minimac [http://genome.sph.umich.edu/wiki/Minimac].

Getting Started

Your Own Data

To get started, store your data in Merlin format pedigree and data files, one pair per chromosome. For details of the Merlin file format, see the Merlin tutorial [http://csg.sph.umich.edu//abecasis/Merlin/tour/input_files.html].

Within each file, markers should be ordered by chromosome position. Alleles should be given on the forward strand and can be encoded as 'A', 'C', 'G' or 'T' (there is no need to use numeric identifiers for each allele).
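A quick way to confirm the allele encoding is to tally every allele code that appears in the pedigree file. The sketch below is not part of MaCH; the function name count_allele_codes is hypothetical, and it assumes a whitespace-delimited Merlin pedigree file whose first five columns are family, individual, father, mother and sex, with every later column holding a single allele.

```shell
# Hypothetical helper, not a MaCH or Merlin command.
# Assumption: columns 1-5 are family, individual, father, mother, sex,
# and every column from 6 onward holds one allele code.
count_allele_codes() {
    # Count each distinct code seen in the allele columns, then sort the tally.
    awk '{ for (i = 6; i <= NF; i++) n[$i]++ }
         END { for (a in n) print a, n[a] }' "$1" | sort
}

# Example: count_allele_codes sample.ped
```

Any code in the tally other than A, C, G, T and your missing-data code suggests the file still needs recoding.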

Note that hemizygous male genotypes are coded as homozygotes (for example, a male carrying a single A allele is coded as A/A).
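Because males should appear homozygous at every non-pseudo-autosomal marker, a heterozygous male genotype usually points to a coding or genotyping problem. The following is a minimal sketch for listing such genotypes. It is not part of MaCH; the name find_het_males is hypothetical, and it assumes a whitespace-delimited pedigree file in which sex is coded 1 or M for males and each marker occupies two consecutive allele columns starting at column 6.

```shell
# Hypothetical helper, not a MaCH or Merlin command.
# Assumptions: column 5 is sex (1 or M = male); from column 6 onward,
# each consecutive pair of columns holds the two alleles of one marker.
find_het_males() {
    awk '($5 == "1" || $5 == "M") {
        for (i = 6; i < NF; i += 2) {
            a = $i; b = $(i + 1)
            # A male with two different alleles is not coded as a homozygote.
            if (a != b)
                printf "%s\t%s\tmarker_index=%d\t%s/%s\n", $1, $2, (i - 4) / 2, a, b
        }
    }' "$1"
}

# Example: find_het_males sample.ped
```

Each output line gives the family ID, the individual ID, the 1-based position of the marker among the marker columns, and the offending genotype. No output means every male genotype is coded as a homozygote.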

Reference Haplotypes

You can download the reference haplotypes from the MaCH download page [http://csg.sph.umich.edu//yli/mach/download/chrX.html].

Two-Step Imputation

Phase Your Own Data

If there are no missing genotypes in males, you only need to phase the females. Make sure that all alleles are stored on the forward strand before phasing.

  mach1 -d sample.dat -p sample.ped --states 200 -r 20 --phase -o sample.phased > sample.phased.log
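If you choose to phase females only, one way to build a females-only pedigree file is to filter on the sex column. This is a sketch of a pre-processing step and not a MaCH option. The function name extract_females and the output file name in the example are hypothetical, and the sex coding (2 or F for females in column 5) is assumed to follow the Merlin convention.

```shell
# Hypothetical pre-processing helper, not a MaCH or Merlin command.
# Assumption: column 5 of the pedigree file is sex, with 2 or F meaning female.
extract_females() {
    # Print only the rows whose sex column marks a female, unchanged.
    awk '$5 == "2" || $5 == "F"' "$1"
}

# Example: extract_females sample.ped > sample.females.ped
```

The resulting file can then be passed to mach1 with -p in place of sample.ped, with the data file left unchanged.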

Impute

Imputation is then performed on the phased haplotypes using minimac [http://genome.sph.umich.edu/wiki/Minimac].

  minimac --refHaps ref.hap.gz --refSnps ref.snps --haps sample.phased.gz --snps sample.snps --rounds 5 --states 200 --prefix sample.imputed > sample.imputed.log

FAQ

Should I phase/impute males and females together or separately?

Phasing males together with or separately from females doesn't seem to affect imputation quality.

Imputing males together with or separately from females doesn't seem to affect imputation quality either.

Questions and Comments?

Email Yun Li (yunli@med.unc.edu).