Difference between revisions of "Mach DAC"

From Genome Analysis Wiki
Line 7:
  Within each file, markers should be stored by chromosome position. Alleles should be stored in the forward strand and can be encoded as 'A', 'C', 'G' or 'T' (there is no need to use numeric identifiers for each allele). <br>
+ === Split Your Data ===
+ You can split your data using [http://www.sph.umich.edu/csg/yli/splitPed/ splitPed].
  == Phase/Imputation with External Reference ==

Revision as of 14:52, 29 November 2010

This is the MaCH Divide and Conquer page, documenting how to break the genome into smaller pieces before imputation/phasing and how to ligate after imputation/phasing.

Phase without External Reference

Your Data

To get started, store your data in Merlin-format pedigree and data files, with one pedigree file and one data file per chromosome. For details of the Merlin file format, see the Merlin tutorial [1].

Within each file, markers should be sorted by chromosome position. Alleles should be reported on the forward strand and can be encoded as 'A', 'C', 'G' or 'T' (there is no need to use numeric identifiers for each allele).
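
The two requirements above (markers in position order, alleles coded as letters) can be checked before splitting. The following is a minimal Python sketch, not part of MaCH or Merlin. It assumes each pedigree line starts with the five standard Merlin columns (family, individual, father, mother, sex) followed only by marker genotypes, and that '0', 'N' and 'X' denote missing alleles; the function names are hypothetical. Marker positions are assumed to come from a separate map file that you have already read into a list.

```python
VALID_ALLELES = {"A", "C", "G", "T"}
MISSING_CODES = {"0", "N", "X"}  # assumption: codes treated as missing alleles


def parse_genotypes(ped_line, n_leading=5):
    """Return a list of (allele1, allele2) pairs from one pedigree line.

    Assumes n_leading identifier columns followed only by marker genotypes,
    written either as "A/C" or as two separate tokens "A C".
    """
    fields = ped_line.split()
    alleles = []
    for token in fields[n_leading:]:
        alleles.extend(token.split("/"))
    if len(alleles) % 2 != 0:
        raise ValueError("odd number of alleles on pedigree line")
    return [
        (alleles[i].upper(), alleles[i + 1].upper())
        for i in range(0, len(alleles), 2)
    ]


def invalid_alleles(ped_line):
    """List (marker_index, allele) for alleles that are neither A/C/G/T nor missing."""
    bad = []
    for idx, pair in enumerate(parse_genotypes(ped_line)):
        for allele in pair:
            if allele not in VALID_ALLELES and allele not in MISSING_CODES:
                bad.append((idx, allele))
    return bad


def is_sorted_by_position(positions):
    """True if marker positions are in non-decreasing order."""
    return all(a <= b for a, b in zip(positions, positions[1:]))
```

For example, invalid_alleles("FAM1 ID1 0 0 1 A/C 1 2") reports the numerically coded second marker. A check like this cannot detect strand flips; it only confirms the encoding and the ordering.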

Split Your Data

You can split your data using splitPed.
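
To illustrate the general idea of breaking a chromosome into pieces that can later be ligated, here is a minimal Python sketch that computes overlapping marker windows. This is not splitPed's algorithm or interface; the function name, the chunk_size and overlap parameters, and the choice of a fixed overlap are all assumptions made for illustration only.

```python
def split_markers(n_markers, chunk_size, overlap):
    """Return half-open (start, end) marker index ranges covering n_markers.

    Consecutive ranges share `overlap` markers (the last range may be
    shorter). Hypothetical helper; parameters are illustrative only.
    """
    if overlap < 0 or chunk_size <= overlap:
        raise ValueError("need 0 <= overlap < chunk_size")
    if n_markers <= 0:
        return []
    step = chunk_size - overlap
    chunks = []
    start = 0
    while True:
        end = min(start + chunk_size, n_markers)
        chunks.append((start, end))
        if end == n_markers:
            break
        start += step
    return chunks
```

With 10 markers, a chunk size of 4 and an overlap of 1, split_markers(10, 4, 1) yields the ranges (0, 4), (3, 7) and (6, 10), so each adjacent pair of pieces shares one marker.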

Phase/Imputation with External Reference

Questions and Comments?

Email Yun Li.