ExomePicks

From Genome Analysis Wiki

ExomePicks is a program that suggests which individuals to sequence in a large pedigree. It assumes that a genotyping chip or another cost-effective method will be used to determine identity-by-descent (IBD) sharing in the pedigree, and that one would subsequently like to sequence a minimal number of individuals and combine their sequences with the IBD information to deduce the sequences of the other individuals in the pedigree.

Download

A source code package can be downloaded from here.

Input Files

A pedigree and data file in Merlin format are required as input.

The data file describes the contents of the pedigree file and should include, at a minimum, an entry that specifies which individuals in the pedigree are genotyped. An additional entry that indicates which individuals are affected by a trait of interest can also be included, but is not required. Any other entries that are present are safely ignored.

Here is an example of a minimal data file:

 < contents of small.dat >
 A dna
 A disease
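
As an illustration of how the data file maps onto the pedigree file, the following Python sketch reads the two entries above and lists the column layout they imply. The names `parse_data_file` and `pedigree_columns` are hypothetical and are not part of ExomePicks or Merlin. The sketch assumes that every entry is of type `A` and contributes exactly one column after the five fixed pedigree columns; other Merlin entry types are out of scope here.

```python
# Fixed leading columns of a Merlin-format pedigree file.
FIXED_COLUMNS = ["family", "individual", "father", "mother", "sex"]


def parse_data_file(text):
    """Return (entry_type, entry_name) pairs from Merlin-style data file text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        entries.append((fields[0], fields[1]))
    return entries


def pedigree_columns(entries):
    """Column names implied by the data file (assumption: only 'A' entries)."""
    for entry_type, _ in entries:
        if entry_type != "A":
            raise ValueError(f"this sketch handles only 'A' entries, got {entry_type!r}")
    return FIXED_COLUMNS + [name for _, name in entries]


SMALL_DAT = """\
A dna
A disease
"""

columns = pedigree_columns(parse_data_file(SMALL_DAT))
print(columns)
```

With the example data file, the sixth pedigree column is `dna` and the seventh is `disease`.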

The pedigree file describes the relationships among individuals. It also indicates which individuals have DNA available (and can thus be selected for sequencing) and which are affected (and are, perhaps, more valuable to sequence).

Here is an example of a simple pedigree file:

 < contents of small.ped >
 1 I1   0    0    1  2 1
 1 I2   0    0    2  2 1
 1 I3   0    0    1  2 1
 1 I4   0    0    2  2 1
 1 II1  I1   I2   1  2 1
 1 II2  I3   I4   2  2 1
 1 III1 0    0    1  2 1
 1 III2 II1  II2  2  2 1
 1 III3 II1  II2  2  2 1
 1 III4 II1  II2  2  2 1
 1 III5 II1  II2  2  2 1
 1 IV1  III1 III2 1  2 1
 1 IV2  III1 III2 1  2 1

Sketching the relationships on paper should confirm that the file describes a simple four-generation pedigree. The first five columns give the family ID, individual ID, father ID, mother ID, and sex. The remaining two columns correspond to the dna and disease entries in the data file: DNA is available for all individuals (value 2 in the DNA column) and all individuals are unaffected (value 1 in the disease column).
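
The pen-and-paper check can also be sketched in a few lines of Python. This is only an illustration, not part of ExomePicks: the helper names `parse_pedigree` and `generation` are hypothetical, and the sketch assumes a single family, exactly seven whitespace-separated columns per line, and `0` as the code for an unknown parent. Founders are counted as generation 1, and every other individual is placed one generation below the deeper of the two parents.

```python
SMALL_PED = """\
1 I1   0    0    1  2 1
1 I2   0    0    2  2 1
1 I3   0    0    1  2 1
1 I4   0    0    2  2 1
1 II1  I1   I2   1  2 1
1 II2  I3   I4   2  2 1
1 III1 0    0    1  2 1
1 III2 II1  II2  2  2 1
1 III3 II1  II2  2  2 1
1 III4 II1  II2  2  2 1
1 III5 II1  II2  2  2 1
1 IV1  III1 III2 1  2 1
1 IV2  III1 III2 1  2 1
"""


def parse_pedigree(text):
    """Map individual id -> record (assumption: one family, seven columns)."""
    people = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue  # skip blank or malformed lines
        _family, person, father, mother, sex, dna, disease = fields
        people[person] = {"father": father, "mother": mother,
                          "sex": sex, "dna": dna, "disease": disease}
    return people


def generation(people, person, memo):
    """Founders are generation 1; others are 1 + the deeper parent's generation."""
    if person not in memo:
        record = people[person]
        parents = [p for p in (record["father"], record["mother"]) if p != "0"]
        memo[person] = 1 + max((generation(people, p, memo) for p in parents), default=0)
    return memo[person]


people = parse_pedigree(SMALL_PED)
memo = {}
depth = max(generation(people, person, memo) for person in people)
with_dna = [p for p, r in people.items() if r["dna"] == "2"]
affected = [p for p, r in people.items() if r["disease"] == "2"]

print("individuals:", len(people))
print("generations:", depth)
print("with DNA:", len(with_dna))
print("affected:", len(affected))
```

Note that III1 has no parents listed and is therefore treated as a founder, even though the ID places that individual alongside the third generation.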

ExomePicks currently ignores any information on twin status that may be present.

Instructions

The only essential command-line options are those that specify the input file names:

 ExomePicks -d small.dat -p small.ped