Difference between revisions of "ExomePicks"

From Genome Analysis Wiki

Revision as of 09:28, 3 March 2010

ExomePicks is a program that suggests individuals to be sequenced in a large pedigree. ExomePicks assumes that a genotyping chip or another cost-effective means will be used to determine identity-by-descent (IBD) sharing in the pedigree and that, subsequently, one would like to sequence a minimal number of individuals and use their sequences, together with the IBD information, to deduce the sequences of the other individuals in the pedigree.

Download

A source code package can be downloaded from here.

Input Files

A pedigree file and a data file, both in Merlin format, are required as input.

The data file describes the contents of the pedigree file and must include, at a minimum, an entry that specifies which individuals in the pedigree are genotyped. An additional entry that indicates which individuals are affected by a trait of interest can also be included, but is not required. Any other entries that are present are safely ignored.

Here is an example of a minimal data file:

 < contents of small.dat >
 A dna
 A disease
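
As a rough illustration of how the data file relates to the pedigree file, the Python sketch below reads the entries of a data file and reports which pedigree column each one describes. It is not part of ExomePicks: the helper names (`parse_dat`, `column_of`) are hypothetical, and the sketch assumes that every entry occupies exactly one pedigree column, which holds for the two `A` entries in this example but not necessarily for other entry types.

```python
# Illustrative sketch only; not part of ExomePicks.
# Assumption: each data file entry describes exactly one pedigree column.

FIXED_COLUMNS = 5  # family id, individual id, father id, mother id, sex

SMALL_DAT = """\
A dna
A disease
"""


def parse_dat(text):
    """Return the (type, name) pairs listed in a data file, in order."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        entries.append((fields[0], fields[1]))
    return entries


def column_of(entries, name):
    """Return the zero-based pedigree column described by the named entry."""
    for offset, (_, entry_name) in enumerate(entries):
        if entry_name == name:
            return FIXED_COLUMNS + offset
    raise KeyError(name)


entries = parse_dat(SMALL_DAT)
print(entries)
print(column_of(entries, "dna"), column_of(entries, "disease"))
```

Counting from zero, the `dna` entry maps to column 5 and the `disease` entry to column 6, that is, the sixth and seventh fields of each pedigree line.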

The pedigree file describes relationships among individuals and indicates which individuals have DNA available (and can thus be selected for sequencing) and which are affected (and are, perhaps, more valuable to sequence).

Here is an example of a simple pedigree file:

 < contents of small.ped >
 1 I1   0    0    1  2 1
 1 I2   0    0    2  2 1
 1 I3   0    0    1  2 1
 1 I4   0    0    2  2 1
 1 II1  I1   I2   1  2 1
 1 II2  I3   I4   2  2 1
 1 III1 0    0    1  2 1
 1 III2 II1  II2  2  2 1
 1 III3 II1  II2  2  2 1
 1 III4 II1  II2  2  2 1
 1 III5 II1  II2  2  2 1
 1 IV1  III1 III2 1  2 1
 1 IV2  III1 III2 1  2 1 

If you have pen and paper, you should be able to verify that the file describes a simple four-generation pedigree. The first five columns denote family id, individual id, father id, mother id, and sex. The sixth and seventh columns hold the dna and disease entries declared in the data file. DNA is available for all individuals (value 2 in the DNA column) and all individuals are unaffected (value 1 in the disease column).
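
For readers who prefer to check the pedigree programmatically rather than on paper, here is a minimal Python sketch. It is not how ExomePicks itself reads pedigrees; the function names are hypothetical, and the sketch assumes the seven-column layout of the example above, with `0` marking a missing parent. A founder is assigned generation 1 and every other individual is placed one generation below the deeper of its parents.

```python
# Illustrative sketch only; assumes the seven-column layout of small.ped.

SMALL_PED = """\
1 I1   0    0    1  2 1
1 I2   0    0    2  2 1
1 I3   0    0    1  2 1
1 I4   0    0    2  2 1
1 II1  I1   I2   1  2 1
1 II2  I3   I4   2  2 1
1 III1 0    0    1  2 1
1 III2 II1  II2  2  2 1
1 III3 II1  II2  2  2 1
1 III4 II1  II2  2  2 1
1 III5 II1  II2  2  2 1
1 IV1  III1 III2 1  2 1
1 IV2  III1 III2 1  2 1
"""


def parse_ped(text):
    """Map (family id, individual id) to that individual's remaining fields."""
    people = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or incomplete lines
        fam, ind, father, mother, sex, dna, disease = fields[:7]
        people[(fam, ind)] = {
            "father": father, "mother": mother,
            "sex": sex, "dna": dna, "disease": disease,
        }
    return people


def generation_depths(people):
    """Founders get depth 1; others get 1 + the depth of their deeper parent."""
    depths = {}

    def depth(key):
        if key not in depths:
            fam, _ = key
            person = people[key]
            parents = [(fam, p) for p in (person["father"], person["mother"])
                       if p != "0"]
            depths[key] = 1 + max((depth(p) for p in parents), default=0)
        return depths[key]

    for key in people:
        depth(key)
    return depths


people = parse_ped(SMALL_PED)
depths = generation_depths(people)
print(len(people), "individuals,", max(depths.values()), "generations")
print("DNA available for all:", all(p["dna"] == "2" for p in people.values()))
print("All unaffected:", all(p["disease"] == "1" for p in people.values()))
```

Note that III1 has no parents listed, so the sketch treats him as a founder even though he is drawn in the third generation; his children IV1 and IV2 still land in generation 4 through their mother, III2.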

ExomePicks currently ignores any information on twin status that may be present.

Instructions

The only essential command line options are those that specify input file names, thus:

 ExomePicks -d small.dat -p small.ped
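
When ExomePicks is run from a script, the same command can be assembled in Python. The sketch below uses only the `-d` and `-p` options shown above and assumes, hypothetically, that an executable named `ExomePicks` is on the `PATH`; the helper names are illustrative, and nothing here describes the program's output.

```python
# Illustrative sketch only; assumes an executable named "ExomePicks" on PATH.
import shutil
import subprocess


def build_command(dat_file, ped_file, executable="ExomePicks"):
    """Assemble the argument list for the invocation shown above."""
    return [executable, "-d", dat_file, "-p", ped_file]


def run_exomepicks(dat_file, ped_file):
    """Run ExomePicks if it is installed; return its exit code, else None."""
    command = build_command(dat_file, ped_file)
    if shutil.which(command[0]) is None:
        print("ExomePicks not found on PATH; would have run:", " ".join(command))
        return None
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    run_exomepicks("small.dat", "small.ped")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file names contain spaces.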