Difference between revisions of "GlfMultiples"

From Genome Analysis Wiki

Revision as of 05:42, 20 May 2010

glfMultiples is a GLF-based variant caller for next-generation sequencing data. It takes a set of GLF format genotype likelihood files as input and generates a VCF-format set of variant calls as output.

Basic Usage Example

In a typical command line, the options that control variant calling appear first, followed by a list of GLF-format likelihood files. For example:

  glfMultiples --minMapQuality 30 --minTotalDepth 60 --maxTotalDepth 240 -b YRI.SLX.vcf YRI/NA*.SLX.glf > YRI.SLX.log
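
After a run like the one above, a quick sanity check on the output can be helpful. The sketch below relies only on the general VCF convention that header lines start with '#'; the helper name count_vcf_records is made up for illustration and is not part of glfMultiples.

```shell
# Hypothetical helper: count the data (non-header) lines in a VCF file.
# Assumes the standard VCF layout, where every header line starts with '#'.
count_vcf_records() {
  grep -vc '^#' "$1"
}

# Usage: count_vcf_records YRI.SLX.vcf
```

A count of zero would suggest that the filters were too strict for the data or that no input files matched the wildcard.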

Command Line Options

Basic Output Options

 -b baseCallFile                Specifies the name of the output VCF-format base call file
 -p threshold                   The threshold for base calling. Base calls are made when their posterior probability exceeds this threshold.

Filtering According to Depth and Map Quality

 --minMapQuality threshold      Positions where the root-mean-square mapping quality falls below this threshold will be excluded.
 --strict                       When the map quality is interpreted strictly, all three trio individuals must exceed minMapQuality 
                                before a call is made. Without the --strict option, reads for individuals below the threshold are ignored.
 --minDepth threshold           Positions where the read depth falls below this threshold will be excluded.
 --maxDepth threshold           Positions where the read depth exceeds this threshold will be excluded.
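
To make the mapping quality statistic concrete, here is a minimal sketch of a root-mean-square calculation and a threshold check. The helper names rms_mapq and passes_min_mapq are hypothetical, and the sketch only illustrates the statistic itself; how glfMultiples aggregates qualities internally is not described here.

```shell
# Hypothetical helper: root-mean-square of the mapping qualities given as arguments.
rms_mapq() {
  printf '%s\n' "$@" |
    LC_ALL=C awk '{ s += $1 * $1; n++ } END { if (n) printf "%.2f\n", sqrt(s / n) }'
}

# Hypothetical helper: succeed when the RMS mapping quality is at least the threshold.
# Usage: passes_min_mapq THRESHOLD Q1 [Q2 ...]
passes_min_mapq() {
  threshold=$1
  shift
  printf '%s\n' "$@" |
    awk -v t="$threshold" '{ s += $1 * $1; n++ } END { exit !(n && sqrt(s / n) >= t) }'
}

# Example: qualities 20, 30 and 40 have an RMS of about 31.09,
# so they would pass a threshold of 30.
rms_mapq 20 30 40
passes_min_mapq 30 20 30 40 && echo "position kept"
```

Because the values are squared before averaging, high-quality reads weigh more heavily in the RMS than they would in a plain mean.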

VCF Output

 --glfAliases filename          By default, GLF filenames are used to label each column in the VCF file. This option allows each filename
                                to be matched to a more specific individual identifier. The aliases file should include two columns per row,
                                the first specifying the GLF filename, the second specifying a sample name.
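
An aliases file can be generated from the input filenames rather than typed by hand. The sketch below is one possible approach under stated assumptions: the columns are tab-separated, the first column holds the input filename exactly as it is passed on the command line, and the sample name is the part of the filename before the first '.'. None of these details are confirmed above, and make_glf_aliases is a hypothetical helper name.

```shell
# Hypothetical helper: print one "input-filename<TAB>sample-name" line per file.
# Assumptions: tab-separated columns; the sample name is the part of the file
# name before the first '.', e.g. YRI/NA00001.SLX.glf -> NA00001.
make_glf_aliases() {
  for glf in "$@"; do
    name=$(basename "$glf")
    printf '%s\t%s\n' "$glf" "${name%%.*}"
  done
}

# Usage: make_glf_aliases YRI/NA*.SLX.glf > YRI.aliases.txt
```

The resulting file could then be passed with --glfAliases YRI.aliases.txt.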

Download

The current version is available for download from here.

TODO

Support for X chromosome variant calling.

Support for two-pass depth filter that looks at the data to work out appropriate thresholds for shallow and deep coverage.