From Genome Analysis Wiki
1,595 bytes added
, 11:34, 26 September 2013
Line 1: |
Line 1: |
| + | [[Category:Software]] |
| '''glfSingle''' is a [[GLF]]-based variant caller for next-generation sequencing data. It takes a [[GLF]] format genotype likelihood file as input and generates a [[VCF]]-format set of variant calls as output. | | '''glfSingle''' is a [[GLF]]-based variant caller for next-generation sequencing data. It takes a [[GLF]] format genotype likelihood file as input and generates a [[VCF]]-format set of variant calls as output. |
| | | |
Line 17: |
Line 18: |
| --minDepth ''threshold'' Positions where the read depth falls below this threshold will be excluded. | | --minDepth ''threshold'' Positions where the read depth falls below this threshold will be excluded. |
| --maxDepth ''threshold'' Positions where the read depth exceeds this threshold will be excluded. | | --maxDepth ''threshold'' Positions where the read depth exceeds this threshold will be excluded. |
− | --reference Positions called as homozygous reference will be included in the output. | + | --reference Positions called as homozygous reference will be included in the output. |
− |
| + | |
| + | To learn about default values for these options, simply run the program with no arguments. |
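The two depth options above can be read as an inclusive range check. The following Python sketch is illustrative only and is not taken from the glfSingle source. The function name `passes_depth_filter` and the threshold values in the example are hypothetical; the program's real defaults are printed when it is run with no arguments.

```python
def passes_depth_filter(depth, min_depth, max_depth):
    """Return True if a position survives the depth filter.

    A position is excluded when its depth falls below min_depth or
    exceeds max_depth, so both thresholds themselves are kept.
    """
    return min_depth <= depth <= max_depth


# Hypothetical (position, depth) pairs and thresholds, for illustration only.
sites = [(101, 3), (102, 10), (103, 250), (104, 40)]
kept = [pos for pos, depth in sites if passes_depth_filter(depth, 10, 200)]
print(kept)
```

Whether glfSingle treats the thresholds as inclusive in exactly this way is an assumption based on the wording "falls below" and "exceeds".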
| + | |
| == Model for Variant Calling == | | == Model for Variant Calling == |
| + | glfSingle uses a likelihood-based model for variant calling. For each genomic position, it starts from the genotype likelihoods ''Pr(reads | genotype)'' computed by an appropriate tool (e.g. Samtools BAQ) and combines them with an individual-based prior ''p(genotype)'' to generate posterior probabilities ''Pr(genotype | reads)''. |
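The combination of likelihood and prior described above is an application of Bayes' rule: ''Pr(genotype | reads)'' is proportional to ''Pr(reads | genotype) * p(genotype)'', normalized over all candidate genotypes. The Python sketch below shows only that arithmetic. It is not glfSingle's code; the function name `genotype_posteriors` and all numbers are hypothetical and chosen purely for illustration.

```python
def genotype_posteriors(likelihoods, priors):
    """Combine Pr(reads | genotype) with p(genotype) using Bayes' rule.

    Both arguments map a genotype label to a probability. The result
    maps each genotype to Pr(genotype | reads) and sums to 1.
    """
    unnormalized = {g: likelihoods[g] * priors[g] for g in likelihoods}
    total = sum(unnormalized.values())
    if total <= 0:
        raise ValueError("likelihoods and priors give zero total probability")
    return {g: value / total for g, value in unnormalized.items()}


# Illustrative values only; these are not glfSingle's actual priors.
likelihoods = {"C/C": 1e-6, "C/T": 1e-2, "T/T": 1e-5}
priors = {"C/C": 0.998, "C/T": 0.0015, "T/T": 0.0005}

posteriors = genotype_posteriors(likelihoods, priors)
best = max(posteriors, key=posteriors.get)  # most probable genotype
print(best, posteriors[best])
```

In this toy example the heterozygote is chosen even though its prior is small, because the reads support it far more strongly than either homozygote.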
| + | |
| + | Ingredients that go into the prior: |
| + | *All sites have an equal probability of showing polymorphism: |
| + | **P(non-reference base) = 0.001 |
| + | *When a site shows polymorphism, it is usually heterozygous: |
| + | **P(non-reference heterozygote) = 0.01 * 2/3 |
| + | **P(non-reference homozygote) = 0.01 * 1/3 |
| + | *Mutation model: Transitions (C <-> T or A <-> G) account for most variants, while transversions account for a minority of variants |
| + | **the transition has probability 2/3 |
| + | **each of the two transversions has probability 1/6 |
| + | |
| + | *'''New implementation''': Alternative mutation model with uniform (uninformative) prior for transition to transversion ratio |
| + | **updated by Yancy Lo, 9/24/2012 |
| + | **each mutation has a 1/3 probability |
| + | **add --uniformTsTv to the command line to enable this alternative mutation model |
| + | **download glfSingle with this new implementation here: [[File:Generic-glfSingle-2013-09-25.tar.gz]] |
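The two mutation models listed above differ only in how probability is split among the three possible non-reference bases. The Python sketch below tabulates those weights as stated on this page. It is an illustration, not glfSingle's implementation; the names `TRANSITION_PARTNER` and `alt_base_weights` are hypothetical, and the `uniform_ts_tv` argument merely mirrors the --uniformTsTv option.

```python
# Transition partner of each base: C <-> T and A <-> G.
TRANSITION_PARTNER = {"A": "G", "G": "A", "C": "T", "T": "C"}


def alt_base_weights(ref_base, uniform_ts_tv=False):
    """Return the prior weight of each non-reference base.

    Default model: the transition gets 2/3, each transversion gets 1/6.
    Uniform model: each of the three alternative bases gets 1/3.
    """
    alts = [base for base in "ACGT" if base != ref_base]
    if uniform_ts_tv:
        return {base: 1 / 3 for base in alts}
    return {
        base: 2 / 3 if base == TRANSITION_PARTNER[ref_base] else 1 / 6
        for base in alts
    }


print(alt_base_weights("C"))                      # transition C -> T favoured
print(alt_base_weights("C", uniform_ts_tv=True))  # all alternatives equal
```

Under either model the three weights sum to 1, so they can be multiplied by the overall probability of a non-reference genotype without further rescaling.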
| + | |
| + | == Download == |
| + | |
| + | For the current version of glfSingle, please go to [http://www.sph.umich.edu/csg/abecasis/glfTools/ our GLF Tools Website]. |
| + | |
| + | == TODO == |
| + | |
| + | Support for X chromosome variant calling. |
| + | |
| + | Support for a two pass depth filter that uses the data to automatically work out appropriate filtering thresholds. |