Difference between revisions of "Relationship between Ploidy, Alleles and Genotypes"

From Genome Analysis Wiki

Revision as of 10:36, 31 January 2015

Introduction

The VCF format encodes each genotype by its index in the enumeration of all genotypes possible for a given ploidy and set of alleles. Ploidy and alleles are independent of one another, while the genotypes are a function of both.

Motivation

Explicit formulas for the haploid and diploid cases are easy to find online, but closed forms for the general case seem hard to come by. This wiki page fills that need. Such extensions are required when pooled samples are studied or when working with plant species that exhibit a wide range of ploidy.

The number of genotypes given a ploidy and alleles
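
A minimal sketch of the count, assuming a genotype is an unordered collection of P alleles drawn with repetition from A alleles, so that the number of genotypes is the multiset coefficient C(P + A - 1, P). The function name `num_genotypes` is illustrative and not taken from this wiki.

```python
from math import comb


def num_genotypes(ploidy: int, num_alleles: int) -> int:
    """Count unordered genotypes of size `ploidy` over `num_alleles` alleles.

    Assumption: genotypes are multisets, so the count is C(P + A - 1, P).
    """
    if ploidy < 1 or num_alleles < 1:
        raise ValueError("ploidy and num_alleles must both be at least 1")
    return comb(ploidy + num_alleles - 1, ploidy)


if __name__ == "__main__":
    print(num_genotypes(1, 4))  # haploid: one genotype per allele -> 4
    print(num_genotypes(2, 4))  # diploid: A(A+1)/2 -> 10
```

For P = 1 this reduces to A, and for P = 2 to A(A+1)/2, which matches the haploid and diploid cases.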

The indexing of genotypes given a ploidy and alleles


where a_1, a_2, ... are the alleles of the genotype in numeric encoding (0 to A-1), sorted in non-decreasing order (for example AB or ABCCCC). ACB, by contrast, is not ordered.
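
A hedged sketch of the indexing, assuming the genotype ordering used by the VCF specification, under which the index of an ordered genotype a_1 <= a_2 <= ... <= a_P is the sum over k = 1..P of C(k - 1 + a_k, k). The names `genotype_index` and `enumerate_genotypes` are illustrative. The second function lists the genotypes in that same order by brute force, so the closed form can be checked against it.

```python
from itertools import combinations_with_replacement
from math import comb


def genotype_index(alleles):
    """Return the index of a genotype given its numeric alleles (0 to A-1).

    Assumption: index = sum over k = 1..P of C(k - 1 + a_k, k),
    with the alleles sorted so that a_1 <= a_2 <= ... <= a_P.
    """
    ordered = sorted(alleles)  # tolerate unordered input such as ACB
    return sum(comb(k - 1 + a, k) for k, a in enumerate(ordered, start=1))


def enumerate_genotypes(ploidy, num_alleles):
    """List every ordered genotype so that list position equals its index."""
    genotypes = combinations_with_replacement(range(num_alleles), ploidy)
    # Sorting on the reversed tuple makes the last (largest) allele vary
    # slowest, giving the diploid order 0/0, 0/1, 1/1, 0/2, 1/2, 2/2.
    return sorted(genotypes, key=lambda g: g[::-1])


if __name__ == "__main__":
    for position, genotype in enumerate(enumerate_genotypes(2, 3)):
        print(position, genotype, genotype_index(genotype))
```

In the diploid case the sum reduces to a_1 + a_2(a_2 + 1)/2, the familiar diploid index for a genotype with alleles a_1 <= a_2.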

Simple cases

{| class="wikitable"
|-
! scope="col"| Ploidy
! scope="col"| Alleles
! scope="col"| Genotypes
! scope="col"| Comments
|-
| 1
| A
| A
| Simple haploid case
|-
| 2
| A
| 
| Diploid case
|-
| #normalized after gatk
| 0
| 57
| 57 variants from GATK's normalization were left aligned by vt. 6 were biallelic and 51 were multiallelic. Note that 2 variants were changed by GATK but were not completely normalized.
|-
| #normalized after vt
| -
| 0
| No variants processed by vt were further normalized.
|}