Genotype Likelihood based Hardy-Weinberg Test

From Genome Analysis Wiki


Introduction

This page details a Hardy-Weinberg Equilibrium test based on genotype likelihoods in NGS data.

Formulation

Hardy-Weinberg equilibrium (HWE) is expected in a panmictic population. The following likelihood ratio test statistic incorporates genotype uncertainty via genotype likelihoods. P(R_{k}|\textbf{p}) is the probability of observing the reads for individual k assuming that the locus is in HWE. P(R_{k}|\textbf{g}) is the probability of observing the reads for individual k without assuming HWE. Here \textbf{p} is the vector of allele frequencies, which determines the genotype frequencies under HWE, and \textbf{g} is the vector of unconstrained genotype frequencies. G_{i,j} denotes the genotype composed of alleles i and j. k indexes the individuals from 1 to N. P(R_{k}|G_{i,j}) is the genotype likelihood. P(G_{i,j}|\textbf{p}) and P(G_{i,j}|\textbf{g}) are the genotype frequencies estimated with and without the HWE assumption, respectively.



\begin{align}
  L(R|g) & =  \frac{\prod_{k}{P(R_{k}|\textbf{p})}}
                    {\prod_{k}{P(R_{k}|\textbf{g})}} \\
         & =   \frac{\prod_{k}{\sum_{i,j}{P(R_{k}, G_{i,j}|\textbf{p})}}}
                    {\prod_{k}{\sum_{i,j}{P(R_{k}, G_{i,j}|\textbf{g})}}} \\
         & =   \frac{\prod_{k}{\sum_{i,j}{P(R_{k}|G_{i,j})P(G_{i,j}|\textbf{p})}}}
                    {\prod_{k}{\sum_{i,j}{P(R_{k}|G_{i,j})P(G_{i,j}|\textbf{g})}}}
\end{align}
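
The ratio above can be sketched numerically for a biallelic locus. This is an illustrative sketch only, not the vt implementation. It assumes the genotype likelihoods are on a linear scale in an N x 3 array with columns ordered G_{0,0}, G_{0,1}, G_{1,1}, and that the genotype frequencies under both models are estimated with a simple EM algorithm. The page does not state the estimation method, so EM is an assumption here, and all function names are hypothetical.

```python
import numpy as np


def hwe_genotype_freqs(q):
    """Genotype frequencies (G00, G01, G11) implied by HWE for alt-allele frequency q."""
    return np.array([(1.0 - q) ** 2, 2.0 * q * (1.0 - q), q ** 2])


def log_likelihood(gl, geno_freq):
    """sum_k log( sum_{i,j} P(R_k | G_ij) * P(G_ij) ) for linear-scale likelihoods gl."""
    return float(np.sum(np.log(gl @ geno_freq)))


def estimate_genotype_freqs(gl, constrain_hwe, n_iter=200, tol=1e-10):
    """EM estimate of genotype frequencies, with or without the HWE constraint (assumed method)."""
    n = gl.shape[0]
    g = hwe_genotype_freqs(0.5) if constrain_hwe else np.full(3, 1.0 / 3.0)
    for _ in range(n_iter):
        # E-step: posterior genotype probabilities for each individual.
        post = gl * g
        post /= post.sum(axis=1, keepdims=True)
        # M-step: update allele frequency (HWE) or genotype frequencies (unconstrained).
        if constrain_hwe:
            q = (post[:, 1] + 2.0 * post[:, 2]).sum() / (2.0 * n)
            new_g = hwe_genotype_freqs(q)
        else:
            new_g = post.mean(axis=0)
        converged = np.max(np.abs(new_g - g)) < tol
        g = new_g
        if converged:
            break
    return g


def hwe_lrt_statistic(gl):
    """-2 log of the likelihood ratio (HWE model over unconstrained model)."""
    gl = np.asarray(gl, dtype=float)
    g_null = estimate_genotype_freqs(gl, constrain_hwe=True)
    g_alt = estimate_genotype_freqs(gl, constrain_hwe=False)
    return -2.0 * (log_likelihood(gl, g_null) - log_likelihood(gl, g_alt))
```

Calling `hwe_lrt_statistic(gl)` on such an array returns a value near zero when the data are compatible with HWE, and a larger value as the genotype frequencies depart from HWE proportions.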


Under the null hypothesis of HWE, the likelihood ratio test statistic asymptotically follows a chi-square distribution with v degrees of freedom, where n is the number of alleles.


\begin{align}
  -2 \log L(R|g) \sim \chi^2_v, \quad v = \frac{n(n-1)}{2}
\end{align}
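
As a rough illustration of how a p-value follows from this statistic, the sketch below computes the degrees of freedom from the number of alleles and evaluates the upper tail of the chi-square distribution using only the Python standard library. The function names are hypothetical and this is not taken from vt. The tail probability uses the closed-form recurrence for integer degrees of freedom; in practice `scipy.stats.chi2.sf` would be the usual choice.

```python
import math


def hwe_lrt_df(n_alleles):
    """Degrees of freedom v = n(n-1)/2 for n alleles."""
    return n_alleles * (n_alleles - 1) // 2


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Base cases of the regularized upper incomplete gamma function Q(a, y):
    # Q(1, y) = exp(-y) for even df, Q(1/2, y) = erfc(sqrt(y)) for odd df.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence: Q(a + 1, y) = Q(a, y) + y^a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def hwe_lrt_pvalue(statistic, n_alleles):
    """P-value of the statistic -2 log L under the chi-square approximation."""
    return chi2_sf(statistic, hwe_lrt_df(n_alleles))
```

For a biallelic locus (n = 2) this gives v = 1, and for a triallelic locus (n = 3) it gives v = 3.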

Derivation

Hyun.

Implementation

This is implemented in vt.

Maintained by

This page is maintained by Adrian