From Genome Analysis Wiki
=== Introduction ===

This page details a Hardy-Weinberg Equilibrium test based on genotype likelihoods in NGS data.

=== Formulation ===

Hardy-Weinberg equilibrium (HWE) is expected in a panmictic population. The following formulation is a likelihood ratio test that incorporates genotype uncertainty via genotype likelihoods.
<math>P(R_{k}|\textbf{p})</math> is the probability of observing the reads for individual <math>k</math> assuming that the locus is in HWE.
<math>P(R_{k}|\textbf{g})</math> is the probability of observing the reads for individual <math>k</math> assuming that the locus is not in HWE.
<math>G_{i,j}</math> denotes the genotype composed of alleles <math>i</math> and <math>j</math>. <math>k</math> indexes the individuals from <math>1</math> to <math>N</math>.
<math>P(R_{k} |G_{i,j})</math> is the genotype likelihood.
<math>P(G_{i,j}|\textbf{p})</math> and <math>P(G_{i,j}|\textbf{g})</math> are the [[AF|genotype frequencies estimated with and without the HWE assumption]], respectively.


<math>
\begin{align}
L(R|g) & = \frac{\prod_{k}{P(R_{k}|\textbf{p})}}
{\prod_{k}{P(R_{k}|\textbf{g})}} \\
& = \frac{\prod_{k}{\sum_{i,j}{P(R_{k}, G_{i,j}|\textbf{p})}}}
{\prod_{k}{\sum_{i,j}{P(R_{k}, G_{i,j}|\textbf{g})}}} \\
& = \frac{\prod_{k}{\sum_{i,j}{P(R_{k} |G_{i,j} )P(G_{i,j}|\textbf{p})}}}
{\prod_{k}{\sum_{i,j}{P(R_{k} |G_{i,j})P(G_{i,j}|\textbf{g})}}} \\
\end{align}
</math>
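
As a rough illustration of the ratio above, the following Python sketch evaluates <math>-2 \log L(R|g)</math> for a biallelic locus. The function names, the genotype order (0/0, 0/1, 1/1) and the example values are assumptions made for illustration only. The genotype frequencies are taken as given inputs here rather than estimated, and every individual is assumed to have a non-zero likelihood under both sets of frequencies.

```python
import math


def hwe_genotype_freqs(p):
    """Genotype frequencies under HWE for a biallelic locus.

    p is the frequency of allele 0; the genotype order (0/0, 0/1, 1/1)
    is an assumption of this sketch.
    """
    q = 1.0 - p
    return [p * p, 2.0 * p * q, q * q]


def hwe_log_likelihood_ratio(gls, hwe_freqs, free_freqs):
    """Log of the likelihood ratio, summed over individuals.

    gls        : one list of genotype likelihoods P(R_k | G) per individual
    hwe_freqs  : genotype frequencies P(G | p), estimated assuming HWE
    free_freqs : genotype frequencies P(G | g), estimated without HWE
    """
    log_lr = 0.0
    for gl in gls:
        # Marginalise over genotypes: sum of P(R_k | G) * P(G | .)
        numerator = sum(l * f for l, f in zip(gl, hwe_freqs))
        denominator = sum(l * f for l, f in zip(gl, free_freqs))
        log_lr += math.log(numerator) - math.log(denominator)
    return log_lr


def hwe_lrt_statistic(gls, hwe_freqs, free_freqs):
    """Test statistic -2 log L(R|g)."""
    return -2.0 * hwe_log_likelihood_ratio(gls, hwe_freqs, free_freqs)


if __name__ == "__main__":
    # Two individuals with (hypothetical) certain genotypes 0/0 and 1/1.
    example_gls = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    stat = hwe_lrt_statistic(
        example_gls, hwe_genotype_freqs(0.5), [0.5, 0.0, 0.5]
    )
    print(stat)
```

Working with sums of logarithms rather than the raw products avoids numerical underflow when the number of individuals is large.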


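The likelihood ratio test statistic is asymptotically chi-square distributed with <math>v</math> degrees of freedom, where <math>n</math> is the number of alleles. <math>v</math> is the difference between the <math>\frac{n(n+1)}{2}-1</math> free genotype frequencies estimated without the HWE assumption and the <math>n-1</math> free allele frequencies estimated with it.

<math>
\begin{align}
-2 \log L(R|g) \sim \chi^2_v, \quad v = \frac{n(n-1)}{2}
\end{align}
</math>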
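
The degrees of freedom and, for the biallelic case, the resulting p-value can be sketched in Python as follows. The function names are hypothetical. The p-value helper only covers <math>v = 1</math> (two alleles), where the chi-square tail probability reduces to a complementary error function; a multiallelic locus would need a general chi-square survival function such as <code>scipy.stats.chi2.sf</code>.

```python
import math


def hwe_degrees_of_freedom(n_alleles):
    """Degrees of freedom v = n(n-1)/2 for n alleles."""
    return n_alleles * (n_alleles - 1) // 2


def chi2_pvalue_1df(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    Only valid for the biallelic case (v = 1).
    """
    if statistic <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(statistic / 2.0))


if __name__ == "__main__":
    v = hwe_degrees_of_freedom(2)  # biallelic locus
    print(v, chi2_pvalue_1df(3.0))
```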

=== Derivation ===

Hyun.

=== Maintained by ===

This page is maintained by [mailto:atks@umich.edu Adrian]