Changes

From Genome Analysis Wiki
2,329 bytes removed, 13:38, 4 June 2013
Line 1: Line 1: −
=== Estimation of Genotype Frequencies without assuming HWE ===
+
#REDIRECT [[Genotype_Likelihood_based_Allele_Frequency]]
 
  −
We propose an EM algorithm to estimate genotype frequencies without assuming Hardy-Weinberg equilibrium (HWE). The posterior probability of genotype <math>G_{i,j}</math> given the reads <math>R_k</math> of individual <math>k</math> at the <math>l</math>th iteration is given by:
  −
 
  −
<math>
  −
  \begin{align}
  −
P(G_{i,j}|R_{k})^{(l)}=\frac{P(R_{k}|G_{i,j})P(G_{i,j})^{(l-1)}}{\sum_{(i,j)}{P(R_{k}|G_{i,j})P(G_{i,j})^{(l-1)}}}
  −
  \end{align}
  −
</math>
  −
 
  −
where <math>G_{i,j}</math> denotes the genotype composed of alleles <math>i</math> and <math>j</math>, and <math>k</math> indexes the individuals from <math>1</math> to <math>N</math>.
  −
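The initial genotype probability is uniform over the <math>n(n+1)/2</math> possible unordered genotypes, where <math>n</math> is the number of alleles: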
  −
 
  −
<math>
  −
  \begin{align}
  −
P(G_{i,j})^{(0)} = f_{i,j}^{(0)} = \frac{2}{n(n+1)}
  −
  \end{align}
  −
</math>
  −
 
  −
The E step sets the expected count of genotype <math>G_{i,j}</math> for individual <math>k</math> equal to its posterior probability:
  −
  −
<math>
  −
  \begin{align}
  −
E[G_{i,j}|R_{k}]^{(l)}=P(G_{i,j}|R_{k})^{(l)}
  −
  \end{align}
  −
</math>
  −
 
  −
The M step estimates each genotype frequency as the mean of the per-individual expected genotype counts:
  −
  −
<math>
  −
  \begin{align}
  −
P(G_{i,j})^{(l)} = f_{i,j}^{(l)} = \frac{1}{N}\sum_{k}{E[G_{i,j}|R_{k}]}^{(l)}
  −
  \end{align}
  −
</math>
  −
 
  −
The E and M steps are repeated until an appropriate convergence criterion is met.
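The iteration above can be sketched in Python as follows. This is a minimal illustration, not the wiki authors' implementation: the function name <code>em_genotype_freqs</code>, the layout of the likelihood matrix, the tolerance, and the stopping rule (maximum absolute change in the frequencies) are all assumptions made for the example. It also assumes every individual has at least one genotype with non-zero likelihood.

```python
import numpy as np

def em_genotype_freqs(likelihoods, max_iter=1000, tol=1e-8):
    """Estimate genotype frequencies without assuming HWE.

    likelihoods[k, g] = P(R_k | G_g) for individual k and genotype g
    (hypothetical layout: one column per unordered genotype).
    """
    likelihoods = np.asarray(likelihoods, dtype=float)
    n_genotypes = likelihoods.shape[1]

    # Uniform start: 1 / (number of genotypes) = 2 / (n(n+1)) for n alleles.
    freqs = np.full(n_genotypes, 1.0 / n_genotypes)

    for _ in range(max_iter):
        # E step: posterior P(G | R_k) for each individual (rows sum to 1).
        weighted = likelihoods * freqs
        posterior = weighted / weighted.sum(axis=1, keepdims=True)

        # M step: mean of the expected genotype counts over individuals.
        new_freqs = posterior.mean(axis=0)

        # Assumed convergence criterion: largest change below tol.
        converged = np.max(np.abs(new_freqs - freqs)) < tol
        freqs = new_freqs
        if converged:
            break

    return freqs
```

For a biallelic site the columns would correspond to the three genotypes (for example 0/0, 0/1, 1/1), so <code>likelihoods</code> has shape <code>(N, 3)</code>.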
  −
 
  −
=== Estimation of Genotype Frequencies assuming HWE ===
  −
 
  −
To estimate allele frequencies under the HWE assumption, the E step computes, for each individual <math>k</math>, the expected posterior proportion of allele <math>i</math> (written <math>I</math>) carried by that individual:
  −
 
  −
<math>
  −
  \begin{align}
  −
E[I|R_{k}]^{(l)}=P(G_{i,i}|R_{k})^{(l)} + 0.5\sum_{j \ne i}{P(G_{i,j}|R_{k})^{(l)}}
  −
  \end{align}
  −
</math>
  −
 
  −
In the M step, the allele frequencies are estimated as the mean of the expected allele proportions obtained in the E step, and the genotype frequencies are then derived from the allele frequencies assuming HWE.
  −
 
  −
<math>
  −
  \begin{align}
  −
P(I)^{(l)} =  \frac{1}{N}\sum_{k}{E[I|R_{k}]}^{(l)}
  −
  \end{align}
  −
</math>
  −
 
  −
<math>
  −
  P(G_{i,j})^{(l)}  = \begin{cases}
  −
                      (P(I)^{(l)})^2, &  \text{if }i=j \\
  −
          2P(I)^{(l)}P(J)^{(l)},  & \text{if }i \ne j
  −
                                    \end{cases}
  −
</math>
  −
 
  −
The E and M steps are repeated until an appropriate convergence criterion is met.
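A minimal Python sketch of the HWE-constrained iteration is given below. It is an illustration under stated assumptions rather than the authors' code: the function name <code>em_allele_freqs_hwe</code>, the genotype ordering (from <code>itertools.combinations_with_replacement</code>), the uniform starting genotype frequencies, and the stopping rule are all hypothetical choices. Each individual is assumed to have at least one genotype with non-zero likelihood.

```python
import itertools
import numpy as np

def em_allele_freqs_hwe(likelihoods, n_alleles=2, max_iter=1000, tol=1e-8):
    """Estimate allele and genotype frequencies assuming HWE.

    likelihoods[k, g] = P(R_k | G_g), with genotypes ordered as
    (0,0), (0,1), ..., (n-1,n-1) (assumed ordering).
    """
    likelihoods = np.asarray(likelihoods, dtype=float)
    genotypes = list(
        itertools.combinations_with_replacement(range(n_alleles), 2))

    # share[g, a]: proportion of genotype g's two alleles that are a
    # (1 for a homozygote, 0.5 for each allele of a heterozygote).
    share = np.zeros((len(genotypes), n_alleles))
    for g, (i, j) in enumerate(genotypes):
        share[g, i] += 0.5
        share[g, j] += 0.5

    def hwe_genotype_freqs(p):
        # p_i^2 for homozygotes, 2 * p_i * p_j for heterozygotes.
        return np.array([p[i] ** 2 if i == j else 2 * p[i] * p[j]
                         for i, j in genotypes])

    # Assumed start: uniform genotype frequencies.
    geno = np.full(len(genotypes), 1.0 / len(genotypes))
    allele = geno @ share

    for _ in range(max_iter):
        # E step: genotype posteriors, then expected allele proportions.
        weighted = likelihoods * geno
        posterior = weighted / weighted.sum(axis=1, keepdims=True)
        expected = posterior @ share          # shape (N, n_alleles)

        # M step: average over individuals, then impose HWE.
        new_allele = expected.mean(axis=0)
        converged = np.max(np.abs(new_allele - allele)) < tol
        allele = new_allele
        geno = hwe_genotype_freqs(allele)
        if converged:
            break

    return allele, geno
```

The function returns both the allele frequencies and the HWE genotype frequencies derived from them, so the two models can be compared on the same likelihood matrix.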
  −
 
  −
=== Used in ===
  −
 
  −
[[HWEP|Hardy-Weinberg Likelihood Test statistic]] and [[FIC| Inbreeding Coefficient]]
  −
 
  −
=== Derivation ===
  −
 
  −
Adrian with much help from Hyun.
  −
 
  −
=== Maintained by  ===
  −
 
  −
This page is maintained by  [mailto:atks@umich.edu Adrian].
 