Difference between revisions of "FIC"

From Genome Analysis Wiki
Line 1:
- Many data sets consist of individuals from different populations, in the cases of structured populations,
- this usually result in an increased number of homozygotes.
- The inbreeding coefficient FIC is a measure of deviation away from the Hardy Weinberg Equilibrium.
- A value of 0 implies no deviation, a negative value implies an excess of heterozygotes and a positive value implies an excess of homozygotes.
- The following equation gives the estimate of F where the observed genotypes are available.
+ The inbreeding coefficient <math>F_{IC}</math> is a measure of deviation from the Hardy Weinberg Equilibrium in terms of the excess of heterozygotes observed.
+ A value of 0 implies no deviation, a negative value implies an excess of heterozygotes and a positive value implies an excess of homozygotes. <math>F_{IC}</math> ranges from -1 to 1.
+ The following equation gives the estimate of F where the observed genotypes are available. <math>g_{i,j,k}</math> is the genotype composed of alleles <math>i</math> and <math>j</math> for the <math>k</math>th individual. <math>P(G_{i,j}|\textbf{p})</math> is the estimated genotype allele frequency for genotype <math>G_{i,j}</math> under HWE assumption.  <math>I[i \ne j]</math> is an identity function for heterozygote genotypes.
  <math>
  \begin{align}
- F_{IC} & = 1 - \frac{O(Het)}{E(Het)}  \\
-   & =  1 - \frac{\text{No. observed HETs}}{E(Het|\textbf{p)}}  \\
-   & =  1 - \frac{\text{No. observed HETs}}{\sum_{i=1}^{n}{\sum_j{P(Het_j|\textbf{p})}}}  \\
+ F_{IC} & =   1 - \frac{O[Het]}{E[Het|\textbf{p}]}  \\
+   & =  1 - \frac{\sum_{i,j,k}{g_{i,j,k}I[i \ne j]}}{{\sum_{i,j}{P(G_{i,j}|\textbf{p})I[i \ne j]}}}  \\
  \end{align}
  </math>
- The following equation gives the estimate of F where genotype likelihoods are available.
+ The following equation gives the estimate of F where genotype likelihoods are available. <math>P(R_{k} |G_{i,j})</math> is the genotype likelihood.
  <math>
  \begin{align}
- F_{IC} & =  1 - \frac{O(Het)}{E(Het)}  \\
-   & = 1 - \frac{E(Het|R_i, \textbf{p})}{E(Het|\textbf{p})}  \\
-   & = 1 - \frac{\sum_{i=1}^{n}{\sum_j{P(Het_j|R_i , \textbf{p})}}} {\sum_{i=1}^{n}{\sum_{j}{P(Het_j|\textbf{p})}}}    \\
-   & = 1 - \frac{\sum_{i=1}^{n}{\sum_j{\frac{P(R_i|Het_j,\textbf{p})P(Het_j|\textbf{p})}{\sum_{(k,l)}{P(R_i|G_{(k,l)},\textbf{p})P(G_{(k,l)}|\textbf{p})}}}}}
-                {\sum_{i=1}^{n}{\sum_j{P(Het_j|\textbf{p})}}}  \\
+ F_{IC} & =  1 - \frac{O[Het]}{E[Het|\textbf{p}]}  \\
+   & = 1 - \frac{\sum_{i,j,k}{P(G_{i,j}|R_k , \textbf{p})I[i \ne j]}} {{\sum_{i,j}{P(G_{i,j}|\textbf{p})I[i \ne j]}}}    \\
+   & = 1 - \frac{\sum_{i,j,k}{\frac{P(R_k|G_{i,j})P(G_{i,j}|\textbf{p})}{\sum_{i',j'}{P(R_k|G_{i',j'})P(G_{i',j'}|\textbf{p})}}}I[i \ne j]}
+                {\sum_{i,j}{P(G_{i,j}|\textbf{p})I[i \ne j]}}  \\
  \end{align}
  </math>
- where:
- <math>
- \begin{align}
- P(G_{(k,l)}|\textbf{g}) & = & g_{(k,l)}
- \end{align}
- </math>
- <math>
- P(G_{(k,l)}|\textbf{p})  =
- \begin{cases}
- p_k^2, & \text{if }k=l \\
- 2p_kp_l, & \text{if }k \ne l
- \end{cases}
- </math>
+ === Derivation ===
+ Adrian with much help from Hyun.
  === Maintained by  ===
- This page is maintained by  [mailto:atks@umich.edu Adrian] with much help from Hyun.
+ This page is maintained by  [mailto:atks@umich.edu Adrian].

Revision as of 13:33, 11 April 2013

The inbreeding coefficient <math>F_{IC}</math> is a measure of deviation from Hardy-Weinberg equilibrium (HWE), expressed as the deficit of observed heterozygotes relative to the number expected under HWE. A value of 0 implies no deviation, a negative value implies an excess of heterozygotes, and a positive value implies an excess of homozygotes. <math>F_{IC}</math> ranges from -1 to 1.


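The following equation gives the estimate of <math>F_{IC}</math> when the observed genotypes are available. <math>g_{i,j,k}</math> denotes the genotype composed of alleles <math>i</math> and <math>j</math> for the <math>k</math>th individual, counted as 1 if individual <math>k</math> carries that genotype and 0 otherwise. <math>P(G_{i,j}|\textbf{p})</math> is the estimated frequency of genotype <math>G_{i,j}</math> under the HWE assumption given the allele frequencies <math>\textbf{p}</math>, that is <math>p_i^2</math> if <math>i = j</math> and <math>2p_ip_j</math> if <math>i \ne j</math>. <math>I[i \ne j]</math> is an indicator function for heterozygous genotypes. Both sums run over all individuals <math>k</math>, so the numerator is the observed and the denominator the expected number of heterozygotes.

<math>
\begin{align}
F_{IC} & = 1 - \frac{O[Het]}{E[Het|\textbf{p}]} \\
 & = 1 - \frac{\sum_{i,j,k}{g_{i,j,k}I[i \ne j]}}{\sum_{i,j,k}{P(G_{i,j}|\textbf{p})I[i \ne j]}} \\
\end{align}
</math>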
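As a rough illustration of the genotype-based estimate, the sketch below computes one minus the ratio of observed to expected heterozygotes for a single site. The function name `fic_from_genotypes` and the input layout (one allele pair per individual) are assumptions made for this example, and the allele frequencies are estimated here by simple allele counting, which the page does not prescribe.

```python
from collections import Counter


def fic_from_genotypes(genotypes):
    """Estimate F_IC = 1 - O[Het] / E[Het | p] from called genotypes.

    genotypes: list of (allele_i, allele_j) pairs, one per individual.
    Allele frequencies p are estimated by allele counting (an assumption
    of this sketch, not something the page specifies).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")

    # Estimate allele frequencies p from the 2n observed alleles.
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    p = {allele: count / (2 * n) for allele, count in allele_counts.items()}

    # Numerator: observed number of heterozygotes.
    observed_het = sum(1 for a, b in genotypes if a != b)

    # Denominator: expected number of heterozygotes under HWE.
    # Per individual this is sum over i != j of 2 p_i p_j = 1 - sum_i p_i^2.
    expected_het = n * (1.0 - sum(f * f for f in p.values()))
    if expected_het == 0:
        raise ValueError("site is monomorphic; F_IC is undefined")

    return 1.0 - observed_het / expected_het
```

For example, a sample in exact Hardy-Weinberg proportions (25 `AA`, 50 `AB`, 25 `BB`) gives an estimate of 0, a sample made up only of heterozygotes gives -1, and a sample with both alleles present but no heterozygotes gives 1.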

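The following equation gives the estimate of <math>F_{IC}</math> when genotype likelihoods are available. <math>P(R_{k}|G_{i,j})</math> is the genotype likelihood, the probability of the observed data <math>R_k</math> for the <math>k</math>th individual given genotype <math>G_{i,j}</math>. The observed heterozygote count is replaced by the sum of the posterior heterozygote probabilities <math>P(G_{i,j}|R_k, \textbf{p})</math>, which follow from the likelihoods by Bayes' rule with the HWE genotype frequencies <math>P(G_{i,j}|\textbf{p})</math> as priors.

<math>
\begin{align}
F_{IC} & = 1 - \frac{O[Het]}{E[Het|\textbf{p}]} \\
 & = 1 - \frac{\sum_{i,j,k}{P(G_{i,j}|R_k, \textbf{p})I[i \ne j]}}{\sum_{i,j,k}{P(G_{i,j}|\textbf{p})I[i \ne j]}} \\
 & = 1 - \frac{\sum_{i,j,k}{\frac{P(R_k|G_{i,j})P(G_{i,j}|\textbf{p})}{\sum_{i',j'}{P(R_k|G_{i',j'})P(G_{i',j'}|\textbf{p})}}I[i \ne j]}}{\sum_{i,j,k}{P(G_{i,j}|\textbf{p})I[i \ne j]}} \\
\end{align}
</math>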
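The likelihood-based estimate can be sketched in the same way. The example below is restricted to a biallelic site for brevity and assumes the reference allele frequency is already known; the function name `fic_from_likelihoods`, the parameter `p_ref`, and the triple layout (hom-ref, het, hom-alt) are all hypothetical choices for this illustration rather than anything the page defines.

```python
def fic_from_likelihoods(likelihoods, p_ref):
    """Estimate F_IC at a biallelic site from genotype likelihoods.

    likelihoods: list of (L_hom_ref, L_het, L_hom_alt) triples, one per
                 individual, each value being P(R_k | G) on a linear scale.
    p_ref:       frequency of the reference allele (assumed known here).
    """
    if not likelihoods:
        raise ValueError("at least one individual is required")

    # HWE genotype frequencies P(G | p): p^2, 2pq, q^2.
    q = 1.0 - p_ref
    priors = (p_ref * p_ref, 2.0 * p_ref * q, q * q)

    # Denominator: expected number of heterozygotes under HWE.
    expected_het = len(likelihoods) * priors[1]
    if expected_het == 0:
        raise ValueError("site is monomorphic; F_IC is undefined")

    # Numerator: sum over individuals of the posterior P(Het | R_k, p),
    # obtained by Bayes' rule with the HWE frequencies as priors.
    posterior_het = 0.0
    for triple in likelihoods:
        weighted = [lik * prior for lik, prior in zip(triple, priors)]
        norm = sum(weighted)
        if norm == 0:
            raise ValueError("likelihoods are incompatible with p_ref")
        posterior_het += weighted[1] / norm

    return 1.0 - posterior_het / expected_het
```

When every likelihood triple puts all its weight on one genotype, this reduces to the genotype-based estimate. When the likelihoods are uninformative (equal for all three genotypes), the posterior equals the HWE prior and the estimate is 0, since the data then carry no evidence of deviation.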


Derivation

The derivation is by Adrian, with much help from Hyun.


Maintained by

This page is maintained by Adrian.