
From Genome Analysis Wiki
=== Introduction ===

Inbreeding Coefficients are an important statistic in the study of genetic variants. This page details EM algorithms to estimate inbreeding coefficients from genotype likelihoods in NGS data.

=== Formulation ===

The inbreeding coefficient <math>F_{IC}</math> measures deviation from Hardy-Weinberg equilibrium (HWE) as the deficit of observed heterozygotes relative to the number expected under HWE.
A value of 0 implies no deviation, a negative value implies an excess of heterozygotes and a positive value implies an excess of homozygotes. <math>F_{IC}</math> ranges from -1 to 1.
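
As a quick illustration with made-up numbers (not taken from any data set): for a biallelic site with allele frequency 0.5 typed in 100 individuals, HWE predicts <math>2 \times 0.5 \times 0.5 \times 100 = 50</math> heterozygotes. If only 40 heterozygotes are observed,

<math>
F_{IC} = 1 - \frac{40}{50} = 0.2
</math>

which indicates an excess of homozygotes. Observing 60 heterozygotes instead would give <math>F_{IC} = 1 - \frac{60}{50} = -0.2</math>.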


The following equation gives the estimate of <math>F_{IC}</math> when observed genotypes are available. <math>g_{i,j,k}</math> is an indicator that equals 1 if the <math>k</math>th individual carries the genotype composed of alleles <math>i</math> and <math>j</math>, and 0 otherwise. [[AF|<math>P(G_{i,j}|\textbf{p})</math>]] is the [[AF|estimated genotype frequency]] of genotype <math>G_{i,j}</math> under the HWE assumption, given the allele frequencies <math>\textbf{p}</math>. <math>I[i \ne j]</math> is an indicator function for heterozygous genotypes.

<math>
\begin{align}
F_{IC} & = 1 - \frac{O[Het]}{E[Het|\textbf{p}]} \\
& = 1 - \frac{\sum_{i,j,k}{g_{i,j,k}I[i \ne j]}}{\sum_{i,j,k}{P(G_{i,j}|\textbf{p})I[i \ne j]}}
\end{align}
</math>
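
The observed-genotype estimate can be sketched in a few lines of Python. This is only an illustration, not code from this page: the function name <code>inbreeding_from_genotypes</code> and the input layout (one tuple of two allele labels per individual) are hypothetical, and the allele frequencies <math>\textbf{p}</math> are assumed to be estimated by simple allele counting in the same sample.

```python
from collections import Counter


def inbreeding_from_genotypes(genotypes):
    """Estimate F = 1 - O[Het] / E[Het | p] from called genotypes.

    genotypes: list of (allele_i, allele_j) tuples, one per individual.
    Assumption of this sketch: p is estimated by allele counting.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")

    # Allele frequencies p from the 2N observed alleles.
    allele_counts = Counter(a for g in genotypes for a in g)
    freqs = [count / (2 * n) for count in allele_counts.values()]

    # Observed number of heterozygotes: sum of g_{i,j,k} * I[i != j].
    observed_het = sum(1 for i, j in genotypes if i != j)

    # Expected number under HWE: N * (1 - sum of p_a^2).
    expected_het = n * (1.0 - sum(p * p for p in freqs))
    if expected_het == 0:
        raise ValueError("monomorphic site: F is undefined")

    return 1.0 - observed_het / expected_het
```

For example, a sample with 25, 50 and 25 individuals carrying genotypes (0,0), (0,1) and (1,1) is exactly in HWE proportions, so the function returns 0; a sample made up only of the two homozygotes in equal numbers returns 1.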

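The following equation gives the estimate of <math>F_{IC}</math> when only genotype likelihoods are available. <math>P(R_{k} |G_{i,j})</math> is the genotype likelihood for individual <math>k</math> given genotype <math>G_{i,j}</math>, that is, the probability of observing the reads <math>R_k</math> in individual <math>k</math> if <math>G_{i,j}</math> is the true underlying genotype at that locus. The observed heterozygote count is replaced by the sum of posterior heterozygote probabilities <math>P(G_{i,j}|R_k, \textbf{p})</math>, obtained from the genotype likelihoods and the HWE genotype frequencies by Bayes' rule.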

<math>
\begin{align}
F_{IC} & = 1 - \frac{O[Het]}{E[Het|\textbf{p}]} \\
& = 1 - \frac{\sum_{i,j,k}{P(G_{i,j}|R_k , \textbf{p})I[i \ne j]}}{\sum_{i,j,k}{P(G_{i,j}|\textbf{p})I[i \ne j]}} \\
& = 1 - \frac{\sum_{i,j,k}{\frac{P(R_k|G_{i,j})P(G_{i,j}|\textbf{p})}{\sum_{i',j'}{P(R_k|G_{i',j'})P(G_{i',j'}|\textbf{p})}}}I[i \ne j]}
{\sum_{i,j,k}{P(G_{i,j}|\textbf{p})I[i \ne j]}}
\end{align}
</math>
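
The genotype-likelihood version can be sketched the same way for the biallelic case. Again this is a hedged illustration rather than the implementation behind this page: the function name <code>inbreeding_from_likelihoods</code>, the per-individual likelihood triple (homozygous reference, heterozygous, homozygous alternate) and the argument <code>p_alt</code> are hypothetical, and the allele frequency is assumed to be supplied by the caller (for instance from the [[AF]] estimate).

```python
def inbreeding_from_likelihoods(gls, p_alt):
    """Estimate F from genotype likelihoods at a biallelic site.

    gls:   list of (L_hom_ref, L_het, L_hom_alt) tuples, one per individual,
           where each entry is P(R_k | G) on the linear (not log) scale.
    p_alt: alternate allele frequency, assumed to be given.
    """
    q = 1.0 - p_alt
    # HWE genotype frequencies P(G | p).
    priors = (q * q, 2.0 * p_alt * q, p_alt * p_alt)

    # Expected number of heterozygotes under HWE across all individuals.
    expected_het = len(gls) * priors[1]
    if expected_het == 0:
        raise ValueError("expected heterozygosity is zero: F is undefined")

    # Sum of posterior heterozygote probabilities P(G_het | R_k, p).
    observed_het = 0.0
    for likelihoods in gls:
        joint = [lk * pr for lk, pr in zip(likelihoods, priors)]
        norm = sum(joint)
        if norm == 0:
            raise ValueError("all genotype likelihoods are zero")
        observed_het += joint[1] / norm

    return 1.0 - observed_het / expected_het
```

With completely uninformative likelihoods such as (1, 1, 1) for every individual, the posterior equals the HWE prior and the estimate is 0. As the likelihoods become sharper, the result approaches the estimate obtained from called genotypes.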

=== Derivation ===

Derived by Adrian with much help from Hyun.

=== Maintained by ===

This page is maintained by [mailto:atks@umich.edu Adrian].