From Genome Analysis Wiki
17 bytes added, 11:16, 11 March 2014
Line 7: |
| we use a variance component model to handle familial relationships. In a sample of n individuals, we model the observed phenotype vector ('''y''') as a sum of covariate effects (specified by a design matrix '''X''' and a vector of covariate effects '''β'''), additive genetic effects (modeled in vector '''g''') and non-shared environmental effects (modeled in vector '''ε'''). Thus the null model is: | | we use a variance component model to handle familial relationships. In a sample of n individuals, we model the observed phenotype vector ('''y''') as a sum of covariate effects (specified by a design matrix '''X''' and a vector of covariate effects '''β'''), additive genetic effects (modeled in vector '''g''') and non-shared environmental effects (modeled in vector '''ε'''). Thus the null model is: |
| | | |
− | <math>\mathbf{y}=\mathbf{X}\beta +\mathbf{g}+ \boldsymbol{\varepsilon}</math> | + | <math>\mathbf{y}=\mathbf{X}\boldsymbol{\beta} +\mathbf{g}+ \boldsymbol{\varepsilon}</math> |
| | | |
| We assume that genetic effects are normally distributed, with mean <math>\mathbf{0}</math> and covariance <math>\mathbf{K}\sigma_g^2</math> where the matrix <math>\mathbf{K}</math> summarizes kinship coefficients between sampled individuals and <math>\sigma_g^2</math> is a positive scalar describing the genetic contribution to the overall variance. We assume that non-shared environmental effects are normally distributed with mean <math>\mathbf{0}</math> and covariance <math>\mathbf{I}\sigma_e^2</math>, where <math>\mathbf{I}</math> is the identity matrix. | | We assume that genetic effects are normally distributed, with mean <math>\mathbf{0}</math> and covariance <math>\mathbf{K}\sigma_g^2</math> where the matrix <math>\mathbf{K}</math> summarizes kinship coefficients between sampled individuals and <math>\sigma_g^2</math> is a positive scalar describing the genetic contribution to the overall variance. We assume that non-shared environmental effects are normally distributed with mean <math>\mathbf{0}</math> and covariance <math>\mathbf{I}\sigma_e^2</math>, where <math>\mathbf{I}</math> is the identity matrix. |
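To make the null model concrete, the following is a minimal sketch that draws one phenotype vector from it, using the covariances exactly as stated in the paragraph above. The function names (`cholesky`, `simulate_null_phenotype`) and all numeric values are hypothetical illustrations, not part of the software described on this page, and the toy kinship matrix is assumed to be positive definite.

```python
import math
import random


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate_null_phenotype(X, beta, K, sigma_g2, sigma_e2, rng):
    """Draw y = X beta + g + eps, with g ~ N(0, K sigma_g2), eps ~ N(0, I sigma_e2)."""
    n = len(X)
    low = cholesky(K)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # sqrt(sigma_g2) * L z has covariance sigma_g2 * L L^T = K sigma_g2.
    g = [math.sqrt(sigma_g2) * sum(low[i][k] * z[k] for k in range(i + 1))
         for i in range(n)]
    # Independent environmental effects with variance sigma_e2 each.
    eps = [math.sqrt(sigma_e2) * rng.gauss(0.0, 1.0) for _ in range(n)]
    fixed = [sum(x * b for x, b in zip(row, beta)) for row in X]
    return [fixed[i] + g[i] + eps[i] for i in range(n)]


# Toy inputs (assumed values): intercept plus one covariate, three individuals,
# the first two of whom are related.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
beta = [2.0, 0.5]
K = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = simulate_null_phenotype(X, beta, K, 0.4, 0.6, random.Random(0))
```

Setting both variance components to zero returns the covariate effects alone, which is a quick way to check the fixed-effect part of the sketch.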
| | | |
| To estimate <math>\mathbf{K}</math>, we either use known pedigree structure to define <math>\mathbf{K}</math> or else use the empirical estimator <math>\mathbf{K}=\frac{1}{l}\sum_{i=1}^l{(G_i-2f_i\mathbf{1})(G_i-2f_i\mathbf{1})^{T}\over 4f_i(1-f_i)} </math>, | | To estimate <math>\mathbf{K}</math>, we either use known pedigree structure to define <math>\mathbf{K}</math> or else use the empirical estimator <math>\mathbf{K}=\frac{1}{l}\sum_{i=1}^l{(G_i-2f_i\mathbf{1})(G_i-2f_i\mathbf{1})^{T}\over 4f_i(1-f_i)} </math>, |
− | where <math>l</math> is the count of variants, and <math>G_i</math> and <math>f_i</math> are the genotype vector and estimated allele frequency for the <math>i^{th}</math> variant, respectively. Each element in <math>G_i</math> encodes the minor allele count for one individual. Model parameters <math>\hat{\mathbf{\beta}}</math>, <math>\hat{\sigma_g^2}</math> and <math>\hat{\sigma_e^2}</math> are estimated using maximum likelihood and the efficient algorithm described in Lippert et al. For convenience, let the estimated covariance matrix of <math>\mathbf{y}</math> be <math>\mathbf{\Omega}=2\sigma_g^2\mathbf{K}+\sigma_e^2\mathbf{I}</math>. | + | where <math>l</math> is the count of variants, and <math>G_i</math> and <math>f_i</math> are the genotype vector and estimated allele frequency for the <math>i^{th}</math> variant, respectively. Each element in <math>G_i</math> encodes the minor allele count for one individual. Model parameters <math>\hat{\boldsymbol{\beta}}</math>, <math>\hat{\sigma_g^2}</math> and <math>\hat{\sigma_e^2}</math> are estimated using maximum likelihood and the efficient algorithm described in Lippert et al. For convenience, let the estimated covariance matrix of <math>\mathbf{y}</math> be <math>\mathbf{\Omega}=2\sigma_g^2\mathbf{K}+\sigma_e^2\mathbf{I}</math>. |
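As an illustration of the empirical estimator, here is a minimal sketch that averages the scaled outer products of centered genotype vectors over variants. The function name `empirical_kinship` is hypothetical. Two choices are assumptions made for this sketch rather than details given on this page: the allele frequency of each variant is estimated as the sample mean allele count divided by two, and monomorphic variants, for which the denominator is zero, are skipped and excluded from the variant count.

```python
def empirical_kinship(genotypes):
    """Empirical relatedness matrix from minor allele counts.

    genotypes[i][j] is the minor allele count (0, 1 or 2) of individual j
    at variant i. Returns an n x n matrix as a list of lists.
    """
    n = len(genotypes[0])
    total = [[0.0] * n for _ in range(n)]
    used = 0
    for G in genotypes:
        # Assumption: estimate f_i from the sample itself.
        f = sum(G) / (2.0 * n)
        if f <= 0.0 or f >= 1.0:
            continue  # Assumption: skip monomorphic variants (zero denominator).
        centered = [g - 2.0 * f for g in G]
        denom = 4.0 * f * (1.0 - f)
        # Accumulate the outer product of the centered genotype vector.
        for a in range(n):
            for b in range(n):
                total[a][b] += centered[a] * centered[b] / denom
        used += 1
    if used == 0:
        raise ValueError("no polymorphic variants to estimate relatedness from")
    return [[v / used for v in row] for row in total]


# Toy genotypes (assumed values): three variants scored in three individuals.
K_hat = empirical_kinship([[0, 1, 2], [2, 1, 0], [0, 0, 0]])
```

For real data sets a vectorized matrix product over all variants would be far faster; the explicit loops here only mirror the summation in the formula.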
| | | |
| == Single Variant Score Tests == | | == Single Variant Score Tests == |