Line 1: |
Line 1: |
| + | [[Category:RAREMETALWORKER]] |
| + | ==Useful Links== |
| + | |
| + | Here are some useful links to key pages: |
| + | * The [[RAREMETALWORKER | '''RAREMETALWORKER documentation''']] |
| + | * The [[RAREMETALWORKER_command_reference | '''RAREMETALWORKER command reference''']] |
| + | * The [[RAREMETALWORKER_SPECIAL_TOPICS | '''RAREMETALWORKER special topics''']] |
| + | * The [[Tutorial:_RAREMETAL | '''RAREMETALWORKER quick start tutorial''']] |
| + | * The [[RAREMETAL_method | '''RAREMETAL method''']] |
| + | * The [[RAREMETAL_FAQ | '''FAQ''']] |
| + | |
| == Brief Introduction== | | == Brief Introduction== |
| | | |
Line 10: |
Line 21: |
| We use the following notations to describe our methods: | | We use the following notations to describe our methods: |
| | | |
− | <math>\mathbf{y}</math> is the observed phenotype vector | + | <math>\mathbf{y}</math> is the vector of observed quantitative trait values |
| | | |
| <math>\mathbf{X}</math> is the design matrix | | <math>\mathbf{X}</math> is the design matrix |
Line 26: |
Line 37: |
| <math>\boldsymbol{\varepsilon}</math> is the vector of non-shared environmental effects | | <math>\boldsymbol{\varepsilon}</math> is the vector of non-shared environmental effects |
| | | |
− | ===SUMMARY STATISTICS AND COVARIANCE MATRICES=== | + | <math> \hat{\boldsymbol{\Omega}} </math> is the estimated covariance matrix of <math>\mathbf{y}</math> |
| + | |
| + | <math>\mathbf{K}</math> is the kinship matrix |
| + | |
| + | <math>\mathbf{K_X}</math> is the kinship matrix for chromosome X |
| + | |
| + | <math> \sigma_g^2 </math> is the genetic variance component |
| + | |
| + | <math> {{\sigma_g}_X}^2 </math> is the genetic variance component for markers on chromosome X |
| + | |
| + | <math>\sigma_e^2 </math> is the non-shared environmental variance component |
| + | |
| + | ===SINGLE VARIANT SCORE TEST=== |
| | | |
| We used the following model for the trait: | | We used the following model for the trait: |
Line 32: |
Line 55: |
| <math> \mathbf{y}=\mathbf{X}\boldsymbol{\beta_c}+\beta_i(\mathbf{G_i}-\bar{\mathbf{G_i}})+\mathbf{g}+\boldsymbol{\varepsilon} </math>. | | <math> \mathbf{y}=\mathbf{X}\boldsymbol{\beta_c}+\beta_i(\mathbf{G_i}-\bar{\mathbf{G_i}})+\mathbf{g}+\boldsymbol{\varepsilon} </math>. |
| | | |
− | Here, [explain the formula]. | + | Here, the quantitative trait for an individual is the sum of covariate effects, the additive genetic effect of the <math> i^{th} </math> variant, the polygenic background effect, and the non-shared environmental effect. |
| | | |
| In this model, <math>\beta_i</math> measures the additive genetic effect of the <math>i^{th}</math> variant. As usual, the score statistic for testing <math>H_0:\beta_i=0</math> is: | | In this model, <math>\beta_i</math> measures the additive genetic effect of the <math>i^{th}</math> variant. As usual, the score statistic for testing <math>H_0:\beta_i=0</math> is: |
Line 44: |
Line 67: |
| The score test statistic, <math>T_i=(U_i^2)/V_{ii}</math>, is asymptotically distributed as chi-squared with one degree of freedom. The score test p-value is reported in RAREMETALWORKER. | | The score test statistic, <math>T_i=(U_i^2)/V_{ii}</math>, is asymptotically distributed as chi-squared with one degree of freedom. The score test p-value is reported in RAREMETALWORKER. |
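As an illustration of this last step, the sketch below converts a score statistic and its variance into <math>T_i</math> and a p-value. The function name <code>score_test_pvalue</code> and its arguments are hypothetical and are not part of RAREMETALWORKER; the sketch assumes <math>U_i</math> and <math>V_{ii}</math> have already been computed. It relies on the fact that, for one degree of freedom, the chi-squared upper tail at <math>t</math> equals <math>\operatorname{erfc}(\sqrt{t/2})</math>.

```python
import math


def score_test_pvalue(u, v):
    """Return (T, p) for a single-variant score test.

    u: score statistic U_i (assumed to be computed elsewhere)
    v: its variance V_ii, which must be positive
    """
    if v <= 0:
        raise ValueError("V_ii must be positive")
    t = u * u / v  # T_i = U_i^2 / V_ii
    # Upper tail of a chi-squared distribution with 1 degree of freedom:
    # P(chi2_1 > t) = P(|Z| > sqrt(t)) = erfc(sqrt(t / 2))
    p = math.erfc(math.sqrt(t / 2.0))
    return t, p
```

Because only the standard library is used, the same calculation can serve as a quick check on reported p-values without any statistical package.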
| | | |
− | == Summary Statistics and Covariance Matrices== | + | ===SUMMARY STATISTICS AND COVARIANCE MATRICES=== |
| | | |
| RAREMETALWORKER automatically stores the score statistics for each marker ( <math> U_i </math>) together with quality information of that marker, including HWE p-value, call rate, and allele counts. | | RAREMETALWORKER automatically stores the score statistics for each marker ( <math> U_i </math>) together with quality information of that marker, including HWE p-value, call rate, and allele counts. |
| | | |
− | RAREMETALWORKER also stores the covariance matrices (<math> \mathbf{V} </math>) of the score statistics of markers within a window. | + | RAREMETALWORKER also stores the covariance matrices (<math> \mathbf{V} </math>) of the score statistics of markers within a window, whose size can be specified on the command line. |
| | | |
− | == Modeling Relatedness == | + | === MODELING RELATEDNESS === |
− | we use a variance component model to handle familial relationships. The null model is: | + | We use a variance component model to handle familial relationships. We estimate the variance components under the null model: |
| | | |
| <math>\mathbf{y}=\mathbf{X}\boldsymbol{\beta} +\mathbf{g}+ \boldsymbol{\varepsilon}</math> | | <math>\mathbf{y}=\mathbf{X}\boldsymbol{\beta} +\mathbf{g}+ \boldsymbol{\varepsilon}</math> |
Line 62: |
Line 85: |
| <math>\mathbf{K}=\frac{1}{l}\sum_{i=1}^l{(G_i-2f_i\mathbf{1})(G_i-2f_i\mathbf{1})^T\over 4f_i(1-f_i)} </math>, | | <math>\mathbf{K}=\frac{1}{l}\sum_{i=1}^l{(G_i-2f_i\mathbf{1})(G_i-2f_i\mathbf{1})^T\over 4f_i(1-f_i)} </math>, |
| | | |
− | where <math>l</math> is the count of variants, <math>G_i</math> and <math>f_i</math> are the genotype vector and estimated allele frequency for the <math>i^{th}</math> variant, respectively. Each element in <math>G_i</math> encodes the minor allele count for one individual. Model parameters <math>\hat{\boldsymbol{\beta}}</math>, <math>\hat{\sigma_g^2}</math> and <math>\hat{\sigma_e^2}</math>, are estimated using maximum likelihood and the efficient algorithm described in [http://www.nature.com/nmeth/journal/v8/n10/full/nmeth.1681.html Lippert et. al]. For convenience, let the estimated covariance matrix of <math>\mathbf{y}</math> be <math>\hat{\boldsymbol{\Omega}}=2\hat{\sigma_g^2}\mathbf{K}+\hat{\sigma_e^2}\mathbf{I}</math>. | + | where <math>l</math> is the number of variants, and <math>G_i</math> and <math>f_i</math> are the genotype vector and estimated allele frequency for the <math>i^{th}</math> variant, respectively. Each element in <math>G_i</math> encodes the minor allele count for one individual. The model parameters <math>\hat{\boldsymbol{\beta}}</math>, <math>\hat{\sigma_g^2}</math> and <math>\hat{\sigma_e^2}</math> are estimated by maximum likelihood using the efficient algorithm described in [http://www.nature.com/nmeth/journal/v8/n10/full/nmeth.1681.html Lippert et al.]. For convenience, let the estimated covariance matrix of <math>\mathbf{y}</math> be <math>\hat{\boldsymbol{\Omega}}=\hat{\sigma_g^2}\mathbf{K}+\hat{\sigma_e^2}\mathbf{I}</math>. |
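The empirical kinship formula and the resulting covariance matrix can be sketched in a few lines. This is a minimal illustration, not RAREMETALWORKER's implementation: the function names <code>kinship_matrix</code> and <code>omega</code> are hypothetical, genotypes are assumed to be complete allele counts (0, 1 or 2), allele frequencies are assumed to lie strictly between 0 and 1, and the variance components are taken as given rather than estimated. The covariance follows the revised form <math>\hat{\boldsymbol{\Omega}}=\hat{\sigma_g^2}\mathbf{K}+\hat{\sigma_e^2}\mathbf{I}</math>.

```python
def kinship_matrix(genotypes, freqs):
    """Empirical kinship from l variants genotyped on n individuals.

    genotypes: list of l vectors, each holding n allele counts (0, 1 or 2)
    freqs:     list of l allele frequencies, each strictly between 0 and 1
    """
    l = len(genotypes)
    n = len(genotypes[0])
    K = [[0.0] * n for _ in range(n)]
    for g, f in zip(genotypes, freqs):
        centered = [x - 2.0 * f for x in g]   # G_i - 2 f_i 1
        scale = 4.0 * f * (1.0 - f)           # 4 f_i (1 - f_i)
        for a in range(n):
            for b in range(n):
                # Outer product of the centered genotype vector with itself
                K[a][b] += centered[a] * centered[b] / scale
    return [[value / l for value in row] for row in K]


def omega(K, sigma_g2, sigma_e2):
    """Covariance of y: sigma_g2 * K + sigma_e2 * I (variance components given)."""
    n = len(K)
    return [
        [sigma_g2 * K[a][b] + (sigma_e2 if a == b else 0.0) for b in range(n)]
        for a in range(n)
    ]
```

For example, a single variant with allele counts <code>[0, 1, 2]</code> and frequency 0.5 has centered genotypes <code>[-1, 0, 1]</code>, so the two homozygous individuals get a kinship entry of -1 with each other and 1 with themselves.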
| | | |
− | ==Chromosome X== | + | ===ANALYZING MARKERS ON CHROMOSOME X=== |
| | | |
− | To analyze markers on chromosome X, we fit an extra variance components <math> {{\sigma_g}_X}^2 </math>, to model the variance explained by chromosome X. A kinship for chromosome X, <math> \boldsymbol{K_X} </math>, can be estimated either from a pedigree, or from genotypes of marker from chromosome X. Then the estimated covariance matrix can be written as <math>\hat{\boldsymbol{\Omega}}=2\hat{\sigma_g^2}\mathbf{K}+2\hat{{\sigma_g}_X^2}\mathbf{K_X}+\hat{\sigma_e^2}\mathbf{I}</math>. | + | To analyze markers on chromosome X, we fit an extra variance component, <math> {{\sigma_g}_X}^2 </math>, to model the variance explained by chromosome X. A kinship matrix for chromosome X, <math> \mathbf{K_X} </math>, can be estimated either from a pedigree or from genotypes of markers on chromosome X. The estimated covariance matrix can then be written as <math>\hat{\boldsymbol{\Omega}}=\hat{\sigma_g^2}\mathbf{K}+\hat{{\sigma_g}_X^2}\mathbf{K_X}+\hat{\sigma_e^2}\mathbf{I}</math>. |
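The three-component covariance for chromosome X can be assembled the same way. The sketch below is an illustration only: the function name <code>covariance_with_chr_x</code> is hypothetical, both kinship matrices are assumed to be square and of the same size, and the variance components are assumed to have been estimated already. It follows the revised form of the formula, without the factor of 2.

```python
def covariance_with_chr_x(K, K_X, sigma_g2, sigma_gx2, sigma_e2):
    """Return sigma_g2 * K + sigma_gx2 * K_X + sigma_e2 * I.

    K:   autosomal kinship matrix (n x n, list of lists)
    K_X: chromosome X kinship matrix (n x n, list of lists)
    """
    n = len(K)
    if len(K_X) != n:
        raise ValueError("K and K_X must have the same dimensions")
    return [
        [
            sigma_g2 * K[a][b]
            + sigma_gx2 * K_X[a][b]
            + (sigma_e2 if a == b else 0.0)  # identity contributes on the diagonal only
            for b in range(n)
        ]
        for a in range(n)
    ]
```

Setting <code>sigma_gx2</code> to zero recovers the autosomal covariance <math>\hat{\sigma_g^2}\mathbf{K}+\hat{\sigma_e^2}\mathbf{I}</math>.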