# Changes

13:28, 20 May 2019
Tag category
Line 1: Line 1:
==INTRODUCTION==

==INTRODUCTION==
The key idea behind meta-analysis with RAREMETAL is that various gene-level test statistics can be reconstructed from single variant score statistics and that, when the linkage disequilibrium relationships between variants are known, the distribution of these gene-level statistics can be derived and used to evaluate significance. Single variant statistics are calculated using the Cochran-Mantel-Haenszel method. The main formulae are tabulated in the following:
+
The key idea behind meta-analysis with RAREMETAL is that various gene-level test statistics can be reconstructed from single variant score statistics and that, when the linkage disequilibrium relationships between variants are known, the distribution of these gene-level statistics can be derived and used to evaluate significance. Single variant statistics are calculated using the Cochran-Mantel-Haenszel method. Our method has been published in [http://www.ncbi.nlm.nih.gov/pubmed/24336170 '''Liu et al.''']. The main formulae are tabulated in the following:

==KEY FORMULAE==

==KEY FORMULAE==
Line 18: Line 18:
$S$ is the number of studies

$S$ is the number of studies
+
+
$f_{i}$ is the pooled allele frequency of the $i^{th}$ variant
+
+
$f_{i,k}$ is the allele frequency of the $i^{th}$ variant in the $k^{th}$ study
+
+
${\delta_{k}}$ is the deviation of the trait value of the $k^{th}$ study

$\mathbf{w^T} = (w_1,w_2,...,w_m)^T$ is the vector of weights for $m$ rare variants in a gene.

$\mathbf{w^T} = (w_1,w_2,...,w_m)^T$ is the vector of weights for $m$ rare variants in a gene.
Line 28: Line 34:
and its variance

and its variance
−
$V_{meta_i}=\sum_{k=1}^S{V_{ii,k}}$
+
$V_{meta_i}=\sum_{k=1}^S{V_{ii,k}}$.

Then the score test statistic for the $i^{th}$ variant, $T_{meta_i}$, asymptotically follows a standard normal distribution

Then the score test statistic for the $i^{th}$ variant, $T_{meta_i}$, asymptotically follows a standard normal distribution
−
$T_{meta_i}=U_{meta_i}\bigg/\sqrt{V_{meta_i}}=\sum_{k=1}^S {U_{i,k}}\bigg/\sqrt{\sum_{k=1}^S{V_{ii,k}}} \sim\mathbf{N}(0,1)$
+
$T_{meta_i}=U_{meta_i}\bigg/\sqrt{V_{meta_i}}=\sum_{k=1}^S {U_{i,k}}\bigg/\sqrt{\sum_{k=1}^S{V_{ii,k}}} \sim\mathbf{N}(0,1)$.
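
The single variant formula can be illustrated with a minimal sketch. It assumes the per-study score statistics $U_{i,k}$ and variances $V_{ii,k}$ for one variant are already available as plain lists; the function name `single_variant_meta` and the two-sided p-value are illustrative choices, not part of RAREMETAL itself.

```python
import math


def single_variant_meta(scores, variances):
    """Combine per-study scores U_{i,k} and variances V_{ii,k} for one variant.

    Hypothetical helper for illustration only.
    Returns (U_meta, V_meta, T_meta, two-sided p-value under N(0,1)).
    """
    if len(scores) != len(variances):
        raise ValueError("scores and variances must have one entry per study")
    u_meta = sum(scores)      # U_meta_i = sum over studies of U_{i,k}
    v_meta = sum(variances)   # V_meta_i = sum over studies of V_{ii,k}
    if v_meta <= 0:
        raise ValueError("pooled variance must be positive")
    t_meta = u_meta / math.sqrt(v_meta)
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|t|)).
    p_value = math.erfc(abs(t_meta) / math.sqrt(2.0))
    return u_meta, v_meta, t_meta, p_value


if __name__ == "__main__":
    # Three made-up studies contributing to the same variant.
    print(single_variant_meta([3.0, 1.0, 2.0], [4.0, 5.0, 7.0]))
```

Only the sums of scores and variances are needed, so each study can share summary statistics rather than individual-level data.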
+

+

+
'''Optimized method for unbalanced studies (--useExact)''':
+

+
$U_{meta_i}=\sum_{k=1}^S {U_{i,k}/\hat{\Omega_{k}}}-\sum_{k=1}^S{2n_{k}{\delta_{k}^{2}(f_{i}-f_{i,k})}}$
+

+
$V_{meta_i}={\sigma^{2}}\sum_{k=1}^S{(V_{ii,k}{\Omega_{k}}-4n_{k}(ff'-f_{k}f_{k}'))}$
+

+
${\sigma^{2}}=\sum_{k=1}^S{((n_{k}-1){\Omega_{k}}+n_{k}{\delta_{k}^{2}})}/(n-1)$

===BURDEN META ANALYSIS===

===BURDEN META ANALYSIS===
−
The burden test has been shown to be powerful for detecting a group of rare variants whose effects are unidirectional. Once single variant meta-analysis statistics are constructed, the burden test score statistic can be easily reconstructed as
+
The burden test has been shown to be powerful for detecting a group of rare variants whose effects are unidirectional. Once single variant meta-analysis statistics are constructed, the burden test score statistic for a gene can be easily reconstructed as
+

+
$T_{meta_{burden}}=\mathbf{w^TU_{meta}}\bigg/\sqrt{\mathbf{w^TV_{meta}w}} \sim\mathbf{N}(0,1)$,
−
$T_{meta_{burden}}=\mathbf{w^TU_{meta}}\bigg/\sqrt{\mathbf{w^TV_{meta}w}} \sim\mathbf{N}(0,1)$.
+
where $\mathbf{U_{meta}} = (U_{meta_1},U_{meta_2},...,U_{meta_m})^T$ and $\mathbf{V_{meta}}=cov(\mathbf{U_{meta}})$, representing a vector of single variant meta-analysis scores of $m$ variants in a gene and the covariance matrix of the scores across $m$ variants.
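
As a rough illustration of the burden statistic, the sketch below evaluates $\mathbf{w^TU_{meta}}\big/\sqrt{\mathbf{w^TV_{meta}w}}$ with plain Python lists. The function name `burden_meta` and the toy inputs are assumptions made for this example; $\mathbf{V_{meta}}$ is passed as a list of rows.

```python
import math


def burden_meta(weights, u_meta, v_meta):
    """Burden score statistic w^T U / sqrt(w^T V w) for one gene.

    Hypothetical helper for illustration only.
    weights: w_1..w_m, u_meta: U_meta_1..U_meta_m, v_meta: m x m covariance.
    """
    m = len(weights)
    if len(u_meta) != m or len(v_meta) != m or any(len(row) != m for row in v_meta):
        raise ValueError("weights, scores and covariance must agree on m")
    # Numerator: weighted sum of the single variant meta-analysis scores.
    numerator = sum(w * u for w, u in zip(weights, u_meta))
    # Denominator: square root of the quadratic form w^T V w.
    quad = sum(weights[i] * v_meta[i][j] * weights[j]
               for i in range(m) for j in range(m))
    if quad <= 0:
        raise ValueError("w^T V w must be positive")
    return numerator / math.sqrt(quad)


if __name__ == "__main__":
    # Two variants with equal weights and a made-up covariance matrix.
    print(burden_meta([1.0, 1.0], [2.0, 1.0], [[2.0, 1.0], [1.0, 3.0]]))
```

The off-diagonal entries of $\mathbf{V_{meta}}$ carry the linkage disequilibrium between variants, which is why the full covariance matrix is needed and not only the per-variant variances.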

===VT META ANALYSIS===

===VT META ANALYSIS===
Line 44: Line 61:
Including variants that are not associated with the phenotype can hurt power. The variable threshold test is designed to choose the optimal allele frequency threshold among rare variants in a gene, to gain power. The test statistic is defined as the maximum burden score statistic calculated using every possible frequency threshold

Including variants that are not associated with the phenotype can hurt power. The variable threshold test is designed to choose the optimal allele frequency threshold among rare variants in a gene, to gain power. The test statistic is defined as the maximum burden score statistic calculated using every possible frequency threshold
−
$T_{VT}=\max(T_{b\left(f_1\right)},T_{b\left(f_2\right)},\dots,T_{b\left(f_m\right)})$,
−
where the burden test statistic under any allele frequency threshold can be constructed from single variant meta-analysis statistics using
+
$T_{meta_{VT}}=\max(T_{b\left(f_1\right)},T_{b\left(f_2\right)},\dots,T_{b\left(f_m\right)})$,
+

+
where $T_{b\left(f_i\right)}$ is the burden test statistic under allele frequency threshold $f_i$, and can be constructed from single variant meta-analysis statistics using
+

$T_{b\left(f_j\right)}=\boldsymbol{\phi}_{f_j}^\mathbf{T}\mathbf{U_{meta}}\bigg/\sqrt{\boldsymbol{\phi}_{f_j}^\mathbf{T}\mathbf{V_{meta}}\boldsymbol{\phi}_{f_j}}$,

$T_{b\left(f_j\right)}=\boldsymbol{\phi}_{f_j}^\mathbf{T}\mathbf{U_{meta}}\bigg/\sqrt{\boldsymbol{\phi}_{f_j}^\mathbf{T}\mathbf{V_{meta}}\boldsymbol{\phi}_{f_j}}$,
+

where $f_j$ represents any allele frequency in a group of rare variants, and $\boldsymbol{\phi}_{f_j}$ is a vector of 0s and 1s indicating whether each variant is included in the analysis using frequency threshold $f_j$.

where $f_j$ represents any allele frequency in a group of rare variants, and $\boldsymbol{\phi}_{f_j}$ is a vector of 0s and 1s indicating whether each variant is included in the analysis using frequency threshold $f_j$.
−
As described by Lin et al., the p-value of this test can be calculated analytically using the fact that the burden test statistics together follow a multivariate normal distribution with mean $\mathbf{0}$ and covariance $\boldsymbol{\Omega}$, written as
+

+
As described by [http://www.ncbi.nlm.nih.gov/pubmed/21885029 '''Lin et al.'''], the p-value of this test can be calculated analytically using the fact that the burden test statistics together follow a multivariate normal distribution with mean $\mathbf{0}$ and covariance $\boldsymbol{\Omega}$, written as
+

$\left(T_{b\left(f_1\right)},T_{b\left(f_2\right)},\dots,T_{b\left(f_m\right)}\right)\sim\mathbf{MVN}\left(\mathbf{0},\boldsymbol{\Omega}\right)$,

$\left(T_{b\left(f_1\right)},T_{b\left(f_2\right)},\dots,T_{b\left(f_m\right)}\right)\sim\mathbf{MVN}\left(\mathbf{0},\boldsymbol{\Omega}\right)$,
−
where $\boldsymbol{\Omega_{ij}}=\frac{\boldsymbol{\phi}_{f_i}^T\left(\sum_{i=1}^n{\mathbf{V_i}}\right)\boldsymbol{\phi}_{f_j}}{\sqrt{\boldsymbol{\phi}_{f_i}^T\left(\sum_{i=1}^n{\mathbf{V_i}}\right)\boldsymbol{\phi}_{f_i}}\sqrt{\boldsymbol{\phi}_{f_j}^T\left(\sum_{i=1}^n{\mathbf{V_i}}\right)\boldsymbol{\phi}_{f_j}}}$
+

+
where $\boldsymbol{\Omega_{ij}}=\frac{\boldsymbol{\phi}_{f_i}^T\mathbf{V_{meta}}\boldsymbol{\phi}_{f_j}}{\sqrt{\boldsymbol{\phi}_{f_i}^T\mathbf{V_{meta}}\boldsymbol{\phi}_{f_i}}\sqrt{\boldsymbol{\phi}_{f_j}^T\mathbf{V_{meta}}\boldsymbol{\phi}_{f_j}}}$.
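
The variable threshold construction can be sketched as follows. This is a minimal illustration, assuming that a variant is included when its pooled allele frequency is at or below the threshold (the inclusion rule is an assumption here) and that $\mathbf{V_{meta}}$ is given as a list of rows. The names `vt_meta`, `threshold_indicator` and `quad_form` are hypothetical. The final p-value requires integrating the multivariate normal density, which is not shown.

```python
import math


def threshold_indicator(freqs, threshold):
    """phi_f: 1 if the variant frequency is <= threshold (assumed rule), else 0."""
    return [1 if f <= threshold else 0 for f in freqs]


def quad_form(a, v, b):
    """Compute a^T V b for list vectors a, b and a list-of-rows matrix v."""
    return sum(a[i] * v[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))


def vt_meta(freqs, u_meta, v_meta):
    """Return (T_VT, thresholds, burden statistics, Omega) for one gene."""
    thresholds = sorted(set(freqs))  # every observed frequency is a candidate
    phis = [threshold_indicator(freqs, t) for t in thresholds]
    # Burden statistic under each threshold: phi^T U / sqrt(phi^T V phi).
    stats = [sum(p * u for p, u in zip(phi, u_meta))
             / math.sqrt(quad_form(phi, v_meta, phi)) for phi in phis]
    # Correlation between burden statistics at thresholds f_i and f_j.
    omega = [[quad_form(pa, v_meta, pb)
              / math.sqrt(quad_form(pa, v_meta, pa) * quad_form(pb, v_meta, pb))
              for pb in phis] for pa in phis]
    return max(stats), thresholds, stats, omega


if __name__ == "__main__":
    # Two variants with made-up frequencies, scores and covariance.
    print(vt_meta([0.001, 0.005], [2.0, 4.0], [[4.0, 0.0], [0.0, 5.0]]))
```

Because nested thresholds share variants, the burden statistics are positively correlated, and $\boldsymbol{\Omega}$ captures that correlation when the significance of the maximum is evaluated.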

===SKAT META ANALYSIS===

===SKAT META ANALYSIS===
Line 71: Line 94:       −

+
[[Category:RAREMETAL]]
−
|+'''Formulae for RAREMETAL'''
−
! scope="col" width="120pt" | Test
−
! scope="col" width="50pt" | Statistics
−
! scope="col" width="225pt" | Null Distribution
−
! scope="col" width="225pt" | Notation
−
|-
−
| Single Variant  || $T=\sum_{i=1}^n {U_i}\bigg/\sqrt{\sum_{i=1}^n{V_i}}$ || $T\sim\mathbf{N}(0,1)$ || $U_i \text{ is the score statistic from study }i;$ $V_i \text{ is the variance of } U_i.$
−
|-
−
| un-weighted Burden      || $T_b=\sum_{i=1}^n{\mathbf{U_i}}\Big/\sqrt{\sum_{i=1}^n{\mathbf{V_i}}}$ || $T_b\sim\mathbf{N}(0,1)$ || $\mathbf{U_i}\text{ is the vector of score statistics from study }i\text{, or}$ $\mathbf{U_i}=\{U_{i1},...,U_{im}\};$ $\mathbf{V_i} \text{ is the covariance of } \mathbf{U_i}.$
−
|-
−
| Weighted Burden || $T_{wb}=\mathbf{w^T}\sum_{i=1}^n{\mathbf{U_i}}\bigg/\sqrt{\mathbf{w^T}\left(\sum_{i=1}^n{\mathbf{V_i}}\right)\mathbf{w}}$  || $T_{wb}\sim\mathbf{N}(0,1)$ || $\mathbf{w^T}=\{w_1,w_2,...,w_m\}^T \text{ is the weight vector.}$
−
|-style="height: 50pt;"
−
| VT || $T_{VT}=\max(T_{b\left(f_1\right)},T_{b\left(f_2\right)},\dots,T_{b\left(f_m\right)}),\text{ where}$ $T_{b\left(f_j\right)}=\boldsymbol{\phi}_{f_j}^\mathbf{T}\sum_{i=1}^n{\mathbf{U_i}}\bigg/\sqrt{\boldsymbol{\phi}_{f_j}^\mathbf{T}\left(\sum_{i=1}^n{\mathbf{V_i}}\right)\boldsymbol{\phi}_{f_j}}$ || $\left(T_{b\left(f_1\right)},T_{b\left(f_2\right)},\dots,T_{b\left(f_m\right)}\right)\sim\mathbf{MVN}\left(\mathbf{0},\boldsymbol{\Omega}\right)\text{,}$ $\text{where }\boldsymbol{\Omega_{ij}}=\frac{\boldsymbol{\phi}_{f_i}^T\left(\sum_{i=1}^n{\mathbf{V_i}}\right)\boldsymbol{\phi}_{f_j}}{\sqrt{\boldsymbol{\phi}_{f_i}^T\left(\sum_{i=1}^n{\mathbf{V_i}}\right)\boldsymbol{\phi}_{f_i}}\sqrt{\boldsymbol{\phi}_{f_j}^T\left(\sum_{i=1}^n{\mathbf{V_i}}\right)\boldsymbol{\phi}_{f_j}}}$ ||  $\boldsymbol{\phi}_{f_j}\text{ is a vector of } 0 \text{s and } 1\text{s,}$ $\text{indicating the inclusion of a variant using threshold }f_j;$
−
|-
−
| SKAT || $\mathbf{Q}=\left(\sum_{i=1}^n{\mathbf{U_i^T}}\right) \mathbf{W}\left(\sum_{i=1}^n{\mathbf{U_i}}\right)$ || $\mathbf{Q}\sim\sum_{i=1}^m{\lambda_i\chi_{1,i}^2},\text{ where}$ $\left(\lambda_1,\lambda_2,\dots,\lambda_m\right)\text{ are eigenvalues of}$ $\left(\sum_{i=1}^n{\mathbf{V_i}}\right)^\frac{1}{2}\mathbf{W}\left(\sum_{i=1}^n{\mathbf{V_i}}\right)^\frac{1}{2}$ || $\mathbf{W}\text{ is a diagonal matrix of weights.}$
−
|}
