# RAREMETAL METHOD

## INTRODUCTION

The key idea behind meta-analysis with RAREMETAL is that various gene-level test statistics can be reconstructed from single variant score statistics and that, when the linkage disequilibrium relationships between variants are known, the distribution of these gene-level statistics can be derived and used to evaluate significance. Single variant statistics are calculated using the Cochran-Mantel-Haenszel method. The main formulae are derived in the sections below and summarized in a table at the end of this page.

## KEY FORMULAE

### NOTATIONS

We use the following notation to describe our methods:

$U_{i,k}$ is the score statistic for the $i^{th}$ variant from the $k^{th}$ study

$V_{ij,k}$ is the covariance between the score statistics of the $i^{th}$ and the $j^{th}$ variants from the $k^{th}$ study

$U_{i,k}$ and $V_{ij,k}$ are described in detail in RAREMETALWORKER method.

$\mathbf {U_{k}}$ is the vector of score statistics of rare variants in a gene from the $k^{th}$ study.

$\mathbf {V_{k}}$ is the variance-covariance matrix of score statistics of rare variants in a gene from the $k^{th}$ study, or $\mathbf {V_{k}} =cov(\mathbf {U_{k}} )$

$S$ is the number of studies

$\mathbf {w^{T}} =(w_{1},w_{2},...,w_{m})^{T}$ is the vector of weights for $m$ rare variants in a gene.

### SINGLE VARIANT META ANALYSIS

The single variant meta-analysis score statistic can be reconstructed from the score statistics and their variances generated by each study, assuming that samples are unrelated across studies. Define the meta-analysis score statistic as

$U_{meta_{i}}=\sum _{k=1}^{S}{U_{i,k}}$

and its variance as

$V_{meta_{i}}=\sum _{k=1}^{S}{V_{ii,k}}$ .

Then the score test statistic for the $i^{th}$ variant, $T_{meta_{i}}$, asymptotically follows a standard normal distribution:

$T_{meta_{i}}=U_{meta_{i}}{\bigg /}{\sqrt {V_{meta_{i}}}}=\sum _{k=1}^{S}{U_{i,k}}{\bigg /}{\sqrt {\sum _{k=1}^{S}{V_{ii,k}}}}\sim \mathbf {N} (0,1)$ .

### BURDEN META ANALYSIS

The burden test has been shown to be powerful for detecting a group of rare variants whose effects are in the same direction. Once the single variant meta-analysis statistics are constructed, the burden test score statistic can be easily reconstructed as

$T_{meta_{burden}}=\mathbf {w^{T}U_{meta}} {\bigg /}{\sqrt {\mathbf {w^{T}V_{meta}w} }}\sim \mathbf {N} (0,1)$ ,

where $\mathbf {U_{meta}} =\sum _{k=1}^{S}{\mathbf {U_{k}} }$ and $\mathbf {V_{meta}} =\sum _{k=1}^{S}{\mathbf {V_{k}} }$ .
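The single variant and burden statistics above can be sketched in a few lines of Python. This is a minimal illustration and not the RAREMETAL implementation: the function names, the input numbers (two studies, two variants), the unit weights, and the choice of a two-sided p-value are all assumptions made for the example.

```python
import math


def combine_studies(U_by_study, V_by_study):
    """Sum per-study score vectors and covariance matrices over studies."""
    m = len(U_by_study[0])
    U_meta = [sum(U[i] for U in U_by_study) for i in range(m)]
    V_meta = [[sum(V[i][j] for V in V_by_study) for j in range(m)]
              for i in range(m)]
    return U_meta, V_meta


def single_variant_stats(U_meta, V_meta):
    """T_meta_i = U_meta_i / sqrt(V_meta_ii) for each variant i."""
    return [u / math.sqrt(V_meta[i][i]) for i, u in enumerate(U_meta)]


def burden_stat(U_meta, V_meta, w):
    """T_burden = w'U_meta / sqrt(w'V_meta w)."""
    m = len(w)
    numerator = sum(w[i] * U_meta[i] for i in range(m))
    variance = sum(w[i] * V_meta[i][j] * w[j]
                   for i in range(m) for j in range(m))
    return numerator / math.sqrt(variance)


def two_sided_p(t):
    """Two-sided p-value under N(0, 1); the two-sided choice is an assumption."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# Hypothetical summary statistics: two studies, two variants.
U_by_study = [[2.0, 1.0], [1.0, 2.0]]
V_by_study = [[[4.0, 1.0], [1.0, 2.0]],
              [[5.0, 0.5], [0.5, 2.0]]]

U_meta, V_meta = combine_studies(U_by_study, V_by_study)
t_single = single_variant_stats(U_meta, V_meta)
t_burden = burden_stat(U_meta, V_meta, w=[1.0, 1.0])
print(t_single, t_burden, two_sided_p(t_burden))
```

With these inputs the combined scores are $(3, 3)$ with variances $9$ and $4$, so the single variant statistics are $1.0$ and $1.5$, and the unit-weight burden statistic is $6/\sqrt{16}=1.5$.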

### VT META ANALYSIS

Including variants that are not associated with the phenotype can reduce power. The variable threshold (VT) test is designed to choose the optimal allele frequency threshold among the rare variants in a gene in order to gain power. The test statistic is defined as the maximum burden score statistic calculated over every possible frequency threshold

$T_{VT}=\max(T_{b\left(f_{1}\right)},T_{b\left(f_{2}\right)},\dots ,T_{b\left(f_{m}\right)})$ ,

where the burden test statistic under any allele frequency threshold can be constructed from single variant meta-analysis statistics using

$T_{b\left(f_{j}\right)}={\boldsymbol {\phi }}_{f_{j}}^{\mathbf {T} }\mathbf {U_{meta}} {\bigg /}{\sqrt {{\boldsymbol {\phi }}_{f_{j}}^{\mathbf {T} }\mathbf {V_{meta}} {\boldsymbol {\phi }}_{f_{j}}}}$ ,

where $f_{j}$ represents any allele frequency threshold in a group of rare variants, and ${\boldsymbol {\phi }}_{f_{j}}$ is a vector of 0s and 1s indicating whether each variant is included in the analysis using frequency threshold $f_{j}$ .
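The construction of $T_{VT}$ can be sketched as follows. This is an illustrative sketch only: it assumes that the candidate thresholds are the distinct allele frequencies observed in the gene and that a variant is included when its frequency is at or below the threshold. The function names and the numbers are hypothetical.

```python
import math


def burden_at_threshold(U_meta, V_meta, phi):
    """T_b(f) = phi'U_meta / sqrt(phi'V_meta phi) for a 0/1 indicator phi."""
    m = len(phi)
    numerator = sum(phi[i] * U_meta[i] for i in range(m))
    variance = sum(phi[i] * V_meta[i][j] * phi[j]
                   for i in range(m) for j in range(m))
    return numerator / math.sqrt(variance)


def vt_statistic(U_meta, V_meta, maf):
    """Return (T_VT, best threshold, all burden statistics)."""
    # Assumption: candidate thresholds are the distinct observed frequencies.
    thresholds = sorted(set(maf))
    stats = []
    for f in thresholds:
        # Assumption: a variant is included when its frequency is <= f.
        phi = [1 if a <= f else 0 for a in maf]
        stats.append(burden_at_threshold(U_meta, V_meta, phi))
    best = max(range(len(stats)), key=stats.__getitem__)
    return stats[best], thresholds[best], stats


# Hypothetical meta-analysis summary statistics for three rare variants.
U_meta = [3.0, 3.0, -1.0]
V_meta = [[9.0, 1.5, 0.0],
          [1.5, 4.0, 0.0],
          [0.0, 0.0, 3.0]]
maf = [0.001, 0.005, 0.01]

t_vt, best_f, all_stats = vt_statistic(U_meta, V_meta, maf)
print(t_vt, best_f, all_stats)
```

Here the three thresholds give burden statistics of $1.0$, $1.5$ and $5/\sqrt{19}$, so the maximum $T_{VT}=1.5$ is reached at the threshold $0.005$, which excludes the third variant with the opposite sign.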

As described by Lin et al., the p-value of this test can be calculated analytically using the fact that the burden test statistics jointly follow a multivariate normal distribution with mean $\mathbf {0}$ and covariance ${\boldsymbol {\Omega }}$ , written as

$\left(T_{b\left(f_{1}\right)},T_{b\left(f_{2}\right)},\dots ,T_{b\left(f_{m}\right)}\right)\sim \mathbf {MVN} \left(\mathbf {0} ,{\boldsymbol {\Omega }}\right)$ ,

where ${\boldsymbol {\Omega }}_{ij}={\frac {{\boldsymbol {\phi }}_{f_{i}}^{T}\mathbf {V_{meta}} {\boldsymbol {\phi }}_{f_{j}}}{{\sqrt {{\boldsymbol {\phi }}_{f_{i}}^{T}\mathbf {V_{meta}} {\boldsymbol {\phi }}_{f_{i}}}}{\sqrt {{\boldsymbol {\phi }}_{f_{j}}^{T}\mathbf {V_{meta}} {\boldsymbol {\phi }}_{f_{j}}}}}}$ .

### SKAT META ANALYSIS

SKAT is most powerful when detecting genes whose rare variants have effects in opposite directions. The meta-analysis statistic can also be reconstructed from the single variant meta-analysis scores and their covariances:

$\mathbf {Q} =\mathbf {{U_{meta}}^{T}} \mathbf {W} \mathbf {U_{meta}}$ ,

where $\mathbf {W}$ is a diagonal matrix of weights of rare variants included in a gene.

As shown in Wu et al., the null distribution of the $\mathbf {Q}$ statistic is a mixture of chi-squared distributions, described as

$\mathbf {Q} \sim \sum _{i=1}^{m}{\lambda _{i}\chi _{1,i}^{2}}$ ,

where $\left(\lambda _{1},\lambda _{2},\dots ,\lambda _{m}\right)$ are the eigenvalues of $\mathbf {V_{meta}^{\frac {1}{2}}} \mathbf {W} \mathbf {V_{meta}^{\frac {1}{2}}}$ and the $\chi _{1,i}^{2}$ are independent chi-squared variables with one degree of freedom.
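A short NumPy sketch of the SKAT quantities follows. It is an illustration under stated assumptions, not the RAREMETAL code: the inputs are hypothetical, the weights are assumed non-negative, and the p-value is approximated by simulating the mixture directly, which is a simple stand-in for the numerical methods normally used for mixture chi-squared tails. The eigenvalues are taken from $\mathbf {W^{\frac {1}{2}}} \mathbf {V_{meta}} \mathbf {W^{\frac {1}{2}}}$, which has the same eigenvalues as $\mathbf {V_{meta}^{\frac {1}{2}}} \mathbf {W} \mathbf {V_{meta}^{\frac {1}{2}}}$ and avoids a matrix square root of $\mathbf {V_{meta}}$.

```python
import numpy as np


def skat_q(U_meta, w):
    """Q = U_meta' W U_meta with W = diag(w)."""
    return float(np.sum(w * U_meta ** 2))


def skat_eigenvalues(V_meta, w):
    """Eigenvalues of W^(1/2) V_meta W^(1/2), assuming w >= 0."""
    sqrt_w = np.sqrt(w)
    M = sqrt_w[:, None] * V_meta * sqrt_w[None, :]
    return np.linalg.eigvalsh(M)


def mixture_p_value_mc(q, lam, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(sum_i lam_i * chi2_1 >= q)."""
    rng = np.random.default_rng(seed)
    draws = rng.chisquare(df=1, size=(n_draws, lam.size)) @ lam
    return float(np.mean(draws >= q))


# Hypothetical meta-analysis summary statistics for three rare variants.
U_meta = np.array([3.0, 3.0, -1.0])
V_meta = np.array([[9.0, 1.5, 0.0],
                   [1.5, 4.0, 0.0],
                   [0.0, 0.0, 3.0]])
w = np.array([1.0, 1.0, 1.0])

q = skat_q(U_meta, w)
lam = skat_eigenvalues(V_meta, w)
p_skat = mixture_p_value_mc(q, lam)
print(q, lam, p_skat)
```

A useful sanity check is that the eigenvalues sum to the trace of $\mathbf {W} \mathbf {V_{meta}}$, which is the expected value of $\mathbf {Q}$ under the null; in this example $\mathbf {Q} = 19$ and the eigenvalues sum to $16$.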

Formulae for RAREMETAL

| Test | Statistics | Null Distribution | Notation |
| --- | --- | --- | --- |
| Single Variant | $T=\sum _{k=1}^{S}{U_{k}}{\bigg /}{\sqrt {\sum _{k=1}^{S}{V_{k}}}}$ | $T\sim \mathbf {N} (0,1)$ | $U_{k}$ is the score statistic from study $k$; $V_{k}$ is the variance of $U_{k}$; $S$ is the number of studies. |
| Unweighted Burden | $T_{b}=\mathbf {1^{T}} \sum _{k=1}^{S}{\mathbf {U_{k}} }{\bigg /}{\sqrt {\mathbf {1^{T}} \left(\sum _{k=1}^{S}{\mathbf {V_{k}} }\right)\mathbf {1} }}$ | $T_{b}\sim \mathbf {N} (0,1)$ | $\mathbf {U_{k}} =(U_{1,k},...,U_{m,k})$ is the vector of score statistics from study $k$; $\mathbf {V_{k}}$ is the covariance of $\mathbf {U_{k}}$; $\mathbf {1}$ is a vector of $m$ ones. |
| Weighted Burden | $T_{wb}=\mathbf {w^{T}} \sum _{k=1}^{S}{\mathbf {U_{k}} }{\bigg /}{\sqrt {\mathbf {w^{T}} \left(\sum _{k=1}^{S}{\mathbf {V_{k}} }\right)\mathbf {w} }}$ | $T_{wb}\sim \mathbf {N} (0,1)$ | $\mathbf {w^{T}} =(w_{1},w_{2},...,w_{m})^{T}$ is the weight vector. |
| VT | $T_{VT}=\max(T_{b\left(f_{1}\right)},T_{b\left(f_{2}\right)},\dots ,T_{b\left(f_{m}\right)})$, where $T_{b\left(f_{j}\right)}={\boldsymbol {\phi }}_{f_{j}}^{\mathbf {T} }\sum _{k=1}^{S}{\mathbf {U_{k}} }{\bigg /}{\sqrt {{\boldsymbol {\phi }}_{f_{j}}^{\mathbf {T} }\left(\sum _{k=1}^{S}{\mathbf {V_{k}} }\right){\boldsymbol {\phi }}_{f_{j}}}}$ | $\left(T_{b\left(f_{1}\right)},T_{b\left(f_{2}\right)},\dots ,T_{b\left(f_{m}\right)}\right)\sim \mathbf {MVN} \left(\mathbf {0} ,{\boldsymbol {\Omega }}\right)$, where ${\boldsymbol {\Omega }}_{ij}={\frac {{\boldsymbol {\phi }}_{f_{i}}^{T}\left(\sum _{k=1}^{S}{\mathbf {V_{k}} }\right){\boldsymbol {\phi }}_{f_{j}}}{{\sqrt {{\boldsymbol {\phi }}_{f_{i}}^{T}\left(\sum _{k=1}^{S}{\mathbf {V_{k}} }\right){\boldsymbol {\phi }}_{f_{i}}}}{\sqrt {{\boldsymbol {\phi }}_{f_{j}}^{T}\left(\sum _{k=1}^{S}{\mathbf {V_{k}} }\right){\boldsymbol {\phi }}_{f_{j}}}}}}$ | ${\boldsymbol {\phi }}_{f_{j}}$ is a vector of 0s and 1s indicating the inclusion of a variant using threshold $f_{j}$. |
| SKAT | $\mathbf {Q} =\left(\sum _{k=1}^{S}{\mathbf {U_{k}^{T}} }\right)\mathbf {W} \left(\sum _{k=1}^{S}{\mathbf {U_{k}} }\right)$ | $\mathbf {Q} \sim \sum _{i=1}^{m}{\lambda _{i}\chi _{1,i}^{2}}$, where $\left(\lambda _{1},\lambda _{2},\dots ,\lambda _{m}\right)$ are the eigenvalues of $\left(\sum _{k=1}^{S}{\mathbf {V_{k}} }\right)^{\frac {1}{2}}\mathbf {W} \left(\sum _{k=1}^{S}{\mathbf {V_{k}} }\right)^{\frac {1}{2}}$ | $\mathbf {W}$ is a diagonal matrix of weights. |
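As an illustration of the VT null distribution, the sketch below builds the correlation matrix ${\boldsymbol {\Omega }}$ from a combined covariance matrix and a set of threshold indicator vectors, then estimates a p-value for an observed $T_{VT}$. Several assumptions apply: the inputs are hypothetical, the p-value is taken as the upper tail of the maximum exactly as $T_{VT}$ is defined on this page, and simulation is substituted for the analytic multivariate normal integral, so this is not the analytic calculation itself.

```python
import numpy as np


def vt_correlation(V_meta, Phi):
    """Omega_ij = phi_i'V phi_j / sqrt(phi_i'V phi_i * phi_j'V phi_j).

    Phi is an (n_thresholds, m) 0/1 matrix whose j-th row is phi_{f_j}.
    """
    C = Phi @ V_meta @ Phi.T
    sd = np.sqrt(np.diag(C))
    return C / np.outer(sd, sd)


def vt_p_value_mc(t_vt, Omega, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(max_j Z_j >= t_vt) for Z ~ MVN(0, Omega)."""
    rng = np.random.default_rng(seed)
    Z = rng.multivariate_normal(np.zeros(Omega.shape[0]), Omega, size=n_draws)
    return float(np.mean(Z.max(axis=1) >= t_vt))


# Hypothetical combined covariance matrix for three rare variants.
V_meta = np.array([[9.0, 1.5, 0.0],
                   [1.5, 4.0, 0.0],
                   [0.0, 0.0, 3.0]])
# Rows: variants included under three increasing frequency thresholds.
Phi = np.array([[1.0, 0.0, 0.0],
                [1.0, 1.0, 0.0],
                [1.0, 1.0, 1.0]])

Omega = vt_correlation(V_meta, Phi)
p_vt = vt_p_value_mc(t_vt=1.5, Omega=Omega)  # 1.5 is a hypothetical observed T_VT
print(Omega, p_vt)
```

Because the thresholds are nested, the burden statistics are strongly correlated; in this example the first two thresholds have correlation $10.5/(3\times 4)=0.875$, which is why treating them as independent tests would be overly conservative.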