# Difference between revisions of "RAREMETAL METHOD"

Shuang Feng (→VT META ANALYSIS)
## Revision as of 23:25, 8 April 2014

## INTRODUCTION

The key idea behind meta-analysis with RAREMETAL is that various gene-level test statistics can be reconstructed from single variant score statistics and that, when the linkage disequilibrium relationships between variants are known, the distribution of these gene-level statistics can be derived and used to evaluate significance. Single variant statistics are calculated using the Cochran-Mantel-Haenszel method. Our method has been published in **Liu et al.** The main formulae are summarized below.

## KEY FORMULAE

### NOTATIONS

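We use the following notation to describe our methods:

<math>U_{i,k}</math> is the score statistic for the <math>i^{th}</math> variant from the <math>k^{th}</math> study.

<math>V_{ij,k}</math> is the covariance of the score statistics between the <math>i^{th}</math> and the <math>j^{th}</math> variant from the <math>k^{th}</math> study.

<math>U_{i,k}</math> and <math>V_{ij,k}</math> are described in detail in **RAREMETALWORKER method**.

<math>\mathbf{U_k}</math> is the vector of score statistics of rare variants in a gene from the <math>k^{th}</math> study.

<math>\mathbf{V_k}</math> is the variance-covariance matrix of score statistics of rare variants in a gene from the <math>k^{th}</math> study, or <math>\mathbf{V_k}=\mathbf{cov}\left(\mathbf{U_k}\right)</math>.

<math>S</math> is the number of studies.

<math>\mathbf{w}=\left(w_1,w_2,\dots,w_m\right)^\mathbf{T}</math> is the vector of weights for <math>m</math> rare variants in a gene.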

### SINGLE VARIANT META ANALYSIS

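The single variant meta-analysis score statistic can be reconstructed from the score statistics and their variances generated by each study, assuming that samples are unrelated across studies. Define the meta-analysis score statistic for the <math>i^{th}</math> variant as the sum of the per-study score statistics over all <math>S</math> studies

<math>U_{meta_i}=\sum_{k=1}^S U_{i,k}</math>

and its variance as the sum of the per-study variances

<math>V_{meta_i}=\sum_{k=1}^S V_{ii,k}</math>.

Then the score test statistic for the <math>i^{th}</math> variant asymptotically follows the standard normal distribution

<math>T_{meta_i}=U_{meta_i}\bigg/\sqrt{V_{meta_i}}\sim\mathbf{N}\left(0,1\right)</math>.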
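As a minimal sketch of the pooling step described above, the following Python snippet sums per-study score statistics and variances for one variant and converts the result to a two-sided p-value. The function name `single_variant_meta` and the input numbers are illustrative assumptions, not part of RAREMETAL.

```python
import math

def single_variant_meta(scores, variances):
    """Pool one variant's score statistics across independent studies.

    scores[k] and variances[k] are the score statistic and its variance
    reported by study k (hypothetical inputs for illustration).
    """
    u_meta = sum(scores)        # meta-analysis score statistic
    v_meta = sum(variances)     # its variance, assuming unrelated samples
    t_meta = u_meta / math.sqrt(v_meta)
    # Two-sided p-value under the standard normal null distribution.
    p_value = math.erfc(abs(t_meta) / math.sqrt(2.0))
    return u_meta, v_meta, t_meta, p_value

# Three made-up studies reporting scores 3, 1, 2 with variances 4, 2, 3.
u_meta, v_meta, t_meta, p_value = single_variant_meta(
    [3.0, 1.0, 2.0], [4.0, 2.0, 3.0]
)
```

With these toy inputs the pooled score is 6 and the pooled variance is 9, so the test statistic equals 2.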

### BURDEN META ANALYSIS

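The burden test has been shown to be powerful for detecting a group of rare variants whose effects are in the same direction. Once the single variant meta-analysis statistics are constructed, the burden test score statistic for a gene can be easily reconstructed as

<math>T_{meta_{burden}}=\mathbf{w^T}\mathbf{U_{meta}}\bigg/\sqrt{\mathbf{w^T}\mathbf{V_{meta}}\mathbf{w}}\sim\mathbf{N}\left(0,1\right)</math>,

where <math>\mathbf{w}</math> is the vector of variant weights, and <math>\mathbf{U_{meta}}=\sum_{k=1}^S\mathbf{U_k}</math> and <math>\mathbf{V_{meta}}=\sum_{k=1}^S\mathbf{V_k}</math> represent the vector of single variant meta-analysis scores of the variants in a gene and the covariance matrix of the scores across variants.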
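The burden statistic is a weighted sum of the meta-analysis scores divided by the square root of the matching quadratic form in the covariance matrix. The sketch below illustrates that arithmetic with plain Python lists; the function name `burden_statistic`, the equal weights, and the toy scores and covariance matrix are assumptions made for illustration only.

```python
import math

def burden_statistic(w, u_meta, v_meta):
    """Weighted burden score statistic for one gene.

    w      : list of variant weights
    u_meta : list of single variant meta-analysis scores
    v_meta : covariance matrix of those scores (list of rows)
    """
    m = len(w)
    numerator = sum(w[i] * u_meta[i] for i in range(m))
    # Quadratic form w^T V w gives the variance of the weighted sum.
    variance = sum(w[i] * v_meta[i][j] * w[j]
                   for i in range(m) for j in range(m))
    return numerator / math.sqrt(variance)

# Hypothetical gene with three rare variants and equal weights.
w = [1.0, 1.0, 1.0]
u_meta = [2.0, 1.0, 3.0]
v_meta = [[4.0, 1.0, 0.0],
          [1.0, 3.0, 0.5],
          [0.0, 0.5, 2.0]]
t_burden = burden_statistic(w, u_meta, v_meta)
```

Here the weighted score is 6 and the quadratic form is 12, so the statistic is 6 divided by the square root of 12.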

### VT META ANALYSIS

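Including variants that are not associated with the phenotype can hurt power. The variable threshold test is designed to choose the optimal allele frequency threshold among the rare variants in a gene, to gain power. The test statistic is defined as the maximum burden score statistic calculated using every possible frequency threshold

<math>T_{meta_{VT}}=\max(T_{b\left(f_1\right)},T_{b\left(f_2\right)},\dots,T_{b\left(f_m\right)})</math>,

where <math>T_{b\left(f_j\right)}</math> is the burden test statistic under allele frequency threshold <math>f_j</math>, and can be constructed from single variant meta-analysis statistics using

<math>T_{b\left(f_j\right)}=\boldsymbol{\phi}_{f_j}^\mathbf{T}\mathbf{U_{meta}}\bigg/\sqrt{\boldsymbol{\phi}_{f_j}^\mathbf{T}\mathbf{V_{meta}}\boldsymbol{\phi}_{f_j}} </math>,

where <math>f_j</math> is any allele frequency observed in a group of rare variants, and <math>\boldsymbol{\phi}_{f_j}</math> is a vector of 0s and 1s indicating whether each variant is included in the analysis under frequency threshold <math>f_j</math>.

As described by [**Lin et al.**](http://www.ncbi.nlm.nih.gov/pubmed/21885029), the p-value of this test can be calculated analytically using the fact that the burden test statistics together follow a multivariate normal distribution with mean <math>\mathbf{0}</math> and covariance <math>\boldsymbol{\Omega}</math>, written as

<math> \left(T_{b\left(f_1\right)},T_{b\left(f_2\right)},\dots,T_{b\left(f_m\right)}\right)\sim\mathbf{MVN}\left(\mathbf{0},\boldsymbol{\Omega}\right) </math>,

where <math>\boldsymbol{\Omega_{ij}}=\frac{\boldsymbol{\phi}_{f_i}^\mathbf{T}\mathbf{V_{meta}}\boldsymbol{\phi}_{f_j}}{\sqrt{\boldsymbol{\phi}_{f_i}^\mathbf{T}\mathbf{V_{meta}}\boldsymbol{\phi}_{f_i}}\sqrt{\boldsymbol{\phi}_{f_j}^\mathbf{T}\mathbf{V_{meta}}\boldsymbol{\phi}_{f_j}}}</math>.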
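The following sketch illustrates how the variable threshold statistic and the correlation matrix of the per-threshold burden statistics could be assembled from single variant meta-analysis scores. It assumes that a variant is included under a threshold when its allele frequency is less than or equal to that threshold; this inclusion rule, the function names, and the toy inputs are illustrative assumptions. The multivariate normal integration needed for the p-value is not shown.

```python
import math

def quad_form(a, v, b):
    """Return a^T V b for list vectors a, b and a matrix V given as rows."""
    return sum(a[i] * v[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))

def vt_statistic(freqs, u_meta, v_meta):
    """Maximum burden statistic over all observed frequency thresholds."""
    thresholds = sorted(set(freqs))
    # phis[j][i] is 1 when variant i is included under threshold j
    # (assumed rule: allele frequency <= threshold).
    phis = [[1.0 if f <= t else 0.0 for f in freqs] for t in thresholds]
    stats = [
        sum(p * u for p, u in zip(phi, u_meta))
        / math.sqrt(quad_form(phi, v_meta, phi))
        for phi in phis
    ]
    # Correlation between burden statistics at thresholds i and j.
    omega = [
        [quad_form(pi, v_meta, pj)
         / math.sqrt(quad_form(pi, v_meta, pi) * quad_form(pj, v_meta, pj))
         for pj in phis]
        for pi in phis
    ]
    return max(stats), thresholds, stats, omega

# Hypothetical gene with three rare variants.
freqs = [0.001, 0.005, 0.01]
u_meta = [2.0, 1.0, 3.0]
v_meta = [[4.0, 1.0, 0.0],
          [1.0, 3.0, 0.5],
          [0.0, 0.5, 2.0]]
t_vt, thresholds, stats, omega = vt_statistic(freqs, u_meta, v_meta)
```

In this toy example the burden statistics at the first two thresholds both equal 1, and the largest value is reached when all three variants are included.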

### SKAT META ANALYSIS

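SKAT is most powerful when detecting genes with rare variants having opposite directions in effect sizes. The meta-analysis statistic can also be reconstructed using single variant meta-analysis scores and their covariances

<math>Q=\mathbf{U_{meta}^T}\mathbf{W}\mathbf{U_{meta}}</math>,

where <math>\mathbf{W}</math> is a diagonal matrix of weights of rare variants included in a gene.

As shown in **Wu et al.**, the null distribution of the <math>Q</math> statistic follows a mixture chi-squared distribution described as

<math>Q\sim\sum_{i=1}^m\lambda_i\chi_{1,i}^2</math>,

where <math>\lambda_i</math> are eigenvalues of <math>\mathbf{V_{meta}^{\frac{1}{2}}}\mathbf{W}\mathbf{V_{meta}^{\frac{1}{2}}}</math>.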
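Because the weight matrix is diagonal, the SKAT statistic reduces to a weighted sum of squared meta-analysis scores. The sketch below computes that sum and, as a simple sanity check, the mean of the null mixture distribution: each one-degree-of-freedom chi-squared term has mean 1, so the mean equals the sum of the eigenvalues, which is the trace of the weight matrix times the score covariance matrix. The function names and toy inputs are illustrative assumptions, and a full p-value would additionally require the eigenvalues and a numerical method for mixtures of chi-squared variables, which is not shown.

```python
def skat_q(weights, u_meta):
    """SKAT statistic U^T W U for a diagonal weight matrix W."""
    return sum(w * u * u for w, u in zip(weights, u_meta))

def skat_null_mean(weights, v_meta):
    """Mean of the null mixture distribution.

    The eigenvalues sum to trace(V^{1/2} W V^{1/2}) = trace(W V),
    which for diagonal W is the weighted sum of the diagonal of V.
    """
    return sum(w * v_meta[i][i] for i, w in enumerate(weights))

# Hypothetical gene with three rare variants.
weights = [1.0, 2.0, 0.5]          # diagonal entries of W
u_meta = [2.0, 1.0, 3.0]
v_meta = [[4.0, 1.0, 0.0],
          [1.0, 3.0, 0.5],
          [0.0, 0.5, 2.0]]
q = skat_q(weights, u_meta)
q_null_mean = skat_null_mean(weights, v_meta)
```

Comparing the observed statistic with the null mean gives only a rough sense of scale; it is not a substitute for evaluating the mixture distribution itself.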