# rareMETALS

rareMETALS is an R package for performing single variant or gene-level tests to detect rare variant associations. For questions regarding the use of this package, please contact Dajiang Liu (dajiang at umich dot edu) or Gonçalo Abecasis (goncalo at umich dot edu). The same methodology is also implemented in command-line tools; please see [1].

## Change Log

- 04/01/2015 Version 5.9 is released (not an April Fools' joke)! A bug in calculating Cochran's Q statistic is fixed. A bug in conditional.rareMETALS.range.group is also fixed. No other analyses are affected.
- 01/24/2015 Version 5.8 is released, which fixes a serious bug in single variant unconditional association tests with a group file. If you ran analyses using rareMETALS.single.group() in version 5.7, the results are likely to be incorrect; please rerun them using version 5.8. Only the rareMETALS.single.group function is affected. Other functions should not be affected by this error.
- 01/04/2015 Version 5.7 is released, which adds metrics for heterogeneity of genetic effects, including I2 and Q, for single variant association statistics.
- 12/09/2014 Version 5.6 is released, which adds the function conditional.rareMETALS.range.group and fixes a minor issue in estimating sample sizes.
- 11/19/2014 Version 5.5 is released, which fixes a few bugs in version 5.4.
- 11/09/2014 Version 5.4 is posted with the following changes: (1) conditional analysis can now be performed for multiple candidate variants; (2) the option correctFlip is added to rareMETALS.single.group and rareMETALS.range.group, allowing sites with non-matching ref or alt alleles to be discarded. The default is TRUE.
- 09/08/2014 Version 5.2 is posted. One change made in versions 5.0 and 5.1, which could lead to undesirable effects, is reverted. Version 5.2 improves on some borderline cases compared with versions 4.7-4.9, but in general versions 5.2 and 4.7-4.9 should give very comparable results. Please update to the latest version. I expect version 5.2 to run stably for all models under all circumstances.
- 08/21/2014 Version 4.9 is posted. A bug in the VT test is fixed: while the p-values and statistics were correct, the number of sites and the beta estimate could sometimes be incorrect in version 4.8. Please download the newest version. Thanks!
- 08/18/2014 Version 4.8 is posted. A bug in recessive model analysis is fixed. Additive and dominant models should remain unaffected. Thanks!
- 08/06/2014 Version 4.7 is posted, in which a few minor bugs are fixed. Thanks to Heather Highland and Xueling Sim for careful testing! Please update. Thanks!
- 07/15/2014 Fixed a bug in conditional.rareMETALS.single and conditional.rareMETALS.range. Please update. Thanks!
- 06/27/2014 Updated to version 4.0. Many updates are implemented, including support for group files in both single variant and gene-level association tests; checks for allele flips based upon variant frequency; and detection of possible allele flips using a novel statistic based upon variation in allele frequency between studies.

## Where to download

The R package can be downloaded from rareMETALS_5.8.tar.gz. It will eventually be released on the Comprehensive R Archive Network (CRAN). If you want to perform gene-level association tests using automatically generated annotations, you will also need refFlat_hg19.txt.gz, which contains gene definitions modified from refFlat.

## How to install

To install the package, run the command "R CMD INSTALL rareMETALS_XXX.tar.gz", where XXX is the version number of rareMETALS.
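The same installation can also be run from inside an R session. The following is a minimal sketch, assuming the tarball has already been downloaded to a local directory; `tarball_name` and `install_raremetals` are hypothetical helper names used only for illustration and are not part of the package.

```r
# Build the tarball file name for a given version string, e.g. "5.8".
tarball_name <- function(version) {
  sprintf("rareMETALS_%s.tar.gz", version)
}

# Install from a local source tarball; stop early if the file is missing.
install_raremetals <- function(version, dir = ".") {
  path <- file.path(dir, tarball_name(version))
  if (!file.exists(path)) {
    stop("Tarball not found: ", path)
  }
  # repos = NULL makes install.packages treat `path` as a local file.
  install.packages(path, repos = NULL, type = "source")
}

# Example (not run): install_raremetals("5.8")
```

Checking for the file first gives a clearer error message than the one produced when the installer is pointed at a missing path.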

## Supported Functionalities

- Marginal meta-analysis of single variant or gene-level association tests
- Conditional analysis of single variant or gene-level associations, for variants (or genes) where covariance information between candidate variants and known variants is available
- Estimates of genetic effects and locus genetic variance
- Estimates of genetic effect heterogeneity between studies
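To illustrate what the heterogeneity measures capture, here is a minimal sketch of the textbook Cochran's Q and I2 computation for a single variant under an inverse-variance fixed-effect model. It is not the package's internal code: the function `heterogeneity` and its inputs (per-study effect estimates and standard errors) are assumptions made for this example.

```r
# Fixed-effect pooling with Cochran's Q and I^2 for one variant.
# beta: per-study effect estimates; se: their standard errors.
heterogeneity <- function(beta, se) {
  stopifnot(length(beta) == length(se), all(se > 0))
  w <- 1 / se^2                          # inverse-variance weights
  beta_meta <- sum(w * beta) / sum(w)    # pooled effect estimate
  q <- sum(w * (beta - beta_meta)^2)     # Cochran's Q
  df <- length(beta) - 1
  # I^2: share of variation attributed to heterogeneity, floored at 0.
  i2 <- if (q > 0) max(0, (q - df) / q) else 0
  p_q <- if (df > 0) pchisq(q, df = df, lower.tail = FALSE) else NA_real_
  list(beta.meta = beta_meta, Q = q, df = df, I2 = i2, p.Q = p_q)
}

# Three hypothetical studies with equal standard errors.
het <- heterogeneity(beta = c(0.10, 0.30, 0.20), se = c(0.1, 0.1, 0.1))
print(het)
```

A large Q relative to its degrees of freedom (and hence a large I2) suggests that the effect sizes differ between studies by more than sampling error alone would explain.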

## Exemplar Dataset

Four example datasets are provided to help you get started with the rareMETALS R package for meta-analysis of gene-level association tests:

- study1.MetaScore.assoc.gz
- study2.MetaScore.assoc.gz
- study1.MetaCov.assoc.gz
- study2.MetaCov.assoc.gz

## How to Generate Summary Association Statistics and Prepare Them for Meta-analysis

Summary association statistics for meta-analysis can be generated by both RVTESTS and RAREMETALWORKER. Please refer to their documentation for details on generating summary association statistics.

Once you have generated summary association statistics, you need to compress them with bgzip and index them with tabix. If you use RAREMETALWORKER, the command should look like

**NOTE: Tabix 1.X does not seem to support indexing of generic tab-delimited files. To index the file, please use tabix 0.2.5 or an earlier version.**

If you use RVTESTS, your commands should be:

    bgzip study1.MetaScore.assoc
    bgzip study1.MetaCov.assoc
    tabix -s 1 -b 2 -e 2 -S 1 study1.MetaScore.assoc.gz
    tabix -s 1 -b 2 -e 2 -S 1 study1.MetaCov.assoc.gz
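Before starting a meta-analysis, it can help to confirm from R that every input file has been compressed and indexed. The sketch below assumes tabix wrote its index next to each file with the default ".tbi" suffix; `is_indexed` is a hypothetical helper, not a rareMETALS function, and it checks file names only, not file contents.

```r
# Return TRUE for each file whose name ends in ".gz" and that has a
# tabix index (<file>.tbi) sitting next to it.
is_indexed <- function(files) {
  grepl("\\.gz$", files) & file.exists(paste0(files, ".tbi"))
}

files <- c("study1.MetaScore.assoc.gz", "study1.MetaCov.assoc.gz")
ok <- is_indexed(files)
if (!all(ok)) {
  message("Missing .gz file or .tbi index for: ",
          paste(files[!ok], collapse = ", "))
}
```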

## A Simple Tutorial for Using the rareMETALS.single function

The rareMETALS.single function allows you to perform meta-analysis of single variant association tests. The summary association statistics are combined using the Mantel-Haenszel test statistic. The details are described in our methods paper, Liu et al., Nat Genet, 2014.

Assume that you have a set of single variant score statistics and their covariance matrices.

Example:

    library(rareMETALS)
    # Per-study covariance and score statistic files (bgzipped and tabix-indexed)
    cov.file <- c("study1.MetaCov.assoc.gz", "study2.MetaCov.assoc.gz")
    score.stat.file <- c("study1.MetaScore.assoc.gz", "study2.MetaScore.assoc.gz")
    # Single variant meta-analysis over the given range
    res <- rareMETALS.single(score.stat.file, cov.file = NULL,
                             range = "19:11200093-11201275",
                             alternative = "two.sided", ix.gold = 1,
                             callrate.cutoff = 0, hwe.cutoff = 0)
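As a rough illustration of how per-study summary statistics can be pooled for one variant, the sketch below sums score statistics and their variances across studies, assuming the combined statistic takes the common form Z = sum(U_k) / sqrt(sum(V_k)). This is a generic score-based meta-analysis, not the package's internal code; `combine_scores`, its arguments, and the input numbers are hypothetical, and the methods paper remains the reference for the exact statistic.

```r
# u: per-study score statistics for one variant
# v: per-study variances of those score statistics
combine_scores <- function(u, v, alternative = "two.sided") {
  stopifnot(length(u) == length(v), all(v > 0))
  z <- sum(u) / sqrt(sum(v))            # pooled Z statistic
  p <- switch(alternative,
    two.sided = 2 * pnorm(-abs(z)),
    greater = pnorm(z, lower.tail = FALSE),
    less = pnorm(z),
    stop("unknown alternative: ", alternative)
  )
  # One-step approximation to the effect size from pooled score statistics.
  list(statistic = z, p.value = p, beta.est = sum(u) / sum(v))
}

# Two hypothetical studies.
pooled <- combine_scores(u = c(3, 1), v = c(2, 2))
print(pooled)
```

Working with score statistics and their variances means each study only needs to share summary-level data, which is what the MetaScore files contain.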

## A Simple Tutorial for Using the rareMETALS.range function

    # Gene-level meta-analysis over the LDLR region
    res <- rareMETALS.range(score.stat.file, cov.file,
                            range = "19:11200093-11201275",
                            range.name = "LDLR", test = "GRANVIL",
                            maf.cutoff = 0.05, alternative = c("two.sided"),
                            ix.gold = 1, out.digits = 4,
                            callrate.cutoff = 0, hwe.cutoff = 0,
                            max.VT = NULL)
    print(res$res.out)

Output:

         gene.name.out p.value.out statistic.out no.site.out beta1.est.out
    [1,] "LDLR"        "0.6064"    "0.2654"      "25"        "-0.01729"
         beta1.sd.out maf.cutoff.out direction.burden.by.study.out
    [1,] "0.03357"    "0.05"         "--"
         direction.meta.single.var.out top.singlevar.pos top.singlevar.refalt
    [1,] "---++-+--+-+++++--+++++-+"   "19:11200431"     "C/T"
         top.singlevar.pval top.singlevar.af
    [1,] "0.004709"         "0.01038"
         pos.ref.alt.out
    [1,] "19:11200093/T/C,19:11200213/G/A,19:11200235/G/A,19:11200272/C/A,19:11200282/G/A,19:11200309/C/A,19:11200412/C/T,19:11200419/C/T,19:11200431/C/T,19:11200442/G/A,19:11200475/C/G,19:11200508/G/A,19:11200514/C/T,19:11200557/G/A,19:11200579/C/T,19:11200728/C/T,19:11200753/T/C,19:11200754/G/A,19:11200806/C/T,19:11200839/T/A,19:11200840/C/A,19:11200896/C/T,19:11201259/G/C,19:11201274/C/T,19:11201275/A/T"

More detailed results can be found in the list res$res.list.
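Because the summary is printed as a character matrix, numeric fields and the variant list need converting before further use. The sketch below assumes res$res.out has the structure shown in the printed output; it builds a small mock matrix with three of the columns (values copied from that output, with the site list shortened to three entries) so that it runs without the package. `parse_sites` is a hypothetical helper.

```r
# Mock of the one-row character matrix printed by the tutorial (subset of columns).
res.out <- matrix(
  c("LDLR", "0.6064", "19:11200093/T/C,19:11200213/G/A,19:11200235/G/A"),
  nrow = 1,
  dimnames = list(NULL, c("gene.name.out", "p.value.out", "pos.ref.alt.out"))
)

# Split a comma-separated "chr:pos/ref/alt" string into a data frame.
parse_sites <- function(x) {
  sites <- strsplit(x, ",", fixed = TRUE)[[1]]
  parts <- do.call(rbind, strsplit(sites, "[:/]"))
  data.frame(
    chrom = parts[, 1],
    pos = as.integer(parts[, 2]),
    ref = parts[, 3],
    alt = parts[, 4],
    stringsAsFactors = FALSE
  )
}

sites <- parse_sites(res.out[1, "pos.ref.alt.out"])
p.value <- as.numeric(res.out[1, "p.value.out"])
print(sites)
```

With a real result, the same two calls would take res$res.out in place of the mock matrix.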