Rare variant tests

From Genome Analysis Wiki

Summary of discussion from ESP rare variant working group

The rare variant working group within ESP has discussed rare variant tests on several conference calls. We recommend selecting one test from each of these three categories:

1. Aggregate tests (typically with 1% threshold, nonsynonymous SNPs only, with meta-analysis across different ethnic groups)

2. Tests that allow for risk and/or protective variants (again, probably 1% threshold, nonsynonymous SNPs only, with meta-analysis across different ethnic groups)

3. Weighted tests that allow incorporation of more common variants (possibly with a 5% threshold, nonsynonymous only, etc.)

A brief summary of the RV discussion:

- Permutations (where we permute phenotype while maintaining ethnic group) will likely be required to get empirical p-values, because these RV tests typically, but not always, provide conservative p-values (deflated QQ plot). Since a large number of permutations (at least 1000) must be performed, a computationally intensive test will not be practical.

- Using too many tests will decrease overall power because of the correction needed to control the family-wise error rate.

- Although we'd like to evaluate power and type I error rates of these tests under a variety of genetic models, we currently have so few known positive examples that it would be difficult to assess them all fairly. Instead, we expect to re-convene this discussion group at a later date once some true positive associations are identified.

- Shamil Sunyaev is performing a bake-off with some of these tests, and we look forward to seeing his results in the future.

- PLINKSeq is on its way, but is likely a month away from release (end of Feb 2011).
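The permutation scheme mentioned above (permuting phenotype while maintaining ethnic group) can be sketched as follows. This is only an illustration under stated assumptions: the test statistic (difference in mean rare-allele burden between cases and controls) is a stand-in for whichever RV test is chosen, and the function names `burden_statistic`, `permute_within_groups`, and `empirical_p_value` are hypothetical, not taken from any of the software listed on this page.

```python
import random

def burden_statistic(burden, phenotype):
    """Stand-in statistic: mean burden in cases (1) minus mean in controls (0)."""
    cases = [b for b, y in zip(burden, phenotype) if y == 1]
    controls = [b for b, y in zip(burden, phenotype) if y == 0]
    return sum(cases) / len(cases) - sum(controls) / len(controls)

def permute_within_groups(phenotype, group, rng):
    """Shuffle phenotype labels separately inside each ethnic group."""
    permuted = list(phenotype)
    members = {}
    for idx, g in enumerate(group):
        members.setdefault(g, []).append(idx)
    for indices in members.values():
        labels = [phenotype[i] for i in indices]
        rng.shuffle(labels)
        for i, label in zip(indices, labels):
            permuted[i] = label
    return permuted

def empirical_p_value(burden, phenotype, group, n_perm=1000, seed=0):
    """Two-sided empirical p-value from within-group permutations."""
    rng = random.Random(seed)
    observed = abs(burden_statistic(burden, phenotype))
    hits = 0
    for _ in range(n_perm):
        shuffled = permute_within_groups(phenotype, group, rng)
        if abs(burden_statistic(burden, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate can never be exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Because the statistic is recomputed once per permutation, the cost of the whole procedure is roughly `n_perm` times the cost of a single test, which is why an expensive test becomes impractical at 1000 or more permutations.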


Summary of rare variant tests for sequence data

Compiled by Cristen Willer and Suzanne Leal for the ESP, Feb 1, 2011

* indicates applicability to quantitative data



1) Aggregate tests using a frequency cutoff (e.g. 1%), analyzing nonsynonymous variants to detect detrimental variants

| Test Name | Reference | Software | Notes |
|---|---|---|---|
| CMC/T1 test* | Li & Leal, 2008 | Will be implemented in PlinkSeq | |
| KBAC | Liu & Leal, 2010 | Will be implemented in PlinkSeq | |
| VT* | Price et al., 2010 | http://genetics.bwh.harvard.edu/rare_variants/ | |
| WSS | Madsen & Browning, 2009 | Will be implemented in PlinkSeq | with 1% cutoff |
| CMAT | Zawistowski et al. 2010 | | |
| ANRV/GRANVIL* | Morris & Zeggini | | |
| RARECOVER | Bhatia et al. 2010 | | |
| CCRaVAT and QuTie* | Lawrence et al. 2010 | http://www.sanger.ac.uk/resources/software/rarevariant/ | |
| RVE (rare variant exclusive) | Cohen & Hobbs | Will be implemented in PlinkSeq | underpowered |
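As a rough illustration of what "aggregating with a cutoff" means, the sketch below collapses all variants under a frequency threshold into a single carrier indicator per individual. It is a simplified collapsing step only, not the published CMC, KBAC, or any other method in the table. It assumes genotypes are already restricted to nonsynonymous sites and coded as minor-allele counts (0, 1, 2), and that frequencies are estimated from the full sample; the function names are hypothetical.

```python
def minor_allele_frequencies(genotypes):
    """Per-variant minor allele frequency from a list of per-individual rows."""
    n_individuals = len(genotypes)
    n_variants = len(genotypes[0])
    return [
        sum(row[j] for row in genotypes) / (2 * n_individuals)
        for j in range(n_variants)
    ]

def collapse_rare_variants(genotypes, maf_threshold=0.01):
    """Return 1 for individuals carrying any variant below the threshold, else 0."""
    mafs = minor_allele_frequencies(genotypes)
    # Keep variants that are observed at least once and fall below the cutoff.
    rare = [j for j, f in enumerate(mafs) if 0 < f < maf_threshold]
    return [1 if any(row[j] > 0 for j in rare) else 0 for row in genotypes]
```

The resulting 0/1 vector can then be compared between cases and controls, or used as a predictor in a regression for quantitative traits. Whether the cutoff is strict (`<`) or inclusive is a choice made here for illustration.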


2) Aggregate tests for protective and detrimental variants (recommend 1% cutoff)

| Test Name | Reference | Software | Notes |
|---|---|---|---|
| C-alpha | [Neale et al., submitted] | Will be implemented in PlinkSeq | |
| Ionita-Laza & Lange | Ionita-Laza & Lange, 2011 | | |
| DASH* | Han & Pan | | Computational burden |
| SKAT* | Wu et al., 2010 | http://www.hsph.harvard.edu/~xlin/software.html | For some kernel choices, need to code 0 = major homozygote, 1 = het, 2 = minor homozygote |
| WHaIT | Li et al. 2010 | http://csg.sph.umich.edu//yli/whait/ | |
| EMMPAT* | King et al. 2010 | http://home.uchicago.edu/~crk8e/papersup.html | |
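The genotype coding mentioned in the SKAT note (0 = major homozygote, 1 = het, 2 = minor homozygote) amounts to counting copies of the minor allele. A minimal sketch, assuming biallelic sites with no missing calls and genotype strings of the form `"A/G"`; this input format and both function names are assumptions for illustration, not the input format of the SKAT software.

```python
from collections import Counter

def find_minor_allele(genotype_calls):
    """Return the less frequent allele at a biallelic site (ties are arbitrary)."""
    counts = Counter(a for call in genotype_calls for a in call.split("/"))
    return min(counts, key=counts.get)

def code_minor_allele_count(genotype_calls, minor_allele):
    """Code each call as 0, 1, or 2 copies of the minor allele."""
    return [
        sum(1 for a in call.split("/") if a == minor_allele)
        for call in genotype_calls
    ]
```

For example, with calls `["A/A", "A/G", "G/G", "A/A"]` the minor allele is `G`, and the coded vector is `[0, 1, 2, 0]`.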


3) Analyzing common and rare variants together (could down-weight or threshold common variants)

| Test Name | Reference | Software | Notes |
|---|---|---|---|
| WSS | Madsen & Browning, 2009 | Will be implemented in PlinkSeq | with 1% or 5% cutoff |
| RARECOVER | Bhatia et al. 2010 | | |
| Step-Up Collapsing* | Hoffmann et al. 2010 | Will be implemented in PlinkSeq | |
| CMC/T5 test* | Li & Leal, 2008 | Will be implemented in PlinkSeq | |
| MENDEL* | Zhou et al. 2011 | http://www.genetics.ucla.edu/software/download?package=1 | |
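To illustrate how common variants can be down-weighted rather than thresholded, the sketch below computes frequency-based weights in the style of the WSS approach, as we understand it from Madsen & Browning (2009): the allele frequency is estimated in unaffected individuals with a pseudocount, and each variant contributes to an individual's score in inverse proportion to its weight. This covers only the weighting step and should be checked against the paper; the full test also involves a rank-sum statistic and permutations, which are omitted here. Function names are hypothetical.

```python
import math

def wss_style_weights(genotypes, phenotype):
    """Weight per variant: sqrt(n * q * (1 - q)), with q estimated in controls."""
    n_total = len(genotypes)
    controls = [row for row, y in zip(genotypes, phenotype) if y == 0]
    n_controls = len(controls)
    weights = []
    for j in range(len(genotypes[0])):
        m_controls = sum(row[j] for row in controls)
        # Pseudocounts keep q strictly between 0 and 1.
        q = (m_controls + 1) / (2 * n_controls + 2)
        weights.append(math.sqrt(n_total * q * (1 - q)))
    return weights

def weighted_scores(genotypes, weights):
    """Per-individual score: minor-allele counts divided by variant weights."""
    return [sum(g / w for g, w in zip(row, weights)) for row in genotypes]
```

Variants that are common in unaffected individuals receive larger weights and therefore contribute less to the score, so a frequency cutoff of 1% or 5% can be combined with, or partly replaced by, this kind of weighting.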


4) Analyze higher-frequency rare variants (>1%) individually

- Use the same regression framework that has been used for common variants*
- Use meta-analysis to combine results from sequence data and imputed genotypes to increase power*
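One common way to combine single-variant results, for example an effect estimate from sequence data and one from imputed genotypes, is fixed-effects inverse-variance meta-analysis. The sketch below assumes that method; the discussion summarized here does not specify which meta-analysis approach to use, and the function name is hypothetical.

```python
import math

def inverse_variance_meta(betas, standard_errors):
    """Combine per-study effect estimates using inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p
```

Called as `inverse_variance_meta([b_seq, b_imputed], [se_seq, se_imputed])`, where the inputs are regression coefficients and standard errors for the same variant and the same coded allele, the combined standard error is never larger than the smallest input standard error, which is the source of the power gain.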

Additional tests

| Test Name | Reference | Software | Notes |
|---|---|---|---|
| Logic regression* | Kooperberg et al. 2001 | | |
| Sequence diversity | Anderson et al. 2006 | | |
| Sequence dissimilarity* | Schork et al. 2008, Wessel et al. 2006 | | |
| Ridge regression* | Malo et al. 2008 | | |