Changes

From Genome Analysis Wiki
Jump to: navigation, search

RareMETALS

6,436 bytes added, 10:18, 18 May 2017
Where to download
rareMETALS is an R-package for performing single or gene-level tests for detecting rare variant associations. For questions regarding the use of this package, please contact Dajiang Liu (dajiang .liu at umich outlook dot educom) or Gonçalo Abecasis (goncalo at umich dot edu). The same methodology is also implemented in command line tools. Please see [http://genome.sph.umich.edu/wiki/Rare-Metal]
== Change Log ==
* 05/18/2017 Version 6.8 incorporates a number of new features and bug fixes. We included support for multi-allelic variants, the support for a new conditional analysis method, the support for cohort level genomic controls, and the bug fixes for calculating heterogeneity statistics such as Q and I2.
* 04/09/2016 Version 6.3 is released. Minor bug fix: Due to different level of missingness of variants in the gene, the single variant association statistics calculated using the covariance matrices of score statistics can be different than single variant association statistics calculated using vstat. This has lead to confusions. It has been fixed in version 6.3. The primary results from version 6.2 should be correct.
* 09/25/2015 Version 6.2 is released. Minor bug fix: Removed the incorrect warning information in version 6.1 when quantitative traits are meta-analyzed. The software incorrectly consider it as binary trait and suggested the use of rareMETALS2.
* 07/23/2015 Version 6.1 is released. Minor feature changes include output for VT the sites where the statistics are maximized; fixed a bug for determining monomorphic sites. Issue warnings when rareMETALS is used to analyze binary trait for meta-analysis.
* 05/19/2015 Version 6.0 is released. Minor feature addition: rareMETALS can now output of the set of variants that are analyzed in VT (i.e. the set of variants with MAF < the threshold where the VT statistic is maximized).
* 04/01/2015 Version 5.9 is released (not a April's fool joke)! A bug in calculating Cochran-Q statistic is fixed. A bug in conditional.rareMETALS.range.group is also fixed. No other analyses are affected.
* 01/24/2015 Version 5.8 is released, which fixed a serious bug for single variant unconditional association tests with group file. If you happen to run the analyses using rareMETALS.single.group() in version 5.7, the results are likely to be incorrect. Please rerun using version 5.8. Please note only rareMETALS.single.group function is affected. All other functions should not be affected by this error.
== Where to download ==
The R package can be downloaded from [[Media:rareMETALS_5rareMETALS_6.98.tar.gz | rareMETALS_5rareMETALS_6.98.tar.gz]]. It will be eventually released on the Comprehensive R-archive Network. If you want to perform gene-level association test using automatically generated annotations, you will also need [[Media:refFlat_hg19.txt.gz | refFlat_hg19.txt.gz]], which is the gene definition modified from refFlat.
== Documentation ==
Example:
library(rareMETALS)
cov.file <- c("study1.MetaCov.assoc.gz","study2.MetaCov.assoc.gz")
score.stat.file <- c("study1.MetaScore.assoc.gz","study2.MetaScore.assoc.gz")
 
library(rareMETALS)
res <- rareMETALS.single(score.stat.file,cov.file=NULL,range="19:11200093-11201275",alternative="two.sided",ix.gold=1,callrate.cutoff=0,hwe.cutoff=0)
###result can be explored as below###
> names(res)
[1] "p.value" "ref" "alt" "integratedData" "raw.data"
[6] "clean.data" "statistic" "direction.by.study" "anno" "maf"
[11] "QC.by.study" "no.sample" "beta1.est" "beta1.sd" "hsq.est"
[16] "nearby" "pos"
> print(res$pos)
[1] "19:11200093" "19:11200213" "19:11200235" "19:11200272" "19:11200282" "19:11200309" "19:11200412" "19:11200419"
[9] "19:11200431" "19:11200442" "19:11200475" "19:11200508" "19:11200514" "19:11200557" "19:11200579" "19:11200728"
[17] "19:11200753" "19:11200754" "19:11200806" "19:11200839" "19:11200840" "19:11200896" "19:11201124" "19:11201259"
[25] "19:11201274" "19:11201275"
> print(res$p.value)
[1] 0.551263675 0.056308558 0.172481571 0.734935815 0.922326732 0.053804524 0.886985353 0.903835162 0.005280228 0.266575301
[11] 0.196122312 0.157114376 0.951477852 0.840523624 0.759482777 0.112743041 0.414147263 0.825877149 0.006090142 0.096474975
[21] 0.096474975 0.956407850 0.038234190 0.253512486 0.550935361 0.482315038
 
== A Simple Tutorial for Using the rareMETALS.single.group function ==
 
Dataset used to get the refaltList [[Media:groupFile.txt.gz]]
 
res.site<-read.table("groupFile.txt",header=T)
refaltList <- list(pos=paste(res.site[,1],res.site[,2],sep=":"),ref=res.site$AF,alt=res.site$ALT,af=res.site$AF,af.diff.max=0.5,checkAF=T)
res31<-rareMETALS.single.group(score.stat.file,cov.file=NULL, range="19:11200093-11201275", refaltList,
alternative = c("two.sided"), callrate.cutoff = 0,
hwe.cutoff = 0, correctFlip = TRUE, analyzeRefAltListOnly = TRUE)
 
###result can be explored as below###
> names(res31)
[1] "p.value" "ref" "alt" "integratedData" "raw.data"
[6] "clean.data" "statistic" "direction.by.study" "anno" "maf"
[11] "maf.byStudy" "maf.maxdiff.vec" "ix.maf.maxdiff.vec" "maf.sd.vec" "no.sample.mat"
[16] "no.sample" "beta1.est" "beta1.sd" "QC.by.study" "hsq.est"
[21] "nearby" "cochranQ.stat" "cochranQ.df" "cochranQ.pVal" "I2"
[26] "log.mat" "pos"
> print(res31$pos)
[1] "19:11200093" "19:11200213" "19:11200235" "19:11200272" "19:11200282" "19:11200309" "19:11200412" "19:11200419"
[9] "19:11200431" "19:11200442" "19:11200475" "19:11200508" "19:11200514" "19:11200557" "19:11200579" "19:11200728"
[17] "19:11200753" "19:11200754" "19:11200806" "19:11200839" "19:11200840" "19:11200896" "19:11201124" "19:11201259"
[25] "19:11201274" "19:11201275"
> print(res31$p.value)
[1] NA NA NA NA 0.9223267 NA NA NA NA NA NA NA
[13] NA NA NA NA NA NA NA NA NA NA NA NA
[25] NA NA
== A Simple Tutorial for Using the rareMETALS.range function ==
More detailed results can be found in a list res$res.list
 
== A Simple Tutorial for Using the rareMETALS.range.group function ==
 
res32<-rareMETALS.range.group(score.stat.file, cov.file, range="19:11200093-11201275", range.name="LDLR",
test = "GRANVIL", refaltList, maf.cutoff = 1,
alternative = c("two.sided"), out.digits = 4,
callrate.cutoff = 0, hwe.cutoff = 0, max.VT = NULL,
correctFlip = TRUE, analyzeRefAltListOnly = TRUE)
print(res32$res.out)
 
gene.name.out N.out p.value.out statistic.out no.site.out beta1.est.out beta1.sd.out maf.cutoff.out
[1,] "LDLR" "2504" "0.8629" "0.0298" "1" "0.1764" "1.044" "1"
direction.burden.by.study.out direction.meta.single.var.out top.singlevar.pos top.singlevar.refalt top.singlevar.pval
[1,] "+-" "+" "19:11200282" "3/1" "0.8629"
top.singlevar.af pos.ref.alt.out
[1,] "0.000599" "19:11200282/G/A"
== A Simple Tutorial for Using the conditional.rareMETALS.single==
Conditional analysesIt is well known that, which owing to linkage disequilibrium, one or more common causal variants can assess whether rare-variant associations are affected by result in shadow association signals at other nearby significant common or rare variants, use RareMETALS to perform conditional analysis for single variant tests
example:
res<-conditional.rareMETALS.single(candidate.variant.vec=c("19:11200282","19:11200309"), score.stat.file, cov.file,
known.variant.vec=c("19:11200754","19:11200806","19:11200839"), maf.cutoff=0.05, no.boot =1000,
alternative = c("two.sided", "greater", "less"), ix.gold = 1,
out.digits = 4, callrate.cutoff = 0, hwe.cutoff = 0,
p.value.known.variant.vec = NA, anno.known.variant.vec = NA,
POS REF ALT PVALUE AF BETA_EST BETA_SD DIRECTION_BY_STUDY ANNO POS_REF_ALT_ANNO_KNOWN
[1,] "19:11200282" "G" "A" "0.5825" "0.000599" "0.5616" "1.044" "-=" "N/A" "19:11200754/G/A/NA,19:11200806/C/T/NA,19:11200839/T/A/NA" [2,] "19:11200309" "C" "A" "0.01484" "0.01538" "-0.3615" "0.02201" "+=" "N/A" "19:11200754/G/A/NA,19:11200806/C/T/NA,19:11200839/T/A/NA"
res<-conditional.rareMETALS.range(range.name = "LDLR", score.stat.file, cov.file,
candidate.variant.vec=c("119:12311200282","119:123411200309"), known.variant.vec=c("119:11200754","19:12311200806","119:123411200839"), test = "GRANVIL", maf.cutoff=0.05, alternative = c("two.sided", "greater", "less"), ix.gold = 1,
out.digits = 4, callrate.cutoff = 0, hwe.cutoff = 0, max.VT = NULL)
print(res$res.out)  gene.name.out p.value.out statistic.out no.site.out beta1.est.out beta1.sd.out maf.cutoff.out direction.burden.by.study.out direction.meta.single.var.out [1,] "LDLR" "0.01961" "5.446" "2" "-0.3429" "0.1469" "0.05" "-?" "+-" top.singlevar.pos top.singlevar.refalt top.singlevar.pval top.singlevar.af pos.ref.alt.out pos.ref.alt.known.out [1,] "19:11200309" "C/A" "0.01484" "0.01538" "19:11200282/G/A,19:11200309/C/A" "19:11200754/G/A,19:11200806/C/T,19:11200839/T/A" More detailed results can be found in a list res$res.list
131
edits

Navigation menu