Changes

From Genome Analysis Wiki
Jump to navigationJump to search
10,059 bytes added ,  15:11, 29 June 2015
Line 20: Line 20:     
I have created a '''google group''' for discussion on the usage and for bug reports etc. If you find any issues to the package and think that the discussions may also benefit others, please post them to the user group. Here is the link to the discussion group https://groups.google.com/forum/#!forum/raremetals
 
I have created a '''google group''' for discussion on the usage and for bug reports etc. If you find any issues to the package and think that the discussions may also benefit others, please post them to the user group. Here is the link to the discussion group https://groups.google.com/forum/#!forum/raremetals
 +
 +
== Exemplar Datasets ==
 +
 +
The following exemplar datasets [[Media:ExampleDataaset.zip | ExampleDataset.zip]] can be downloaded and tested with rareMETALS2 package.
 +
* ''Score Statistics File''
 +
* ''Covariance Matrix File''
 +
* ''Tabix index file''
 +
These files are all automatically generated by rvtests.
    
== Meta-analysis of Single Variant Associations ==
 
== Meta-analysis of Single Variant Associations ==
   −
   rareMETALS2.single <- function(score.stat.file,range,alternative=c('two.sided','greater','less'),ix.gold=1,callrate.cutoff=0,hwe.cutoff=0,hwe.ctrl.cutoff=0)
+
'''A Simple Tutorial for Using the rareMETALS2.single function'''
  * ''score.stat.file'' files of score statistics                                                                                                           
+
 
  * ''range tabix'' range of variants to be analyzed                                                                                                       
+
'''Example:'''
  * ''alternative'' alternative hypothesis to be specified                                                                                                 
+
 
  * ''ix.gold'' Gold standard population to align reference allele to.                                                                                     
+
Use the ExampleDataset above which is a set of single variant score statistics and their covariance matrices.
   * ''callrate.cutoff'' Cutoffs of call rate, lower than which will NOT be analyzed (labelled as missing)                                                   
+
 
  * ''hwe.cutoff'' Cutoffs of HWE p-values; Variants with HWE p-value smaller than the cutoffs are removed from subsequent analysis and labelled as missing;                                                                                                                                                           
+
cov.file <- c("example1.MetaCov.assoc.gz","example2.MetaCov.assoc.gz")
  * ''hwe.ctrl.cutoff'' Cutoffs of HWE p-values using controls; Variants with HWE p-value smaller than the cutoffs are removed from subsequent analysis and labelled as missing; In case control studies, it is recommended to use hwe.ctrl.cutoff, since large effect variants may violate HWE.
+
score.stat.file <- c("example1.MetaScore.assoc.gz","example2.MetaScore.assoc.gz")
 +
 
 +
library(rareMETALS2)
 +
res <- rareMETALS2.single(score.stat.file,range="19:42906914-45854919",alternative="two.sided",ix.gold=1,callrate.cutoff=0,hwe.cutoff=0)
 +
###result can be explored as below###
 +
> names(res)
 +
  [1] "p.value"            "ref"                "alt"             
 +
  [4] "integratedData"    "raw.data"          "clean.data"       
 +
   [7] "statistic"          "direction.by.study" "anno"             
 +
[10] "af"                "afCase"            "afCtrl"           
 +
[13] "QC.by.study"        "no.sample"          "no.case"         
 +
[16] "no.ctrl"            "beta1.est"          "beta1.sd"         
 +
[19] "hsq.est"            "nearby"            "pos" 
 +
> print(res$pos) ###the first 25 position###
 +
  [1] "19:42906914" "19:42907064" "19:42907171" "19:42909626" "19:42909645"
 +
  [6] "19:42910407" "19:42910445" "19:42910454" "19:42911617" "19:42911769"
 +
  [11] "19:42911869" "19:42911877" "19:42912216" "19:42912223" "19:42912229"
 +
  [16] "19:42912244" "19:42912441" "19:42914568" "19:42914587" "19:42914626"
 +
  [21] "19:42914665" "19:42914668" "19:42914682" "19:42914703" "19:42914772"
 +
> print(res$p.value) ###the P-value of the first 25 position###
 +
  [1]          NA 0.311755625          NA 0.793836855 0.292744822 0.393802545
 +
  [7]          NA 0.175761040 0.106303677          NA 0.217054457          NA
 +
  [13] 0.063036490 0.516061931 0.089290910          NA          NA          NA
 +
  [19]          NA 0.364595966          NA 0.482571300          NA 0.434685495
 +
  [25]          NA
 +
 
 +
'''Function Detail:'''
 +
  rareMETALS2.single <- function(score.stat.file,range,alternative=c('two.sided','greater','less'),ix.gold=1,callrate.cutoff=0,hwe.cutoff=0,hwe.ctrl.cutoff=0)
 +
'''Relevant Parameters''':
 +
* '''score.stat.file''' files of score statistics                                                                                                           
 +
* '''range tabix''' range of variants to be analyzed                                                                                                       
 +
* '''alternative''' alternative hypothesis to be specified                                                                                                 
 +
* '''ix.gold''' Gold standard population to align reference allele to.                                                                                     
 +
* '''callrate.cutoff''' Cutoffs of call rate, lower than which will NOT be analyzed (labelled as missing)                                                 
 +
* '''hwe.cutoff''' Cutoffs of HWE p-values; Variants with HWE p-value smaller than the cutoffs are removed from subsequent analysis and labelled as missing;                                                                                                                                                         
 +
* '''hwe.ctrl.cutoff''' Cutoffs of HWE p-values using controls; Variants with HWE p-value smaller than the cutoffs are removed from subsequent analysis and labelled as missing; In case control studies, it is recommended to use hwe.ctrl.cutoff, since large effect variants may violate HWE.
 +
 
 +
== Meta-analysis of Gene-level Association ==
 +
 
 +
'''A Simple Tutorial for Using the rareMETALS2.range function'''
 +
 
 +
'''Example''':
 +
res1 <- rareMETALS2.range(score.stat.file,cov.file,range="19:42906914-45854919",range.name="LDLR",test = "GRANVIL",maf.cutoff = 0.05,alternative = c("two.sided"),ix.gold = 1,out.digits = 4,callrate.cutoff = 0,hwe.cutoff = 0,max.VT = NULL)
 +
> print(res1$res.out)
 +
    gene.name.out p.value.out statistic.out no.site.out beta1.est.out
 +
[1,] "LDLR"        "0.5575"    "0.344"      "332"      "-0.01899" 
 +
    beta1.sd.out maf.cutoff.out direction.burden.by.study.out
 +
[1,] "0.03237"    "0.05"        "+-"                       
 +
      direction.meta.single.var.out                                                                                                                                                                                                                                                                                                               
 +
[1,]  "++--+++-+--++-++-++---++-++-+----++-+++-++-++--+--++-++----++-++++++++---+++-------+++--++-++---++----++--+--+-+-++--+++++++-++-++--
 +
        --+-+++----+++----++---+---+---++++---+-+--+--+++--+--+-++--++-+-+++---+++++-----++--+-++---+----++-+-++--+-+++-++-+++-+--++---+---
 +
        +++++-+----+++ -+--++---+---++++---+++---+-+++++----+---+-+----+++-++-"
 +
    top.singlevar.pos top.singlevar.refalt top.singlevar.pval top.singlevar.af
 +
[1,] "19:45662295"    "G/A"                "7.804e-15"        "0.02303"     
 +
    pos.ref.alt.out 
 +
[1,] "19:42907064/A/G,19:42909626/C/T,19:42909645/C/T,19:42910407/C/T,19:42910454/C/T,19:42911617/T/C,19:42911869/C/T,19:42912216/C/T,19:42912223/C/T,
 +
      19:42912229/C/T,19:42914626/C/T,19:42914668/G/A,19:42914703/C/T,19:42914880/C/T,19:42930652/C/T,19:42930721/G/A,19:42931022/C/T,19:43013329/G/T,19:43031219/T/C,
 +
      19:43093001/G/A,19:43093011/C/T,19:43093026/T/C,19:43097692/C/T,19:43097788/C/T,19:43243059/C/T,19:43268142/G/T,19:43411959/C/G,19:43411997/G/A,19:43420322/C/T,
 +
      19:43420408/G/C,19:43420415/G/A,19:43420565/C/T,19:43433773/C/T,19:43439628/A/G,
 +
 
 +
'''Function Detail:'''
 +
  rareMETALS2.range <- function(score.stat.file,cov.file,range,range.name,test='GRANVIL',maf.cutoff=1,alternative=c('two.sided','greater','less'),
 +
   ix.gold=1,out.digits=4,callrate.cutoff=0,hwe.cutoff=0,hwe.ctrl.cutoff=0,max.VT=NULL)
 +
 
 +
* '''score.stat.file''' files of score statistics                                                                                                         
 +
* '''cov.file''' covariance matrix files                                                                                                                 
 +
* '''range''' tabix range for each gene/region                                                                                                           
 +
* '''range.name''' The name of the range,e.g. gene names can be used                                                                                     
 +
* '''test''' rare variant tests to be used                                                                                                               
 +
* '''maf.cutoff''' MAF cutoff used to analyze variants                                                                                                   
 +
* '''alternative''' alternative hypothesis to be specified                                                                                               
 +
* '''ix.gold''' Gold standard population to align reference allele to                                                                                     
 +
* '''out.digits''' Number of digits used in the output                                                                                                   
 +
* '''callrate.cutoff''' Cutoffs of call rate, lower than which will NOT be analyzed (labelled as missing)                                                   
 +
* '''hwe.cutoff''' Cutoffs of HWE p-values; Variants with HWE p-value smaller than the cutoffs are removed from subsequent analysis and labelled as missing;                                                                                                                                                           
 +
* '''hwe.ctrl.cutoff''' Cutoffs of HWE p-values using controls; Variants with HWE p-value smaller than the cutoffs are removed from subsequent analysis and labelled as missing; In case control studies, it is recommended to use hwe.ctrl.cutoff, since large effect variants may violate HWE.
 +
* '''max.VT''' The maximum number of thresholds used in VT; Setting max.VT to 10 can improve the speed for calculation without affecting the power too much. The default parameter is NULL, which does not set upper limit on the number of variable frequency threhsold.
 +
 
 +
== Conditional Meta-analysis ==
 +
 
 +
'''A Simple Tutorial for Using the conditional.rareMETALS2.single'''
 +
 
 +
'''Example'''
 +
 
 +
  res2<-conditional.rareMETALS2.single(candidate.variant.vec=c("19:42906914","19:45854819"), score.stat.file, cov.file,known.variant.vec=c("19:43995275","19:44047839","19:44084155"), maf.cutoff=0.05, no.boot =1000,
 +
                            alternative = c("two.sided"), ix.gold = 1,out.digits = 4, callrate.cutoff = 0, hwe.cutoff = 0,p.value.known.variant.vec = NA, anno.known.variant.vec = NA,anno.candidate.variant.vec = NA)
 +
  > print(res2$res.out)
 +
    POS          REF ALT N:N_CASE:N_CTRL  PVALUE AF:AF_CASE:AF_CTRL BETA_EST
 +
[1,] "19:42906914" "G" "T" "4012:2008:2004" "NaN"  "0:0:0"            "NaN" 
 +
[2,] ""            ""  ""  "4012:2008:2004" NA    "0:NA:NA"          NA     
 +
    BETA_SD DIRECTION_BY_STUDY ANNO
 +
[1,] "Inf"  "=="              "N/A"
 +
[2,] NA      NA                "" 
 +
    POS_REF_ALT_ANNO_KNOWN                             
 +
[1,] "0"                                                 
 +
[2,] "19:43995275/NA/NA,19:44047839/C/T,19:44084155/NA/NA"
 +
 
 +
 
 +
  '''Function Detail:'''
 +
                                                                                                                             
 +
  conditional.rareMETALS2.single <- function(candidate.variant.vec,score.stat.file,cov.file,known.variant.vec,maf.cutoff,no.boot=10000,alternative=c('two.sided','greater','less'),
 +
  ix.gold=1,out.digits=4,callrate.cutoff=0,hwe.cutoff=0,hwe.ctrl.cutoff=0,p.value.known.variant.vec=NA,anno.known.variant.vec=NA,anno.candidate.variant.vec=NA)
 +
 
 +
* '''candidate.variant''' Position of candidate variant 
 +
* '''score.stat.file''' Files of score statistics                                                                                                         
 +
* '''cov.file''' Covariance matrix files                                                                                                                 
 +
* '''known.variant.vec''' Range of candidate variant, expressed in a vector, e.g. c("1:12345","1:234567");                                               
 +
* '''test''' test of rare variant tests                                                                                                                   
 +
* '''maf.cutoff''' Cutoffs of MAF used for determining rare variants                                                                                     
 +
* '''alternative''' Alternative hypothesis to be tested                                                                                                   
 +
* '''out.digits''' The number of digits used in the output                                                                                               
 +
* '''callrate.cutoff''' Cutoff of call rates. Sites with callrates lower than the cutoff will be labeled as missing                                       
 +
* '''hwe.cutoff''' Cutoff of HWE p-values. Sites with HWE pvalues lower than the cutoff will be labeled as missing
98

edits

Navigation menu