Difference between revisions of "Tutorial: RAREMETAL"

From Genome Analysis Wiki

Revision as of 01:12, 16 January 2014

Useful Wiki Pages

There are a few pages in this Wiki that may be useful to RAREMETAL users. Here are links to a few:

Introduction

In this tutorial, we will use RAREMETAL to perform single variant and gene-level meta-analysis using summary statistics of two example studies generated by RAREMETALWORKER.

STEP 1: Install Software and Download Example Data Sets

  • If RAREMETAL and RAREMETALWORKER are not yet installed on your computer, download the software packages and follow the installation instructions for RAREMETAL and RAREMETALWORKER.
  • Then download the tutorial package, which includes example data sets and results.
    • To unpack the example data set, use the following two Unix commands:
 tar xvzf raremetal_tutorial.tar.gz 
 cd raremetal_tutorial

STEP 2: Analyze individual samples using RAREMETALWORKER

  • The first example has 743 individuals coded as unrelated according to the PED file (each person is in a family of their own).
  • There are ~1000 markers included in the VCF file.
  • To account for relatedness among the samples in this analysis, an empirical kinship matrix should be calculated.
  • Change to the input directory and execute the following command, replacing $yourpathforRAREMETALWORKER with your RAREMETALWORKER installation path:
 cd raremetalworker/inputfiles
 $yourpathforRAREMETALWORKER/bin/raremetalworker  --ped example1.ped --dat example1.dat --vcf example1.vcf.gz \
                                --traitName QT1 --inverseNormal --makeResiduals --kinSave --kinGeno --prefix ../yourprefix.example1 --labelHits
  • The command above estimates relatedness from the genotypes of common variants with good calling quality, adjusts for covariates, and quantile-normalizes the residuals before testing association with trait QT1. The following output files are generated:
raremetalworker/yourprefix.example1.QT1.singlevar.score.txt (## summary statistics, single variant association results, QC statistics, Genomic control, etc.)
raremetalworker/yourprefix.example1.QT1.singlevar.cov.txt (## covariance matrices of score statistics)
raremetalworker/yourprefix.example1.plots.pdf (## QQ plots and manhattan plots)
raremetalworker/yourprefix.example1.Empirical.Kinship.gz (## empirical kinship of all individuals who have at least one site genotyped, with IDs in the top row)
raremetalworker/yourprefix.example1.singlevar.log
  • Following the same strategy, example2 can be analyzed using a similar command:
 $yourpathforRAREMETALWORKER/bin/raremetalworker  --ped example2.ped --dat example2.dat --vcf example2.vcf.gz \
                                --traitName QT1 --inverseNormal --makeResiduals --kinSave --kinGeno --prefix yourprefix.example2 --labelHits
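
Before moving on to the meta-analysis, it may be worth confirming that each run produced its score and covariance files. The sketch below is only an illustration: the helper name check_worker_outputs is hypothetical, and it assumes the output naming shown above (PREFIX.TRAIT.singlevar.score.txt and PREFIX.TRAIT.singlevar.cov.txt).

```shell
# Hypothetical helper: report which RAREMETALWORKER outputs are missing.
# Usage: check_worker_outputs PREFIX TRAIT
# Returns 0 if both the score and covariance files exist, 1 otherwise.
check_worker_outputs() {
    prefix=$1
    trait=$2
    status=0
    for kind in score cov; do
        file="${prefix}.${trait}.singlevar.${kind}.txt"
        if [ -f "$file" ]; then
            echo "found:   $file"
        else
            echo "missing: $file"
            status=1
        fi
    done
    return $status
}

# Example call, run from raremetalworker/inputfiles:
#   check_worker_outputs ../yourprefix.example1 QT1
```

The function only checks that the files exist; it does not inspect their contents.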

STEP 3: Run RAREMETAL for Meta-Analysis

  • In this step, we run RAREMETAL to meta-analyze the two studies without using any raw data.
  • Prepare the RAREMETALWORKER results for meta-analysis using the following commands:
 bgzip yourprefix.example1.QT1.singlevar.score.txt
 tabix -c "#" -s 1 -b 2 -e 2 yourprefix.example1.QT1.singlevar.score.txt.gz
 bgzip yourprefix.example1.QT1.singlevar.cov.txt
 tabix -c "#" -s 1 -b 2 -e 2 yourprefix.example1.QT1.singlevar.cov.txt.gz
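
The same compression and indexing steps are needed for each study. A minimal sketch that loops over both examples, assuming the example2 files follow the same naming as example1 and sit in the current directory (the function name prepare_for_meta is hypothetical):

```shell
# Hypothetical helper: bgzip-compress and tabix-index one summary file,
# using the same tabix options as the commands above.
# Files that do not exist are skipped with a message.
prepare_for_meta() {
    file=$1
    if [ ! -f "$file" ]; then
        echo "skipping (not found): $file"
        return 0
    fi
    bgzip "$file" &&
        tabix -c "#" -s 1 -b 2 -e 2 "$file.gz"
}

# Process the score and covariance files of both studies.
for study in example1 example2; do
    for kind in score cov; do
        prepare_for_meta "yourprefix.${study}.QT1.singlevar.${kind}.txt"
    done
done
```

Skipping missing files keeps the loop safe to re-run after some files have already been compressed.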
  • Before running the analysis, open raremetal/summaryfiles and replace the path prefix with the correct path on your system. The file should look like:
 $yourpath/raremetal_tutorial/raremetalworker/output/example1.QT1.singlevar.score.txt.gz
 $yourpath/raremetal_tutorial/raremetalworker/output/example1.QT2.singlevar.score.txt.gz
  • Then open raremetal/covfiles and replace the path prefix in the same way. The file should look like:
 $yourpath/raremetal_tutorial/raremetalworker/output/example1.QT1.singlevar.cov.txt.gz
 $yourpath/raremetal_tutorial/raremetalworker/output/example1.QT2.singlevar.cov.txt.gz
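
Rather than editing the two list files by hand, they could also be regenerated from whatever compressed files are present in the output directory. The following is a sketch only: write_file_list is a hypothetical helper, and the directory in the usage comment is the one shown in the listings above.

```shell
# Hypothetical helper: write the absolute path of every file in DIR whose
# name ends in SUFFIX to LISTFILE, one path per line.
# Usage: write_file_list DIR SUFFIX LISTFILE
write_file_list() {
    dir=$(cd "$1" && pwd) || return 1
    suffix=$2
    listfile=$3
    : > "$listfile"                      # start from an empty list
    for f in "$dir"/*"$suffix"; do
        [ -f "$f" ] || continue          # the glob matched nothing
        printf '%s\n' "$f" >> "$listfile"
    done
}

# Example calls (paths follow the tutorial layout):
#   out=$yourpath/raremetal_tutorial/raremetalworker/output
#   write_file_list "$out" .singlevar.score.txt.gz raremetal/summaryfiles
#   write_file_list "$out" .singlevar.cov.txt.gz   raremetal/covfiles
```

Because the shell sorts glob matches, both lists come out in the same file-name order.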
  • Now we are ready for meta-analysis. To perform single variant and four types of gene-level meta-analysis all at once, use the following command line (--hwe and --callRate are QC options; --longOutput, --tabulateHits, --hitsCutoff and --prefix are output options):
 cd raremetal
 $yourRAREMETALpath/bin/raremetal --summaryFiles summaryfiles --covFiles covfiles --groupFile group.file --SKAT --burden --MB --VT \
                                  --hwe 1.0e-05 --callRate 0.95 \
                                  --longOutput --tabulateHits --hitsCutoff 1e-05 --prefix myresult/tutorial.example.QT1 \
                                  --labelHits
  • The following output files are generated:
 myresult/tutorial.example.QT1.meta.plots.pdf (## QQ plots and manhattan plots from both single variant and gene-level meta-analysis with hits labeled)
 myresult/tutorial.example.QT1.meta.singlevar.results 
 myresult/tutorial.example.QT1.meta.burden.results
 myresult/tutorial.example.QT1.meta.SKAT.results
 myresult/tutorial.example.QT1.meta.VT.results
 myresult/tutorial.example.QT1.meta.MB.results
 myresult/tutorial.example.QT1.meta.tophits.SKAT.tbl
 myresult/tutorial.example.QT1.meta.tophits.VT.tbl
 myresult/tutorial.example.QT1.meta.tophits.burden.tbl
 myresult/tutorial.example.QT1.meta.tophits.MB.tbl
 myresult/tutorial.example.QT1.raremetal.log
  • To do conditional analysis, add --condition conditionfile to the command, where conditionfile contains the variant to condition on, in the following format:
 9:505484545:C:T
    Extra columns with the conditional analysis results will then be saved in the output file "myresult/tutorial.example.QT1.meta.singlevar.results".
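
As a sketch of the conditional run, the condition file can be written from the shell and passed with --condition. The file name conditionfile and the variant come from the text above, and the other options repeat the earlier meta-analysis command. The variable yourRAREMETALpath stands for your RAREMETAL installation directory, and the guard that skips the run when the binary is not found is an addition for illustration.

```shell
# Write the variant to condition on, in the format shown above.
printf '%s\n' '9:505484545:C:T' > conditionfile

# Re-run the meta-analysis with --condition added.
# The run is skipped when the raremetal binary cannot be found.
raremetal_bin="${yourRAREMETALpath:-.}/bin/raremetal"
if [ -x "$raremetal_bin" ]; then
    "$raremetal_bin" --summaryFiles summaryfiles --covFiles covfiles \
        --groupFile group.file --SKAT --burden --MB --VT \
        --hwe 1.0e-05 --callRate 0.95 \
        --longOutput --tabulateHits --hitsCutoff 1e-05 \
        --condition conditionfile \
        --prefix myresult/tutorial.example.QT1 --labelHits
else
    echo "raremetal not found at $raremetal_bin; skipping run"
fi
```

Run this from the raremetal directory so that summaryfiles, covfiles and group.file resolve as in the earlier command.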
  • Please refer to the RAREMETAL documentation for a detailed description of the output format.
  • RAREMETAL also takes an annotated VCF as input to parse variant grouping information. Please refer to the software documentation for details.
  • RAREMETAL can also write a VCF file containing the superset of all variants (option --writeVCF), so that users can annotate it with their preferred annotation tool and then return to RAREMETAL for gene-level meta-analysis.