Tutorial: RAREMETAL

From Genome Analysis Wiki

Useful Wiki Pages

There are a few pages in this Wiki that may be useful to rareMETAL users. Here are links to a few:

Introduction

In this tutorial, we will use RareMetalWorker and RareMETAL to perform a simple rare variant meta analysis. RareMetalWorker generates summary statistics that can be shared to enable meta analysis of gene-level association tests. RareMETAL takes the files generated by RareMetalWorker as input and performs both single variant and gene-level association meta analysis.

STEP 1: Install Software and Download Example Data Sets

  • If RAREMETAL and RAREMETALWORKER have not been installed on your local computer yet, that is the first step! Installation instructions are available for both (RAREMETAL and RAREMETALWORKER).

  • Then please download the tutorial package, including example data sets and results.
  • In this tutorial, we will use a simple example dataset, which is available here
    • To unpack the example dataset, you can use the following two Unix commands:
 tar xvzf raremetalworker.tutorial.tgz 
 cd rmw_tutorial 
  • Go to the directory where the tarball was saved, then extract it:
 tar xvzf raremetal.tutorial.tgz #extract
 cd raremetal_tutorial
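
After unpacking, it can be worth confirming that the input files used in STEP 2 are actually present. The following is a minimal sketch, not part of the tutorial package: `missing_inputs` is a hypothetical helper name, and the file list is taken from the commands in STEP 2 (example1 and example2, each with a .ped, .dat, and .vcf.gz file under inputfiles).

```shell
# Print each expected RareMetalWorker input file that is missing under the
# given tutorial directory; prints nothing when all six files are present.
# The helper name and the check itself are illustrative, not part of RAREMETAL.
missing_inputs() {
    dir=$1
    for study in example1 example2; do
        for ext in ped dat vcf.gz; do
            f="$dir/inputfiles/$study.$ext"
            [ -f "$f" ] || printf '%s\n' "$f"
        done
    done
}

# Assumes the archive was unpacked into ./rmw_tutorial.
missing_inputs rmw_tutorial
```

An empty result means every input referenced in STEP 2 was found; any path that is printed should be checked before running RareMetalWorker.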

STEP 2: Run RareMetalWorker on Individual Studies

Example 1

  • The first example has 743 individuals coded as unrelated in the PED file (each person belongs to a separate family).
  • There are ~1000 markers included in the VCF file.
  • To analyze this sample while accounting for hidden relatedness, an empirical kinship matrix should be calculated.
  • Go to $yourPath/bin/ and execute the following command:
 $yourPath/bin/raremetalworker  --ped rmw_tutorial/inputfiles/example1.ped \
                                --dat rmw_tutorial/inputfiles/example1.dat \
                                --vcf rmw_tutorial/inputfiles/example1.vcf.gz \
                                --prefix rmw_tutorial/output/example1 \
                                --traitName QT1 --inverseNormal --makeResiduals --kinSave --kinGeno

The following command allows covariates to be adjusted for and residuals to be inverse normalized.

Example 2

  • The second sample can also be analyzed in the same fashion using the following command:
 $yourPath/bin/raremetalworker --ped $yourLocalPath/rmw_tutorial/inputfiles/example2.ped \
        --dat $yourLocalPath/rmw_tutorial/inputfiles/example2.dat \
        --vcf $yourLocalPath/rmw_tutorial/inputfiles/example2.vcf.gz \
        --kinGeno --kinSave --traitName LDL --inverseNormal --makeResiduals --useCovariates \
        --prefix $yourLocalPath/rmw_tutorial/outputfiles/example2
  • After the two runs are finished, you will see the following output files under your current path:
 example1.QT1.singlevar.score.txt
 example1.QT1.singlevar.cov.txt
 example2.QT1.singlevar.score.txt
 example2.QT1.singlevar.cov.txt
  • The output file ending with singlevar.score.txt includes summary statistics of single marker score tests.
  • The output file ending with singlevar.cov.txt includes summary variance-covariance matrices of score statistics.

STEP 3: Run RareMETAL to do Meta Analysis

  • A list of studies to be included is an essential piece of information for RareMETAL to run.
  • First, modify the example.studyname file so that the output files of RareMetalWorker are reachable by RareMETAL.
 cd $yourPath/raremetal_tutorial/inputfiles 
  • Open example.studyname and edit it so that it contains the following lines:
 $yourLocalPath/rmw_tutorial/outputfiles/example1.LDL
 $yourLocalPath/rmw_tutorial/outputfiles/example2.LDL
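
The study list can also be written from the shell instead of being edited by hand. This is a minimal sketch under stated assumptions: `write_study_list` is a hypothetical helper (not a RareMETAL command), and the list is assumed to hold one RareMetalWorker output prefix per line, as in the two lines above.

```shell
# Write one RareMetalWorker output prefix per line into a study list file.
# Arguments: the list file to create, then one or more prefixes.
# Hypothetical helper for illustration; RareMETAL itself only reads the file.
write_study_list() {
    list=$1
    shift
    printf '%s\n' "$@" > "$list"
}

# Example call, using the tutorial's placeholder path:
# write_study_list example.studyname \
#     "$yourLocalPath/rmw_tutorial/outputfiles/example1.LDL" \
#     "$yourLocalPath/rmw_tutorial/outputfiles/example2.LDL"
```

Replace $yourLocalPath with the directory where the RareMetalWorker output was written before uncommenting the example call.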
  • If gene-level meta analysis is wanted, annotation information or groups of variants are necessary. RareMETAL can read this information from a group file.
  • An example group file is provided at:
 $yourLocalPath/raremetal_tutorial/inputfiles/nonsyn.stop.splice.groupfile
  • RareMETAL can also take an annotated VCF as input and parse variant grouping information from it. Please refer to the software documentation for details on grouping from an annotated VCF.
  • RareMETAL can filter the single variants included in the meta analysis using the QC information summarized by RareMetalWorker, including HWE p-value and genotype call rate.
  • Finally, to meta-analyze the two samples above using summary statistics, run the following command. It generates results from single variant meta analysis and from gene-level meta analysis using SKAT, the Madsen-Browning burden test, the simple burden test, and the variable threshold burden test.
 $yourPath/bin/raremetal --studyName $yourLocalPath/raremetal_tutorial/inputfiles/example.studyname \
    --groupFile $yourLocalPath/raremetal_tutorial/inputfiles/nonsyn.stop.splice.groupfile \
    --SKAT --VT --burden --MB --maf 0.05 --hwe 1.0e-05 --callRate 0.95 \
    --prefix $yourLocalPath/raremetal_tutorial/results/
  • To generate longer output and a tabulated report of hits, use the following command instead:
 $yourPath/bin/raremetal --studyName $yourLocalPath/raremetal_tutorial/inputfiles/example.studyname \
    --groupFile $yourLocalPath/raremetal_tutorial/inputfiles/nonsyn.stop.splice.groupfile \
    --SKAT --VT --burden --MB --maf 0.05 --hwe 1.0e-05 --callRate 0.95 \
    --longOutput --tabulateHits --hitsCutoff 1.0e-05 --prefix $yourLocalPath/raremetal_tutorial/results/
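
Before launching either command, it can help to confirm that every entry in the study list resolves to RareMetalWorker output. The sketch below is illustrative only: `check_study_list` is a hypothetical helper, and it assumes that each line of the list is a prefix (such as .../example1.LDL) whose files end in .singlevar.score.txt and .singlevar.cov.txt, following the file names shown in STEP 2.

```shell
# Report study-list entries whose score or covariance file cannot be found.
# Assumption (not confirmed by the RareMETAL documentation here): each line
# of the list is a prefix, and the matching files are
#   <prefix>.singlevar.score.txt and <prefix>.singlevar.cov.txt
# Returns 0 when every file exists, 1 otherwise.
check_study_list() {
    list=$1
    rc=0
    while IFS= read -r prefix || [ -n "$prefix" ]; do
        [ -n "$prefix" ] || continue    # skip blank lines
        for suffix in singlevar.score.txt singlevar.cov.txt; do
            if [ ! -f "$prefix.$suffix" ]; then
                printf 'missing: %s\n' "$prefix.$suffix"
                rc=1
            fi
        done
    done < "$list"
    return "$rc"
}

# Example call, using the tutorial's placeholder path:
# check_study_list "$yourLocalPath/raremetal_tutorial/inputfiles/example.studyname"
```

If any file is reported as missing, check the trait name and the --prefix value used in the corresponding RareMetalWorker run.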
  • Please refer to the documentation (RareMETAL Results) for a detailed description of the output format.
  • RareMETAL can also write a VCF file containing the superset of all variants, so that users can annotate it with their favorite annotation tool and then return to RareMETAL for the gene-level meta analysis. The option to use is --writeVCF. Please refer to Write VCF and Annotated outside RareMETAL.