Tutorial: RAREMETAL
From Genome Analysis Wiki
Introduction
This tutorial shows how to run RareMetalWorker and RareMETAL to perform a meta-analysis on example data sets.
RareMetalWorker is the pre-processing tool: it analyzes data from individual studies and generates the summary statistics that RareMETAL needs.
RareMETAL is the tool that performs the meta-analysis itself, both single-variant and gene-level.
For detailed documentation of both tools, see the following:
- RareMetalWorker Documentation
- RareMETAL Documentation
STEP 1: Install Software and Download Example Data Sets
- If RareMETAL and RareMetalWorker are not yet installed on your computer, follow the installation instructions: Install RareMetalWorker, Install RareMETAL
- Download the example data set for RareMetalWorker to your local drive: RareMetalWorker example data sets
- Go to the directory where the tarball was saved, then extract it:
tar xvzf raremetalworker.tutorial.tgz #extract
cd rmw_tutorial
- Download the example data set for RareMETAL to your local drive: RareMETAL example data sets
- Go to the directory where the tarball was saved, then extract it:
tar xvzf raremetal.tutorial.tgz #extract
cd raremetal_tutorial
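The two extraction steps above follow the same pattern, so they can be wrapped in a small helper. The sketch below is only an illustration: `extract_tutorial` is a hypothetical function name, and the destination directory in the usage lines is an assumption, not a location the tutorial prescribes.

```shell
#!/bin/sh
# Hypothetical helper (not part of RareMETAL): unpack a tutorial tarball
# into a destination directory and print the top-level entries it contains.
extract_tutorial() {
    tarball=$1
    dest=${2:-.}
    if [ ! -f "$tarball" ]; then
        echo "missing tarball: $tarball" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    tar xzf "$tarball" -C "$dest" || return 1
    # Show which top-level directory was created (e.g. rmw_tutorial).
    tar tzf "$tarball" | cut -d/ -f1 | sort -u
}

# Usage, commented out so the sketch has no side effects:
# extract_tutorial raremetalworker.tutorial.tgz "$HOME/tutorials"
# extract_tutorial raremetal.tutorial.tgz "$HOME/tutorials"
```

Checking for the tarball first gives a clearer message than the raw tar error when the download was saved somewhere else.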
STEP 2: Run RareMetalWorker on Individual Studies
- The first example has 743 individuals coded as unrelated in the PED file (each person is the only member of their own family).
- The VCF file contains ~1000 markers.
- To analyze this sample while accounting for hidden relatedness, an empirical kinship matrix should be calculated from the genotypes (the --kinGeno option in the command below).
- The following command adjusts for covariates and inverse-normalizes the residuals on the fly.
$yourPath/bin/raremetalworker --ped $yourLocalPath/rmw_tutorial/inputfiles/example1.ped --dat $yourLocalPath/rmw_tutorial/inputfiles/example1.dat --vcf $yourLocalPath/rmw_tutorial/inputfiles/example1.vcf.gz --kinGeno --kinSave --traitName LDL --inverseNormal --makeResiduals --useCovariates --prefix $yourLocalPath/rmw_tutorial/outputfiles/example1
- The second sample can also be analyzed in the same fashion using the following command:
$yourPath/bin/raremetalworker --ped $yourLocalPath/rmw_tutorial/inputfiles/example2.ped --dat $yourLocalPath/rmw_tutorial/inputfiles/example2.dat --vcf $yourLocalPath/rmw_tutorial/inputfiles/example2.vcf.gz --kinGeno --kinSave --traitName LDL --inverseNormal --makeResiduals --useCovariates --prefix $yourLocalPath/rmw_tutorial/outputfiles/example2
- After the two runs have finished, you will see the following output files at the location given by --prefix:
example1.singlevar.score.txt
example1.singlevar.cov.txt
example2.singlevar.score.txt
example2.singlevar.cov.txt
- The output file ending with singlevar.score.txt includes summary statistics of single marker score tests.
- The output file ending with singlevar.cov.txt includes summary variance-covariance matrices of score statistics.
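The two RareMetalWorker commands in this step differ only in the study name, so they can be generated from one template. The following is a minimal dry-run sketch: it prints the commands instead of running them. `RMW_BIN`, `DATA_DIR`, and `rmw_command` are hypothetical names standing in for `$yourPath/bin/raremetalworker` and `$yourLocalPath/rmw_tutorial`; the flags are copied unchanged from the commands above.

```shell
#!/bin/sh
# Hypothetical locations; override them to match your own installation.
RMW_BIN=${RMW_BIN:-raremetalworker}
DATA_DIR=${DATA_DIR:-$HOME/rmw_tutorial}

# Print the RareMetalWorker command for one study.
# Flags are the ones used in the tutorial commands.
rmw_command() {
    study=$1
    input=$DATA_DIR/inputfiles/$study
    output=$DATA_DIR/outputfiles/$study
    printf '%s %s %s\n' \
        "$RMW_BIN --ped $input.ped --dat $input.dat --vcf $input.vcf.gz" \
        "--kinGeno --kinSave --traitName LDL --inverseNormal" \
        "--makeResiduals --useCovariates --prefix $output"
}

# Dry run: show the command for each example study.
for study in example1 example2; do
    rmw_command "$study"
done
```

After reviewing the printed commands, they can be executed by piping the output to `sh`, provided the paths contain no spaces.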
STEP 3: Run RareMETAL to do Meta Analysis
- RareMETAL requires a list of the studies to include in the meta-analysis.
- First, modify the example.studyname file so that RareMETAL can find the output files of RareMetalWorker.
cd $yourLocalPath/raremetal_tutorial/inputfiles
- Open example.studyname and change its contents to the following:
$yourLocalPath/rmw_tutorial/outputfiles/example1.LDL
$yourLocalPath/rmw_tutorial/outputfiles/example2.LDL
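Instead of editing the study list by hand, it can be written from the command line. This is a small sketch under the assumption that the file simply holds one study prefix per line, as in the listing above; `write_study_list` is a hypothetical helper name, and `$yourLocalPath` in the usage lines must be set to your own directory.

```shell
#!/bin/sh
# Hypothetical helper: write one study prefix per line to a list file,
# replacing any previous contents.
write_study_list() {
    list_file=$1
    shift
    if [ "$#" -lt 1 ]; then
        echo "write_study_list: no study prefixes given" >&2
        return 1
    fi
    printf '%s\n' "$@" > "$list_file"
}

# Usage, commented out so the sketch has no side effects:
# write_study_list "$yourLocalPath/raremetal_tutorial/inputfiles/example.studyname" \
#     "$yourLocalPath/rmw_tutorial/outputfiles/example1.LDL" \
#     "$yourLocalPath/rmw_tutorial/outputfiles/example2.LDL"
```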
- Gene-level meta-analysis requires annotation information or groups of variants. RareMETAL can read this information from a group file.
- An example group file is provided here:
$yourLocalPath/raremetal_tutorial/inputfiles/nonsyn.stop.splice.groupfile
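Before running a gene-level test it can be useful to see how many variants each group contains. The sketch below assumes, without confirmation from this tutorial, that each line of the group file holds a group name followed by whitespace-separated variant identifiers (for example CHR:POS:REF:ALT); check the RareMETAL documentation for the authoritative format. `count_group_variants` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: for each non-empty line of a group file, print the
# group name and the number of variants listed after it, tab-separated.
# ASSUMPTION: first field = group name, remaining fields = variants.
count_group_variants() {
    awk 'NF > 0 { print $1 "\t" (NF - 1) }' "$1"
}

# Usage, commented out so the sketch has no side effects:
# count_group_variants "$yourLocalPath/raremetal_tutorial/inputfiles/nonsyn.stop.splice.groupfile"
```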
- RareMETAL can also take an annotated VCF as input and parse variant grouping information from it. For details, refer to the software documentation: grouping from annotated VCF
- RareMETAL can filter the single variants included in the meta-analysis according to the QC information summarized by RareMetalWorker, including HWE p-value and genotype call rate.
- Finally, to meta-analyze the two samples above from their summary statistics, run the following command. It generates single-variant meta-analysis results and gene-level results from SKAT, the Madsen-Browning burden test, the simple burden test, and the variable-threshold burden test.
$yourPath/bin/raremetal --studyName $yourLocalPath/raremetal_tutorial/inputfiles/example.studyname --groupFile $yourLocalPath/raremetal_tutorial/inputfiles/nonsyn.stop.splice.groupfile --SKAT --VT --burden --MB --maf 0.05 --hwe 1.0e-05 --callRate 0.95 --prefix $yourLocalPath/raremetal_tutorial/results/
- To generate more detailed output and a report of hits, use the following command instead:
$yourPath/bin/raremetal --studyName $yourLocalPath/raremetal_tutorial/inputfiles/example.studyname --groupFile $yourLocalPath/raremetal_tutorial/inputfiles/nonsyn.stop.splice.groupfile --SKAT --VT --burden --MB --maf 0.05 --hwe 1.0e-05 --callRate 0.95 --longOutput --tabulateHits --hitsCutoff 1.0e-05 --prefix $yourLocalPath/raremetal_tutorial/results/
- For a detailed description of the output format, refer to the documentation: RareMETAL Results
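The two RareMETAL invocations in this step share every option except the reporting flags, so one template can produce both. The following dry-run sketch prints the commands rather than executing them. `RAREMETAL_BIN`, `TUTORIAL_DIR`, `MAF`, `HWE`, `CALL_RATE`, and `raremetal_command` are hypothetical names introduced for illustration; the flags and default thresholds are the ones used in the tutorial commands.

```shell
#!/bin/sh
# Hypothetical locations and thresholds; override them as needed.
RAREMETAL_BIN=${RAREMETAL_BIN:-raremetal}
TUTORIAL_DIR=${TUTORIAL_DIR:-$HOME/raremetal_tutorial}
MAF=${MAF:-0.05}
HWE=${HWE:-1.0e-05}
CALL_RATE=${CALL_RATE:-0.95}

# Print the RareMETAL command. An optional first argument holds extra
# flags that are placed before --prefix.
raremetal_command() {
    extra=${1:-}
    printf '%s %s %s %s\n' \
        "$RAREMETAL_BIN --studyName $TUTORIAL_DIR/inputfiles/example.studyname" \
        "--groupFile $TUTORIAL_DIR/inputfiles/nonsyn.stop.splice.groupfile" \
        "--SKAT --VT --burden --MB --maf $MAF --hwe $HWE --callRate $CALL_RATE" \
        "${extra:+$extra }--prefix $TUTORIAL_DIR/results/"
}

# Dry run: the basic command, then the variant with detailed output and hits.
raremetal_command
raremetal_command "--longOutput --tabulateHits --hitsCutoff 1.0e-05"
```

Keeping the thresholds in variables makes it easy to rerun the analysis with different QC filters without retyping the full command.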