Changes

From Genome Analysis Wiki

RAREMETAL Documentation

2,794 bytes removed, 13:22, 20 May 2019
Update contact address
[[Category:RAREMETAL]]
== Useful Wiki Pages ==
There are several pages in this Wiki that may be useful to RAREMETAL users. Here are links to key pages:
* GitHub page: https://github.com/statgen/Raremetal
* The [[RAREMETAL_Change_Log | Change Log]]
* The [[RAREMETAL_DOWNLOAD_%26_BUILD | DOWNLOAD page]]
* The [[Tutorial:_RAREMETAL|RAREMETAL Quick Start Tutorial]]
* The [[RAREMETAL METHOD]]
* The [[RAREMETAL FAQ]]
The [http://genome.sph.umich.edu/wiki/Rvtests '''rvtests'''] tool for rare-variant association analysis can also generate output compatible with RAREMETAL.
 
== Brief Description ==
'''RAREMETAL''' is a computationally efficient tool for meta-analysis of rare variants using sequencing or genotyping array data. It takes summary statistics and LD matrices generated by [[Rare-Metal-Worker|'''RAREMETALWORKER''']] or [http://genome.sph.umich.edu/wiki/Rvtests '''rvtests'''], handles related and unrelated individuals, and supports both single variant and burden meta-analysis. It generates high quality plots by default and has options that allow users to build reports at different levels.
'''RAREMETAL''' is developed by Shuang Feng, Dajiang Liu and Gonçalo Abecasis. An R package written by Dajiang Liu using the same methodology is [[RareMetals|'''available''']].
 
== Key Features ==
'''RAREMETAL''' has the following features:
* Performs gene-based or region-based meta-analysis using burden tests with the following methods: CMC_counts, Madsen-Browning, SKAT, and Variable Threshold.
* Performs single variant meta-analysis by default.
* Allows customized groups of variants to be tested.
* Allows conditional analysis to be performed in both gene-level meta-analysis and single variant meta-analysis.
* Generates QQ plots and Manhattan plots by default.
== Approach ==
The key idea behind meta-analysis with RAREMETAL is that various gene-level test statistics can be reconstructed from single variant score statistics and that, when the linkage disequilibrium relationships between variants are known, the distribution of these gene-level statistics can be derived and used to evaluate signifi-cance. Single variant statistics are calculated using the Cochran-Mantel-Haenszel method. The main formulae are tabulated Our method has been published in the following[http{| border="1" cellpadding="5" cellspacing="0" align="center"|+'''Formulae for RAREMETAL'''! scope="col" width="120pt" | Test! scope="col" width="50pt" | Statistics! scope="col" width="225pt" | Null Distribution! scope="col" width="225pt" | Notation|-| Single Variant || <math>T=\sum_{i=1}^n {U_i}\bigg/\sqrt{\sum_{i=1}^n{V_i}}</math> || <math>T\sim\mathbf{N}(0,1)<www.nature.com/math> ||<math> U_i \text{ is the score statistic from study }i;</math><math> V_i \text{ is the variance of } U_i.<ng/math>|-| un-weighted Burden || <math>T_b=\sum_{i=1}^n{\mathbf{U_i}}\Bigjournal/\sqrt{\sum_{i=1}^n{\mathbf{V_i}}}<v46/math> || <math>T_b\sim\mathbf{N}(0,1)<n2/math> ||<math> \mathbf{U_i}\text{ is the vector of score statistics from study }i, or <abs/math> <math> \mathbf{U_i}=\{U_{i1},ng.2852.html '''Liu et.,U_{im}\};</math> <math>\mathbf{V_i} \text{ is the covariance of } \mathbf{U_i}al'''] in Nature Genetics.<Please go to [http:/math>|-| Weighted Burden || <math>T_{wb}=\mathbf{w^T}\sum_{i=1}^n{\mathbf{U_i}}\bigg/\sqrt{\mathbf{w^T}\left(\sum_{i=1}^n{\mathbf{V_i}}\right)\mathbf{w}}</math> || <math>T_{wb}\sim\mathbf{N}(0,1)</math> || <math> \mathbf{w^T}=\{w_1,w_2,genome.sph.umich.,w_m\}^T \text{ is the weight vector.}<edu/math>|-style="height: 50pt;"| VT || <math>T_{VT}=\max(T_{b\left(f_1\right)},T_{b\left(f_2\right)},\dots,T_{b\left(f_m\right)}),\text{ 
where}<wiki/math><math>T_{b\left(f_j\right)}=\boldsymbol{\phi}_{f_j}^\mathbf{T}\sum_{i=1}^n{\mathbf{U_i}}\bigg/\sqrt{\boldsymbol{\phi}_{f_j}^\mathbf{T}\left(\sum_{i=1}^n{\mathbf{V_i}}\right)\boldsymbol{\phi}_{f_j}} </math> ||<math> \left(T_{b\left(f_1\right)},T_{b\left(f_2\right)},\dots,T_{b\left(f_m\right)}\right)</math><math>\sim\mathbf{MVN}\left(\mathbf{0},\boldsymbol{\Omega}\right)\text{,} </math><math>\text{where }\boldsymbol{\Omega_{ij}}=\frac{\boldsymbol{\phi}_{f_i}^T\left(\sum_{i=1}^n{\mathbf{V_i}}\right)\boldsymbol{\phi}_{f_j}}{\sqrt{\boldsymbol{\phi}_{f_i}^T\left(\sum_{i=1}^n{\mathbf{V_i}}\right)\boldsymbol{\phi}_{f_i}}\sqrt{\boldsymbol{\phi}_{f_j}^T\left(\sum_{i=1}^n{\mathbf{V_i}}\right)\boldsymbol{\phi}_{f_j}}}</math> || <math> \boldsymbol{\phi}_{f_j}\text{ is a vector of } 0 \text{s and } 1\text{s,} </math> <math>\text{indicating the inclusion of a variant using threshold }f_j; </math> |-| SKAT || <math>\mathbf{Q}=\left(\sum_{i=1}^n{\mathbf{U_i^T}}\right) \mathbf{W}\left(\sum_{i=1}^n{\mathbf{U_i}}\right)</math> ||<math>\mathbf{Q}\sim\sum_{i=1}^m{\lambda_i\chi_{1,i}^2},\text{ where}</math> <math>\left(\lambda_1,\lambda_2,\dots,\lambda_m\right)\text{ are eigen values of}</math><math>\left(\sum_{i=1}^n{\mathbf{V_i}}\right)^\frac{1}{2}\mathbf{W}\left(\sum_{i=1}^n{\mathbf{V_i}}\right)^\frac{1}{2}</math> || <math>\mathbf{W}\text{ is a diagonal matrix of weightsRAREMETAL_method '''method'''] for details.}</math>|}
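The single variant, weighted burden and VT statistics can be illustrated with a small Python sketch. It is not RAREMETAL's implementation. The function names and toy numbers are invented for illustration, and the rule that a variant is included at threshold f when its frequency is at most f is an assumption. Only the test statistics are computed; the VT p-value, which needs the multivariate normal distribution, is not.

```python
import math


def single_variant_z(scores, variances):
    """T = sum_i U_i / sqrt(sum_i V_i) for one variant across studies."""
    return sum(scores) / math.sqrt(sum(variances))


def weighted_burden_z(weights, score_vectors, cov_matrices):
    """T_wb = w^T sum_i U_i / sqrt(w^T (sum_i V_i) w)."""
    m = len(weights)
    # Pool the score vectors and covariance matrices across studies.
    u = [sum(study[j] for study in score_vectors) for j in range(m)]
    v = [[sum(study[j][k] for study in cov_matrices) for k in range(m)]
         for j in range(m)]
    numerator = sum(weights[j] * u[j] for j in range(m))
    variance = sum(weights[j] * v[j][k] * weights[k]
                   for j in range(m) for k in range(m))
    return numerator / math.sqrt(variance)


def variable_threshold_stat(mafs, score_vectors, cov_matrices):
    """T_VT = max over thresholds f_j of the burden statistic with weights phi_{f_j}.

    Assumption: a variant is included at threshold f when its frequency <= f.
    """
    stats = []
    for f in sorted(set(mafs)):
        phi = [1.0 if maf <= f else 0.0 for maf in mafs]
        stats.append(weighted_burden_z(phi, score_vectors, cov_matrices))
    return max(stats)


# Toy inputs (invented): two studies, two variants.
U = [[1.0, 2.0], [3.0, 0.0]]
V = [[[2.0, 0.5], [0.5, 1.0]], [[2.0, 0.5], [0.5, 3.0]]]

z_first_variant = single_variant_z([U[0][0], U[1][0]], [V[0][0][0], V[1][0][0]])
z_burden = weighted_burden_z([1.0, 1.0], U, V)  # equal weights on both variants
z_vt = variable_threshold_stat([0.01, 0.03], U, V)
```

Setting every weight to 1 gives a burden statistic in which all variants in the group count equally.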
== Download and Installation ==
* University of Michigan CSG users can go to the following:
/net/fantasia/home/sfengsph/code/Rare-Metal/raremetal/bin/raremetal
 
=== Where to Download ===
* The software package for Linux and Mac (source code included) can be downloaded here: [[Media:Raremetal.0.4.4.tgz|'''RAREMETAL DOWNLOAD''']]
* Binary or executable can be downloaded here: [[Media:RAREMETAL.0.4.4_binary.tgz|'''RAREMETAL BINARY DOWNLOAD''']]
 
=== How to Compile ===
* Save it to your local path and decompress using the following command:
tar xvzf raremetal.0.4.4.tar.gz
* Go to raremetal_0.4.4/raremetal/src and type the following command to compile:
make
We have tested compilation using our source code on several platforms including Linux and Mac OS X.
For source code and executables together with instructions for building from source, please go to [[RAREMETAL_DOWNLOAD_%26_BUILD |'''DOWNLOAD source and executables''']].
For questions about compilation, please go to [[RAREMETAL_FAQ | '''FAQ''']] or [http://genome.sph.umich.edu/wiki/COMPILE_RAREMETAL_Q%26A#A.3 '''compiling and debugging Q & A'''] for more information.
=== How to Execute ===
* Go to raremetal_0.4.4/raremetal/bin and use the following:
 ./raremetal
* For example usage, please refer to [http://genome.sph.umich.edu/wiki/Rare-Metal#Example_Usage example command lines].
== Basic Usage Instructions ==
=====Summary Statistics=====
Files containing summary statistics and LD matrices generated by '''RAREMETALWORKER''' should be compressed and [http://samtools.sourceforge.net/tabix.shtml '''tabix'''] indexed using the following commands (note that if --zip is specified in RAREMETALWORKER, these .gz and .tbi files will be generated automatically):
bgzip study1.singlevar.score.txt
* The above example study name file guides '''RAREMETAL''' to look for summary statistics from the TwinsUK study only, because the "HUNT" study is commented out. The following two files, together with their tabix index files, are needed for '''RAREMETAL''' to perform further analysis.
 
* Please specify the --dosage option if input files were generated from dosage instead of genotype.
=====Group Rare Variants=====
==== Association Options====
* Currently, CMC type burden test, Madsen-Browning burden test, Variable Threshold burden test and SKAT are provided in '''RAREMETAL''', by specifying --burden, --MB, --VT and --SKAT.
* --maf specifies the minor allele frequency cutoff when doing gene-based or group-based burden tests. Variants with maf '''above''' this threshold will be ignored. The default is maf<0.05.
* In '''a single study''' of sample size N, if a site is monomorphic or not reported in vcf/ped, it is considered that the sample size of this study is not large enough to sample the rare allele. Thus, this study contributes 2*N reference alleles and 0 alternative alleles towards meta-analysis. To let such studies contribute no alleles towards pooled allele frequency, specify --altMAF.
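The arithmetic behind the pooled allele frequency and the --maf cutoff can be sketched in Python as follows. This is an illustration only, not RAREMETAL's code. The function names and toy counts are invented, and treating a frequency exactly equal to the cutoff as excluded is an assumption.

```python
def pooled_alt_frequency(studies, alt_maf=False):
    """Pool the alternative allele frequency across studies.

    studies: list of (sample_size, alt_allele_count) pairs, where a count of
    None marks a site that is monomorphic or not reported in that study.
    alt_maf mirrors the behaviour described for --altMAF.
    """
    alt_total = 0
    allele_total = 0
    for sample_size, alt_count in studies:
        if alt_count is None:
            if alt_maf:
                continue  # the study contributes no alleles at all
            alt_count = 0  # default: 2*N reference alleles, 0 alternative
        alt_total += alt_count
        allele_total += 2 * sample_size
    return alt_total / allele_total if allele_total else 0.0


def passes_maf_cutoff(alt_freq, cutoff=0.05):
    """Keep a variant for burden tests when its minor allele frequency < cutoff."""
    maf = min(alt_freq, 1.0 - alt_freq)
    return maf < cutoff


# Toy example (invented): the site is seen in study 1 and absent from study 2.
studies = [(1000, 120), (1000, None)]
freq_default = pooled_alt_frequency(studies)               # 120 / 4000
freq_alt_maf = pooled_alt_frequency(studies, alt_maf=True)  # 120 / 2000
```

In this toy case the variant passes the default 0.05 cutoff under the default pooling (0.03) but is dropped when the second study contributes no alleles (0.06).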
==== Conditional Analysis====
== Additional Analysis Options ==
 
=== Generate a VCF File to Annotate Outside RAREMETAL ===
* --writeVCF allows users to write a VCF file including pooled single variants from all studies. Users can then annotate the VCF file with their favorite annotation tool and use the annotated file as input for '''RAREMETAL''' for further gene-based or region-based meta-analysis.
* The output VCF file will be named yourPrefix.pooled.variants.vcf. An example output VCF file is shown below:
#CHROM POS ID REF ALT QUAL FILTER INFO
1 115658497 115658497 G A . . ALT_AF=0.380906;
2 74688884 74688884 G A . . ALT_AF=8.33611e-05;
3 121414217 121414217 C A . . ALT_AF=0.0747833;
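As a rough illustration, the pooled frequencies in such a file could be read back with a few lines of Python. The function name is hypothetical, and the sketch assumes only the column layout and the ALT_AF key shown in the example above.

```python
def read_pooled_alt_af(lines):
    """Map (chrom, pos, ref, alt) to the ALT_AF value from the INFO column."""
    freqs = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header
        fields = line.split()  # accepts tabs or spaces
        chrom, pos, ref, alt, info = (
            fields[0], int(fields[1]), fields[3], fields[4], fields[7])
        for entry in info.split(";"):
            if entry.startswith("ALT_AF="):
                freqs[(chrom, pos, ref, alt)] = float(entry[len("ALT_AF="):])
    return freqs


# The example records shown above.
example = """#CHROM POS ID REF ALT QUAL FILTER INFO
1 115658497 115658497 G A . . ALT_AF=0.380906;
2 74688884 74688884 G A . . ALT_AF=8.33611e-05;
3 121414217 121414217 C A . . ALT_AF=0.0747833;
""".splitlines()

pooled = read_pooled_alt_af(example)
```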
===Annotation===
* RAREMETAL automatically recognizes the annotation format generated by [[TabAnno | '''ANNO''']] or [[EPACTS#Annotating_VCF_file_using_EPACTS | '''EPACTS''']].
* To annotate the VCF generated in the previous step, you can use the following command:
./anno --in your.in.vcf.gz --out your.out.vcf.gz
=== Group Rare Variants from Annotated VCF ===
* The annotated VCF file should be specified using the --annotatedVcf option.
* --annotation should be used together with --annotatedVcf when a specific category of functional variants is to be grouped. For example, to group nonsynonymous and splicing variants, include the following in the command line:
 --annotatedVcf your.annotated.vcf --annotation nonsyn/splicing
* (! only available after v4.13.8) When --annotation is not specified, raremetal groups all non-intergenic variants.
* Notice that each variant is allowed to have more than one annotation, but each annotation should start with a new key "ANNO=" followed by annotation:genename:other transcript information.
* The generated group file will be named test.groupfile under your running directory.
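The grouping idea can be sketched in Python as below. This is only an illustration of the "ANNO=annotation:genename:other transcript information" layout described above, not RAREMETAL's parser. The function names, annotation labels, gene names and variant identifiers are hypothetical, and so are the assumptions that INFO entries are separated by ";" and that a requested category such as nonsyn matches any label starting with that text.

```python
def parse_annotations(info):
    """Return (annotation, gene) pairs for every ANNO= entry in an INFO string."""
    pairs = []
    for entry in info.split(";"):
        if entry.startswith("ANNO="):
            parts = entry[len("ANNO="):].split(":", 2)
            gene = parts[1] if len(parts) > 1 else ""
            pairs.append((parts[0], gene))
    return pairs


def group_by_gene(records, categories):
    """Group variant IDs by gene, keeping only the requested categories.

    records: iterable of (variant_id, info) pairs.
    categories: string such as "nonsyn/splicing".
    """
    wanted = [c.lower() for c in categories.split("/") if c]
    groups = {}
    for variant_id, info in records:
        for annotation, gene in parse_annotations(info):
            matches = any(annotation.lower().startswith(w) for w in wanted)
            if matches and gene:
                members = groups.setdefault(gene, [])
                if variant_id not in members:
                    members.append(variant_id)
    return groups


# Toy records (invented labels, genes and positions).
records = [
    ("1:100", "ANNO=Nonsynonymous:GENE_A:tx1"),
    ("1:200", "ANNO=Splicing:GENE_A:tx1;ANNO=Intron:GENE_B:tx2"),
    ("2:300", "ANNO=Nonsynonymous:GENE_B:tx2"),
    ("3:400", "ANNO=Intergenic"),
]
groups = group_by_gene(records, "nonsyn/splicing")
```

A variant with several ANNO= entries can land in more than one gene group, which is why each entry is checked separately.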
===Options for Report Generation===
--tabulateHits [false]
--hitsCutoff [1e-06]
--dosage [false]
--altMAF [false]
==Example Command lines==
==CONTACT==
Please email Andy Boughton (abought at umich dot edu) for questions.
Also check [[Raremetal Incoming updates | '''Known issues and incoming update in next version''']] to see if your problem has been reported before.
== Change Log ==
* Version 0.0.1 released to U of M CSG group. (2/13/2013)
* Version 0.0.1 released. (2/24/2013)
* Version 0.1.2 released after fixing a few bugs, adding conditional analysis and automatic graphing to the tool. (8/5/2013)
* Version 0.2.9 released after fixing a bug in SKAT and writing PDF when all variants are monomorphic. (10/7/2013)
* Version 0.3.1 released to fix a bug when one of the alleles is coded as missing.
* Version 0.4.0 released with a few bugs fixed to properly handle missing genotypes. Major change in command options: users can now specify lists of summary statistics files and covariance files separately using the --summaryFiles and --covFiles options.
* Version 0.4.2 released with a bug fixed for SKAT when there are two variants in a group, and a bug fixed in Makefile for easy compiling.
* Version 0.4.4 released with a bug fixed when alleles are flipped in group file. (3/14/2014)