RvTests


Overview

(See [rvtests] for more powerful rare-variant association test software and/or preparation for meta-analysis)

A few rare-variant tests (Li-Leal's CMC and Madsen-Browning's weighted method) are implemented in C++ within a logistic regression framework. Please contact Youna Hu (youna@umich.edu) with comments, suggestions, or questions.

The source code is located at File:RV3Tests.v1.tar

Download the tar file, extract it, go to the RV3Test.v1 folder, and type make all to compile the code. The binary file will then be in the executables folder.

Example

See a detailed example here.

Syntax

This software uses a command-line interface with the following options:

RARE VARIANT ANALYSIS OPTIONS:

                GENOTYPE : --genofile [pos.012],
                           --geneList [outGeneSorted.txt], --cutoff [0.010],
                           --collapseChoice [or]
               PHENOTYPE : --phenofile [LDL.y.ID]
              COVARIATES : --covConsider, --covfile [covFile.ID.2.txt]
             PERMUTATION : --nPermute [10], --PermutationSeed [1]
  GENE LEVEL TEST RESULT : --geneGlobalTestOut [globalPermuteSummary.txt],
                           --geneTestpvalueFile [geneTestPvalues.txt]


GENOTYPE
--genofile
A genotype 012 matrix (the .012 file). This file can be prepared with prepare012s, whose source code is at File:VcfReader.v1.tar.
Again, extract the tar file, go into the directory, and type make all to compile the code. The binary file will then be in the executables folder.
Note: You should first run Yanming's vcf annotation [1] on your vcf file to output an annotated vcf file. You SHOULD keep the log file from the annotation, which will be used to create the gene list.

Data File PREPARATION

         Input files : --vcf [LDL.test.vcf], --log [], --IDfile []
  Subsetting choices : --All
        Output files : --outputPrefix [subsetGeno],
                       --outputGeneList [LDL.geneList.txt]
  --vcf: Input vcf file.
  --log: The log file from Yanming's annotation output. This log is used to obtain the gene list.
  --IDfile: Specifies a file with one column of subject IDs to subset from the vcf file.
   If it is not specified, all subjects are included in the format conversion.
  --All: Specify 1 to include all variants, or 0 to include only nonsynonymous and stop-annotated variants.
  --outputPrefix: Specify the prefix for the four output files, which will be used in rvTests:
  *.012: A genotype matrix with subjects as rows and variant sites as columns.
  *.012.pos: Chromosome and position numbers.
  *.012.indv: Subject IDs.
  *.012.frq: The frequencies of the included variants.
  --outputGeneList: Specify a file to store the gene list, which will be used in rvTests.
  The list file looks like this:
 1	OR4F5	69090	70008
 1	SAMD11	860529	871276
 1	NOC2L	879583	893918
 1	KLHL17	895966	901095
 1	PLEKHN1	901876	910482
 1	C1orf170	910578	912021
--geneList
This file is the output of prepare012s with the option --outputGeneList. Its columns are chromosome number, gene name, start position, and end position. The file should have no header.

THE CHROMOSOME NUMBERS SHOULD BE NUMERIC! Use 1 for chromosome 1; DO NOT USE chr1.
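The gene list format described above can be checked before running the analysis. The following Python sketch is not part of the software; the names Gene and parse_gene_list are hypothetical, and it assumes the columns are separated by whitespace (tabs or spaces). It rejects non-numeric chromosome values such as chr1, which also catches an accidental header line.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    chrom: int
    name: str
    start: int
    end: int


def parse_gene_list(text: str) -> List[Gene]:
    """Parse a header-less gene list: chromosome, gene name, start, end."""
    genes = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()  # assumption: whitespace-separated columns
        if len(fields) != 4:
            raise ValueError(f"line {line_no}: expected 4 columns, got {len(fields)}")
        chrom, name, start, end = fields
        if not chrom.isdigit():
            # "chr1" or a header word ends up here
            raise ValueError(f"line {line_no}: chromosome must be numeric, got {chrom!r}")
        genes.append(Gene(int(chrom), name, int(start), int(end)))
    return genes


example = "1\tOR4F5\t69090\t70008\n1\tSAMD11\t860529\t871276\n"
genes = parse_gene_list(example)
print(genes[0].name, genes[0].start, genes[0].end)
```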

--cutoff
This is the minor allele frequency cutoff used to select rare variants. You can specify it as 0.01, 0.05, etc.
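To illustrate what a minor allele frequency cutoff does, here is a small Python sketch that computes frequencies from a 012 matrix (subjects as rows, variants as columns) and keeps the columns below the cutoff. It is only an illustration: the function names are hypothetical, it assumes no missing genotypes, and it assumes a strict "less than" comparison, which may differ from what the software does.

```python
def minor_allele_frequencies(genotypes):
    """genotypes: list of rows (one per subject) of 0/1/2 allele counts."""
    n_subjects = len(genotypes)
    n_variants = len(genotypes[0])
    mafs = []
    for j in range(n_variants):
        # frequency of the counted allele among 2 * n_subjects chromosomes
        freq = sum(row[j] for row in genotypes) / (2 * n_subjects)
        # fold so that the value always refers to the less common allele
        mafs.append(min(freq, 1 - freq))
    return mafs


def rare_variant_columns(genotypes, cutoff=0.01):
    """Column indices whose minor allele frequency is below the cutoff."""
    # assumption: strict '<'; the software may use '<=' instead
    mafs = minor_allele_frequencies(genotypes)
    return [j for j, maf in enumerate(mafs) if maf < cutoff]


genotypes = [
    [0, 1, 2],
    [0, 1, 2],
    [1, 1, 2],
    [0, 1, 1],
]
print(minor_allele_frequencies(genotypes))
print(rare_variant_columns(genotypes, cutoff=0.2))
```

With four subjects the three columns have minor allele frequencies 0.125, 0.5 and 0.125, so a cutoff of 0.2 keeps the first and third columns.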
--collapseChoice
Specify one of {or, sum, wt}. or: Li-Leal's CMC test. sum: use the number of rare variants for each subject as the score. wt: Madsen-Browning's weighted rare variant score.
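The three collapsing choices can be sketched in Python as per-subject scores computed over the rare variants of one gene. This is a sketch under stated assumptions, not the software's C++ code: the function names are hypothetical, "sum" is taken here as the total count of rare alleles (counting sites carried instead is another possible reading), the weights follow the published Madsen-Browning formula, and phenotype 0 is assumed to mean unaffected.

```python
import math


def collapse_or(row):
    """CMC-style indicator: 1 if the subject carries any rare allele."""
    return 1 if any(g > 0 for g in row) else 0


def collapse_sum(row):
    """Total number of rare alleles carried by the subject (assumed reading)."""
    return sum(row)


def madsen_browning_weights(genotypes, phenotypes):
    """w_j = sqrt(n * q_j * (1 - q_j)), with q_j estimated in unaffected subjects."""
    n_total = len(genotypes)
    # assumption: phenotype 0 marks unaffected subjects (controls)
    controls = [row for row, y in zip(genotypes, phenotypes) if y == 0]
    n_controls = len(controls)
    weights = []
    for j in range(len(genotypes[0])):
        m_u = sum(row[j] for row in controls)   # rare alleles seen in controls
        q = (m_u + 1) / (2 * n_controls + 2)    # smoothed frequency, never 0 or 1
        weights.append(math.sqrt(n_total * q * (1 - q)))
    return weights


def collapse_wt(row, weights):
    """Weighted score: rarer variants (smaller weights) contribute more."""
    return sum(g / w for g, w in zip(row, weights))


genotypes = [[0, 1], [1, 0], [0, 0], [2, 0]]
phenotypes = [1, 1, 0, 0]
weights = madsen_browning_weights(genotypes, phenotypes)
for row in genotypes:
    print(collapse_or(row), collapse_sum(row), round(collapse_wt(row, weights), 3))
```

In each case the resulting score would then enter the regression model as the genetic predictor for that gene.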
PHENOTYPE
--phenofile
A file where the first column is subject ID and the second column is phenotype (0 or 1).
COVARIATES
--covConsider
Default = 0: no covariates are considered. Set to 1 to include covariates.
--covfile
Covariate file with subject ID in the first column and the covariates to be included in the model in the remaining columns.
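Because the phenotype and covariate files are both keyed by subject ID in the first column, it can help to check that they line up before running the tests. The Python sketch below is not part of the software; read_table and align are hypothetical names, and it assumes whitespace-separated files without header lines and numeric covariates.

```python
def read_table(text):
    """Map subject ID -> list of numeric values from the remaining columns."""
    table = {}
    for line in text.splitlines():
        fields = line.split()  # assumption: whitespace-separated, no header
        if fields:
            table[fields[0]] = [float(v) for v in fields[1:]]
    return table


def align(pheno_text, cov_text):
    """Return IDs, phenotypes, and covariate rows for subjects present in both files."""
    pheno = read_table(pheno_text)
    cov = read_table(cov_text)
    ids = [sid for sid in pheno if sid in cov]  # keep phenotype-file order
    for sid in ids:
        if pheno[sid][0] not in (0.0, 1.0):
            raise ValueError(f"subject {sid}: phenotype must be 0 or 1")
    y = [int(pheno[sid][0]) for sid in ids]
    x = [cov[sid] for sid in ids]
    return ids, y, x


pheno_text = "S1 1\nS2 0\nS3 1\n"
cov_text = "S2 45 1\nS1 50 0\n"
ids, y, x = align(pheno_text, cov_text)
print(ids, y, x)
```

Here S3 has a phenotype but no covariates, so it is dropped from the aligned output.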
PERMUTATION
--nPermute
Number of permutations used to evaluate p values.
--PermutationSeed
Default = 1. It can be changed to another number.
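The idea behind a seeded permutation p value can be sketched in Python as follows. This is an illustration only: the test statistic used here (absolute difference in mean score between cases and controls) is a simple stand-in, not the logistic regression statistic of the software, and the function name and the (hits + 1) / (n + 1) estimate are assumptions.

```python
import random


def permutation_p_value(scores, phenotypes, n_permute=10, seed=1):
    """Estimate a p value by shuffling phenotype labels n_permute times."""

    def statistic(labels):
        # stand-in statistic: |mean score in cases - mean score in controls|
        cases = [s for s, y in zip(scores, labels) if y == 1]
        controls = [s for s, y in zip(scores, labels) if y == 0]
        return abs(sum(cases) / len(cases) - sum(controls) / len(controls))

    rng = random.Random(seed)  # same seed -> same sequence of permutations
    observed = statistic(phenotypes)
    labels = list(phenotypes)
    hits = 0
    for _ in range(n_permute):
        rng.shuffle(labels)  # shuffling keeps the case/control counts fixed
        if statistic(labels) >= observed:
            hits += 1
    # add-one estimate, so the p value is never exactly 0
    return (hits + 1) / (n_permute + 1)


scores = [1, 1, 1, 0, 0, 0]
phenotypes = [1, 1, 1, 0, 0, 0]
p = permutation_p_value(scores, phenotypes, n_permute=10, seed=1)
print(p)
```

With this estimate, the smallest attainable p value is 1 / (n_permute + 1), so more permutations are needed to resolve small p values.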
GENE LEVEL TEST RESULT
--geneGlobalTestOut
This file stores, for each permutation, the 5% and 95% quantiles of the p values across all genes.
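As an illustration of this summary, the Python sketch below computes the 5% and 95% quantiles of a set of gene p values for each permutation. The function names are hypothetical, and the quantile definition (linear interpolation between order statistics) is an assumption; the software may use a different one.

```python
def quantile(values, q):
    """Quantile by linear interpolation between sorted values (assumed definition)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1 - frac) + ordered[upper] * frac


def summarize_permutations(p_values_by_permutation):
    """One (5% quantile, 95% quantile) pair per permutation."""
    return [(quantile(p, 0.05), quantile(p, 0.95)) for p in p_values_by_permutation]


# rows: permutations; columns: p values of the genes in that permutation
p_values_by_permutation = [
    [0.0, 0.25, 0.5, 0.75, 1.0],
    [0.5, 0.5, 0.5, 0.5, 0.5],
]
summary = summarize_permutations(p_values_by_permutation)
for low, high in summary:
    print(round(low, 3), round(high, 3))
```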
--geneTestpvalueFile
This file gives the gene name, number of rare variants, counts of variants in cases and controls, and p values from the RV test specified by --collapseChoice.