Changes

From Genome Analysis Wiki
1,075 bytes added ,  10:34, 21 February 2017
Line 8: Line 8:  
== Brief Description ==
   −
'''famrvtest''' is a computationally efficient tool for family-based association analyses of rare variants using sequencing or genotyping array data. '''famrvtest''' supports both single variant and gene-level associations.  
+
'''famrvtest''' is a computationally efficient tool for family-based rare variant association analyses using genotyping array or sequencing data. '''famrvtest''' supports both single variant and gene-level associations.  
   −
For any questions, please contact Shuang Feng (sfengsph at umich.edu) or Gonçalo Abecasis (goncalo at umich.edu).
+
For any questions, please contact [[Shuang_Feng |Shuang Feng]] (sfengsph at umich.edu) or [[Goncalo_Abecasis|Gonçalo Abecasis]] (goncalo at umich.edu).
    
== Download and Installation ==
Line 19: Line 19:  
* Source code can be downloaded from the following links:

   −
   [[Media:LINUX_famRvTest.2.0.tgz|Source for '''LINUX''']]
+
   [[Media:LINUX_famrvtest.2.4.tgz|Source for '''LINUX''']]
   −
   [[Media:MAC_famRvTest.2.0.tgz|Source for '''MAC''']]
+
   [[Media:MAC_famrvtest.2.4.tgz|Source for '''MAC''']]
   −
   [[Media:MINGW_famRvTest.2.0.tgz|Source for '''MINGW''']]
+
   [[Media:MINGW_famrvtest.2.4.tgz|Source for '''MINGW''']]
   −
   [[Media:CYGWIN64_famRvTest.2.0.tgz|Source for '''CYGWIN64''']]
+
   [[Media:CYGWIN64_famrvtest.2.4.tgz|Source for '''CYGWIN64''']]
    
* Executable can be downloaded from the following link:
   −
   [[Media:Linux_binary.tar.gz |Executable for '''LINUX''']]
+
   [[Media:Famrvtest.2.4.linux.executable.tgz |Executable for '''LINUX''']]
    
=== How to Compile ===
 
* Save it to your local path and decompress using the following command:
   −
   tar xvzf Linux_famRvTest.2.0.tgz
+
   tar xvzf LINUX_famrvtest.2.4.tgz
 
* Go to the famrvtest directory and type the following command to compile:
   −
   make -j 5
+
   make
    
=== How to Execute ===
Line 45: Line 45:     
== Input Files ==
   −
famRvTest needs the following files as input: PED and DAT file in Merlin format, '''AND/OR''' a VCF file. When genotypes are stored in PED and DAT file, the VCF file is not needed. However, even if genotypes are saved in a VCF file, PED and DAT files are still needed for carrying covariate and trait information.
+
famrvtest needs the following files as input: PED and DAT files in Merlin format, '''AND/OR''' a VCF file. When genotypes are stored in the PED and DAT files, the VCF file is not needed. However, even if genotypes are saved in a VCF file, PED and DAT files are still needed to carry covariate and trait information.
    
=== PED and DAT Files ===
 
* When the PED file contains genotypes, there is no need for a VCF file as input.
   −
* '''famRvTest''' takes PED/DAT file in [http://www.sph.umich.edu/csg/abecasis/Merlin/index.html|'''Merlin'''] format. Please refer to [http://www.sph.umich.edu/csg/abecasis/merlin/tour/input_files.html PED/DAT format description] for details.
+
* '''famrvtest''' takes PED/DAT file in [http://www.sph.umich.edu/csg/abecasis/Merlin/index.html '''Merlin'''] format. Please refer to [http://sph.umich.edu/csg/abecasis/merlin/tour/input_files.html PED/DAT format description] for details.
 
* An example PED file is shown below:
 
     1 1 0 0 1 1.5 1 23 A A A A A A A A A A
Line 76: Line 76:  
   tabix -p vcf -f input.vcf.gz  ## this command will generate input.vcf.gz.tbi
 
* Even when a VCF file is present, PED/DAT files are still needed for covariates and phenotypes.
 +
 +
=== Group File for Gene-level Tests ===
 +
* Grouping methods are only necessary for gene-level tests.
 +
* With the --groupFile option, you can specify a particular set of variants to be grouped for burden tests.
 +
* The group file must be a tab or space delimited file in the following format:
 +
  GROUP_ID MARKER1_ID MARKER2_ID MARKER3_ID ...
 +
* MARKER_ID must be in the following format:
 +
  CHR:POS:ALLELE1:ALLELE2
 +
* An example group file is:
 +
  PLEKHN1 1:901922:G:A    1:901923:C:A    1:902088:G:A    1:902128:C:T    1:902133:C:G    1:902176:C:T    1:905669:C:G       
 +
  HES4    1:934735:A:C    1:934770:G:A    1:934801:C:T    1:935085:G:A    1:935089:C:G
 +
* '''Version 2.4 and later allow variants from different chromosomes to be grouped for testing. This might be useful for pathway analysis.'''
 +
* '''Note: any variants whose alleles differ from those listed in the group file will be excluded from gene-level tests.'''
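
The group file format described above can be checked before starting a run. The following is a minimal Python sketch and is not part of famrvtest: the function name parse_group_file is hypothetical, and the validation rules (alleles restricted to A/C/G/T strings, chromosome labels without colons) are assumptions made for illustration only.

```python
import re

# Assumed marker pattern: CHR:POS:ALLELE1:ALLELE2, e.g. 1:901922:G:A.
# Restricting alleles to A/C/G/T strings is an illustrative assumption.
MARKER_RE = re.compile(r"^([^:\s]+):(\d+):([ACGT]+):([ACGT]+)$")


def parse_group_file(lines):
    """Map each GROUP_ID to a list of (chrom, pos, allele1, allele2)."""
    groups = {}
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()  # splits on tabs or spaces
        if not fields:
            continue  # skip blank lines
        group_id, marker_ids = fields[0], fields[1:]
        markers = []
        for marker_id in marker_ids:
            match = MARKER_RE.match(marker_id)
            if match is None:
                raise ValueError(f"line {line_no}: bad marker ID {marker_id!r}")
            chrom, pos, allele1, allele2 = match.groups()
            markers.append((chrom, int(pos), allele1, allele2))
        groups[group_id] = markers
    return groups


# Two shortened lines taken from the example group file above.
example = [
    "PLEKHN1 1:901922:G:A\t1:901923:C:A\t1:902088:G:A",
    "HES4    1:934735:A:C    1:934770:G:A",
]
groups = parse_group_file(example)
print(groups["HES4"])
```

Because the chromosome is kept per marker rather than per group, a group that mixes chromosomes (as allowed from version 2.4 on) parses the same way.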
    
== Example Command Line ==
 
===Single Variant Analysis===
 
The following command line runs a single variant association analysis of the trait "LDL" using the score test, after inverse normalization of the quantitative trait and adjustment for covariates. --traitName specifies the trait or traits you want to analyze in this batch. If this option is not used, all traits coded in the data file are analyzed. --SingleVarLRT provides essentially the same test as the merlin --fastAssoc option.
  −
  ./famRvTest -p your.ped -d your.dat --vcf your.vcf.gz --SingleVarScore --inverseNormal --useCovariates --traitName LDL
+
  ./famrvtest --ped your.ped --dat your.dat --vcf your.vcf.gz --SingleVarScore --inverseNormal --useCovariates --traitName LDL
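
To illustrate what inverse normalization of a quantitative trait means, here is a minimal Python sketch of a rank-based inverse normal transform. The exact formula used by --inverseNormal is not stated on this page, so the rank offset of 0.5, the averaging of tied ranks, and the function name inverse_normal_transform are all assumptions.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Rank-based inverse normal transform with average ranks for ties.

    The offset of 0.5 is an assumed convention, not famrvtest's documented one.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    # Map each rank to the matching standard normal quantile.
    return [standard_normal.inv_cdf((r - offset) / n) for r in ranks]


ldl = [130.0, 95.5, 160.2, 110.0, 145.8]  # made-up trait values
transformed = inverse_normal_transform(ldl)
print([round(z, 3) for z in transformed])
```

The transformed values keep the original ordering of the trait but follow an approximately standard normal distribution, which makes the score test less sensitive to outliers.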
  −
Furthermore, if you want to run the likelihood ratio test and the Wald test at the same time, the following command should do the work:
  −
./famRvTest -p your.ped -d your.dat --SingleVarScore --SingleVarLRT --SingleVarWald --inverseNormal --useCovariates --traitName LDL
      
All the above commands perform family-based association analysis using kinship matrices derived from the pedigree structure coded in the pedigree file. The following command line shows an example of using genotypes to estimate an empirical relationship matrix.
   −
   ./famRvTest -p your.ped -d your.dat --SingleVarScore --SingleVarLRT --SingleVarWald --inverseNormal --useCovariates --traitName LDL --kinGeno
+
   ./famrvtest --ped  your.ped --dat your.dat --SingleVarScore --inverseNormal --useCovariates --traitName LDL --kinPedigree
    
===Gene-level Association===
    
The following command line runs a gene-level association analysis of the genes listed in "your.genes.groupfile" for the trait "LDL" using SKAT, the Madsen-Browning weighted burden test, the un-weighted rare allele count burden test, the collapsing burden test, and the variable threshold test, after inverse normalization of the quantitative trait and adjustment for covariates. Only rare variants with MAF less than or equal to 0.05 and minor allele count greater than or equal to 3 are grouped.
  −
  ./famRvTest -p your.ped -d your.dat --SKAT --MB --CMC_counts --CMC_binary --VTasymptotic --inverseNormal --useCovariates --traitName LDL --groupFile your.genes.groupfile --maxMaf 0.05 --mac 3
+
  ./famrvtest --ped your.ped --dat your.dat --SKAT_BETA --MB --burden --VT --inverseNormal --useCovariates --traitName LDL --groupFile your.genes.groupfile --maf 0.05
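
The variant filter described above (MAF at most 0.05 and minor allele count at least 3) can be sketched in Python as follows. This is an illustration only, not famrvtest's implementation: it assumes diploid genotypes coded as alternate allele counts (0, 1, 2), counts alleles over all samples without accounting for relatedness, and uses the hypothetical helper names minor_allele_stats and passes_filter.

```python
def minor_allele_stats(genotypes):
    """Return (maf, mac) from per-sample alternate allele counts (0, 1, 2).

    Missing genotypes are given as None and ignored.
    """
    observed = [g for g in genotypes if g is not None]
    if not observed:
        return 0.0, 0
    alt_count = sum(observed)
    total_alleles = 2 * len(observed)  # diploid assumption
    # The minor allele is whichever allele is rarer in the sample.
    mac = min(alt_count, total_alleles - alt_count)
    return mac / total_alleles, mac


def passes_filter(genotypes, max_maf=0.05, min_mac=3):
    """Keep a variant only if maf <= max_maf and mac >= min_mac."""
    maf, mac = minor_allele_stats(genotypes)
    return maf <= max_maf and mac >= min_mac


rare = [1] * 3 + [0] * 97        # 3 heterozygotes among 100 samples
too_rare = [1] * 2 + [0] * 98    # minor allele count of 2
common = [1] * 30 + [0] * 70     # MAF of 0.15

print(passes_filter(rare), passes_filter(too_rare), passes_filter(common))
```

A family-aware tool may estimate allele frequencies differently (for example from founders only), so counts from this sketch need not match the tool's output exactly.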
    
== Change Log ==
    
* Released version 0.0.9, which fixes a potential compilation error. (10/10/2013)
   −
* Released version 2.0, a faster version and added single variant permutation test. (7/14/2014)
+
* Released version 2.0, a faster version that adds a family-based single variant permutation test. (7/14/2014)
 +
* Released version 2.2, which fixes a bug that prevented the single variant test from being run alone. (7/15/2014)
 +
* Uploaded a new source code package for version 2.2, with updated makefiles. (8/4/14)
 +
* Released version 2.3. Fixed a bug that caused a compilation error (the correct makefile was not found). (8/20/14)
 +
* Released version 2.4. Enabled analysis of pathways, where variants from different chromosomes can be grouped. (9/27/2014)