Changes

From Genome Analysis Wiki
3,060 bytes removed, 13:34, 20 November 2019
Line 1: Line 1:
   −
'''NOTICE: This site is currently being actively edited as of 4/9/2014. This notice will be delete once this page is done updating. '''
+
[[Category:RAREMETALWORKER]]
   −
'''RAREMETALWORKER''' is a tool for generating summary statistics for rare variants and gene level meta analyses using [http://genome.sph.umich.edu/wiki/RAREMETAL '''RAREMETAL'''].
+
'''RAREMETALWORKER''' is a tool for single variant analysis, generating summary statistics for gene level meta analyses in [http://genome.sph.umich.edu/wiki/RAREMETAL '''RAREMETAL'''].
    
If you find this program useful, please leave your name and contact information in this [https://docs.google.com/spreadsheet/ccc?key=0AuYjznTeEDYudFpqUk9sQ2pkN3d3endjYldqMEp6ZUE&usp=sharing '''registration'''] form.
   −
If you have any questions, please contact [[Shuang Feng|'''Shuang Feng''']] sfengsph at umich dot edu or [[Goncalo_Abecasis | '''Goncalo Abecasis''']] goncalo at umich dot edu.
+
If you have any questions, please contact Sai Chen (saichen at umich dot edu) or [[Goncalo_Abecasis | '''Goncalo Abecasis''']] (goncalo at umich dot edu).
      Line 14: Line 14:  
* The [[RAREMETALWORKER_method | '''RAREMETALWORKER method''']]
* The [[Tutorial:_RAREMETAL| '''RAREMETALWORKER quick start tutorial''']]
 +
* The [[RAREMETALWORKER_SPECIAL_TOPICS | '''RAREMETALWORKER special topics''']]
 
* The [[RAREMETAL_Documentation | '''RAREMETAL documentation''']]
* The [[RAREMETAL FAQ | '''FAQ''']]
 +
* The [[RAREMETAL_Change_Log | '''Change Log''']]
    
== Key Features ==
Line 30: Line 32:  
== Software Download and Installation ==
   −
=== Where to Download ===
+
=== DOWNLOAD ===
   −
* The source package for Linux can be downloaded here: [[Media:Raremetalworker.0.4.8.tgz ‎|'''RAREMTALWORKER''']] (The source code will be updated shortly. Please use the binary below, which has been believed to be correct. Please email sfengsph@umich.edu if you had any questions.)
+
We have tested compilation on several platforms, including Linux, Mac OS X, and Windows.
  −
* The binary of RAREMETALWORKER can also be downloaded here: [[Media:RMW.0.4.8.beta.binary.tgz|'''RAREMETALWORKER BINARY''']]
  −
* If you prefer to start from the source files, you can start from decompress the package using the following command:
  −
  tar xvzf Raremetalworker.0.4.8.tgz
  −
* For UM CSG cluster users, no installation is needed. It is available at /net/fantasia/home/sfengsph/code/Rare-Metal/RareMetalWorker/bin/raremetalworker
     −
=== How to Compile ===
+
For source code, executables, and instructions for building from source, please go to [[RAREMETAL_DOWNLOAD_%26_BUILD |'''DOWNLOAD source and executables''']].
   −
* Go to /RareMetalWorker_0.4.8/RareMetalWorker/src and use the following command:
+
For questions about building and compilation, please go to [[RAREMETAL_FAQ | '''FAQ''']].
  −
  make
  −
* If you prefer to use the binary file downloaded above, then no compiling is needed, but it is not guaranteed to work due to system and library requirements.
  −
 
  −
For compiling questions, please go to [http://genome.sph.umich.edu/wiki/RAREMETAL_FAQ ''' FAQ'''] for more information.
      
=== How to Execute ===
Line 56: Line 49:     
Method description and key formulae can be found in [http://genome.sph.umich.edu/wiki/RAREMETALWORKER_method '''RAREMETALWORKER METHOD'''].
 +
 +
==For Binary Traits==
 +
 +
RAREMETALWORKER currently treats all traits as quantitative. If your trait is binary, the odds ratio can be approximated from the effect size estimates generated by RAREMETALWORKER. The installation/source package includes a script that appends the odds ratio estimates as the last column of the RAREMETALWORKER output. For details, please refer to [[RAREMETAL_DOWNLOAD_%26_BUILD#Calculating_Odds_Ratio_from_RAREMETALWORKER_output | '''Calculate Odds Ratio from RAREMETALWORKER output''']].
    
== Software Specifications ==
Line 65: Line 62:  
For a detailed description of the command options, please go to [[RAREMETALWORKER_command_reference | '''command reference''']].
   −
  RAREMETALWORKER 0.4.8 -- A Forerunner of RareMetal
  (c) 2012-2014 Shuang Feng, Dajiang Liu, Goncalo Abecasis
  Please go to "http://genome.sph.umich.edu/wiki/RAREMETALWORKER#Where_to_Download" for the latest version.
  Options:
        Input Files : --ped [], --dat [], --vcf [], --dosage, --noeof
       Output Files : --prefix [], --LDwindow [1000000], --zip, --thin,
                      --labelHits
         VC Options : --vcX, --separateX
      Trait Options : --makeResiduals, --inverseNormal, --traitName []
      Model Options : --recessive, --dominant
     Kinship Source : --kinPedigree, --kinGeno, --kinFile [], --kinxFile [],
                      --kinSave
    Kinship Options : --kinMaf [0.05], --kinMiss [0.05]
       Chromosome X : --xLabel [X], --xStart [2699520], --xEnd [154931044],
                      --maleLabel [1], --femaleLabel [2]
          PhoneHome : --noPhoneHome, --phoneHomeThinning [100]
+
  Options:
        Input Files : --ped [], --dat [], --vcf [], --dosage, --noeof
       Output Files : --prefix [], --LDwindow [1000000], --zip, --thin,
                      --labelHits
         VC Options : --vcX, --separateX
      Trait Options : --makeResiduals, --inverseNormal, --traitName []
      Model Options : --recessive, --dominant
     Kinship Source : --kinPedigree, --kinGeno, --kinFile [], --kinxFile [],
                      --kinSave
    Kinship Options : --kinMaf [0.05], --kinMiss [0.05]
       Chromosome X : --xLabel [X], --xStart [2699520], --xEnd [154931044],
                      --maleLabel [1], --femaleLabel [2]
             others : --cpu [1], --kinOnly,
                      --geneMap [../data/refFlat_hg19.txt]
          PhoneHome : --noPhoneHome, --phoneHomeThinning [100]
      
===INPUT FILE FORMAT===
Line 91: Line 85:  
* When the PED file contains genotypes, no VCF file is needed as input.
* RMW takes PED/DAT files in Merlin format. Please refer to [http://www.sph.umich.edu/csg/abecasis/merlin/tour/input_files.html PED/DAT format description] for details.
 +
* The PED file requires "dummy" parents to be included in the pedigree. To check the integrity of your PED/DAT files, please use [http://www.sph.umich.edu/csg/abecasis/PedStats '''pedstats''']. To add dummy parents to the pedigree, please use the [[Media:Script.tgz | '''perl script''']].
 
* An example PED file is shown below:
     1 1 0 0 1 1.5 1 23 A A A A A A A A A A
Line 109: Line 104:  
* '''Markers in PED and DAT file must be sorted by chromosome and position.'''
   −
* Covariate and trait values are saved in PED file. Covariate and trait descriptions are saved in DAT file.
+
* Covariate and trait values are saved in the PED file. Covariate and trait descriptions are saved in the DAT file. Note that you must specify <code>--makeResiduals</code> in order to adjust the phenotype for the covariates. See [[RAREMETALWORKER#Example_Command_Lines | Example Command Lines]] for examples and [[RAREMETALWORKER_command_reference#Trait_Options | Trait Options]] for more information.
    
==== VCF File ====
Line 146: Line 141:  
* If the --zip option is used, the following files are generated automatically:
   prefix.traitName.singlevar.score.txt.gz
 +
  prefix.traitName.singlevar.score.txt.gz.tbi
 
   prefix.traitName.singlevar.cov.txt.gz
 +
  prefix.traitName.singlevar.cov.txt.gz.tbi
 
   prefix.singlevar.log
   Line 166: Line 163:  
* The file prefix.traitName.singlevar.score.txt contains the summary statistics needed by RAREMETAL. An example is shown below:
   −
  LDL mean= -0.00, variance=  1.00, heritability= 34.30
  CHR     POS     REF_ALLELE     ALT_ALLELE     INFORMATIVE_N   FOUNDER_AF     ALL_AF  INFORMATIVE_AC  HWE_PVALUE      STAT   ALT_ALLELE_EFFSIZE     PVALUE
   10   45410002       G       A       6103    0.0341589      0.0341589      410     0.165893       126.205 0.309798       4.03074e-10
   19   45412079       G       A       6103    0.0368124      0.0368124      434     0.714645       -265.84 -0.587356       7.87851e-36
   19   45414451       G       A       6103    0.444989       0.444989       5312    0.0759271      -26.1212       -0.00837122    0.640058
+
 LDL mean= -0.00, variance=  1.00, heritability= 34.30
 CHR       POS  REF_ALLELE  ALT_ALLELE  INFORMATIVE_N  FOUNDER_AF    ALL_AF  INFORMATIVE_AC  HWE_PVALUE       STAT  ALT_ALLELE_EFFSIZE        PVALUE
  10  45410002           G           A           6103    0.034159  0.034159             410    0.165893   126.2050            0.309798  4.030740e-10
  19  45412079           G           A           6103    0.036812  0.036812             434    0.714645  -265.8400           -0.587356  7.878510e-36
  19  45414451           G           A           6103    0.444989  0.444989            5312    0.075927   -26.1212           -0.008371  6.400580e-01
+
 
    
* The p-values in the output above come from the family-based single-variant score test.
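The layout of the score file shown above can be read with a few lines of Python. The following is a minimal sketch, not part of RAREMETALWORKER: it assumes the columns are whitespace-delimited, that the column header is the first line whose first field is <code>CHR</code>, and that everything before it (such as the trait summary line) can be skipped. The function names and the 5e-8 threshold are illustrative choices only.

```python
import io

# Sample text copied from the example output above.
SAMPLE = """\
LDL mean= -0.00, variance=  1.00, heritability= 34.30
CHR POS REF_ALLELE ALT_ALLELE INFORMATIVE_N FOUNDER_AF ALL_AF INFORMATIVE_AC HWE_PVALUE STAT ALT_ALLELE_EFFSIZE PVALUE
10 45410002 G A 6103 0.034159 0.034159 410 0.165893 126.2050 0.309798 4.030740e-10
19 45412079 G A 6103 0.036812 0.036812 434 0.714645 -265.8400 -0.587356 7.878510e-36
19 45414451 G A 6103 0.444989 0.444989 5312 0.075927 -26.1212 -0.008371 6.400580e-01
"""


def read_score_table(handle):
    """Return one dict per variant row, keyed by column name (values kept as strings)."""
    header = None
    rows = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        if header is None:
            # Assumption: lines before the column header are summary lines and are skipped.
            if fields[0] == "CHR":
                header = fields
            continue
        if len(fields) != len(header):
            continue  # this sketch simply ignores rows with an unexpected field count
        rows.append(dict(zip(header, fields)))
    return rows


def significant(rows, threshold=5e-8):
    """Keep rows whose PVALUE is below the (illustrative) threshold."""
    return [row for row in rows if float(row["PVALUE"]) < threshold]


rows = read_score_table(io.StringIO(SAMPLE))
hits = significant(rows)
for row in hits:
    print(row["CHR"], row["POS"], row["ALT_ALLELE_EFFSIZE"], row["PVALUE"])
```

For output written with --zip, the same function can be given a handle from <code>gzip.open(path, "rt")</code> in place of the <code>io.StringIO</code> object.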
Line 177: Line 175:  
* prefix.traitName.singlevar.cov.txt contains the LD matrix between each variant and the adjacent markers within a predefined window. The default window size is 1 Mb. It has the following format:
    
  −
  CHR   POS       VAR_POS_IN_WINDOW                             LD_MATRIX
  1   762320     762320,865628,865665,878744,879381,1560000   0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077,
  1   865628     865628,865665,878744,879381,1560000,1864659   0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183,
  1   878744     878744,879381,1560000,1864659,1877659         0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05,
+
 CHR  POS     VAR_POS_IN_WINDOW                            LD_MATRIX
 1    762320  762320,865628,865665,878744,879381,1560000   0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077,
 1    865628  865628,865665,878744,879381,1560000,1864659  0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183,
 1    878744  878744,879381,1560000,1864659,1877659        0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05,
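The row layout above can be unpacked as follows. This is a sketch under stated assumptions rather than a description of how RAREMETAL reads the file: it assumes four whitespace-delimited columns, that the i-th value in LD_MATRIX pairs with the i-th position in VAR_POS_IN_WINDOW (the counts match in the example rows), and that the trailing comma carries no value. The function name is hypothetical.

```python
# Sample rows copied from the example above.
SAMPLE = """\
CHR POS VAR_POS_IN_WINDOW LD_MATRIX
1 762320 762320,865628,865665,878744,879381,1560000 0.0359084,-0.000242112,-0.00125797,-0.000993422,-0.000344509,-0.00017077,
1 865628 865628,865665,878744,879381,1560000,1864659 0.419804,-0.0103663,-0.00635265,0.0594056,0.0534505,-0.00462183,
1 878744 878744,879381,1560000,1864659,1877659 0.000404537,-0.000235215,-1.4455e-05,-8.69137e-06,-3.1027e-05,
"""


def parse_cov_line(line):
    """Split one data row into (chromosome, position, {window position: value})."""
    chrom, pos, window, values = line.split()
    # Drop the empty string produced by the trailing comma.
    positions = [int(p) for p in window.split(",") if p]
    entries = [float(v) for v in values.split(",") if v]
    if len(positions) != len(entries):
        raise ValueError("window positions and matrix entries differ in length")
    # Assumption: entries are listed in the same order as the window positions.
    return chrom, int(pos), dict(zip(positions, entries))


records = []
for line in SAMPLE.splitlines():
    if not line.strip() or line.startswith("CHR"):
        continue  # skip blank lines and the column header
    records.append(parse_cov_line(line))

for chrom, pos, window in records:
    print(chrom, pos, len(window), window[pos])
```

Keying each row by position makes it easy to look up the entry for a given pair of variants without tracking column offsets.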
    
=====Genomic Relationship Matrix (GRM)=====
Line 186: Line 184:  
* When the --kinGeno, --kinSave, and --prefix options are specified, a GRM is generated (compressed by gzip) with the name yourprefix.Empirical.Kinship.gz. If the --prefix option is not used, the file name is Empirical.Kinship.gz.
 
* If the --vcX, --kinGeno, --kinSave, and --prefix options are specified, a separate GRM for chromosome X is saved in addition to the autosomal GRM (also compressed by gzip) under the name yourprefix.Empirical.KinshipX.gz.
  −
* The GRMs are generated based on all genotyped individuals included in the PED file; samples with missing phenotype or missing covariates are not excluded from GRMs. This feature makes GRMs reusable if you have multiple traits to analyze in separate runs. You can simple use --kinFile option (--kinxFile option if you have X chromosome GRM together with --vcX option issued) to reuse the pre-saved GRMs.
+
* The GRMs are generated based on all genotyped individuals included in the PED file; samples with missing phenotype or missing covariates are not excluded from GRMs. This feature makes GRMs reusable if you have multiple traits to analyze in separate runs. You can simply use the --kinFile option (and the --kinxFile option if you have an X chromosome GRM and the --vcX option is issued) to reuse the pre-saved GRMs.
 
* The format is the same for the autosomal and chromosome X GRMs. The first row lists all sample IDs (sample size = N). The rest of the file is a symmetric matrix of dimension ''NxN'', where element ''ij'' represents the kinship between the <math>i^{th}</math> and the <math>j^{th}</math> samples, whose IDs can be found in the first row.
 
* For details about GRM calculation, please refer to [[RAREMETALWORKER_method | '''method''']].
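The GRM layout described above (one row of sample IDs followed by an ''NxN'' symmetric matrix) could be loaded along these lines. This is only a sketch: it assumes the fields are whitespace-delimited and that the ID row carries no extra label, neither of which is stated on this page. The sample IDs and kinship values in the toy file are made up for illustration, and the function names are hypothetical.

```python
import gzip
import io


def read_grm(handle):
    """Read sample IDs (first row) and the N x N kinship matrix that follows."""
    ids = handle.readline().split()
    matrix = [[float(x) for x in line.split()] for line in handle if line.strip()]
    n = len(ids)
    if len(matrix) != n or any(len(row) != n for row in matrix):
        raise ValueError("expected an N x N matrix matching the ID row")
    # The page describes the matrix as symmetric; the tolerance is an arbitrary choice.
    for i in range(n):
        for j in range(i):
            if abs(matrix[i][j] - matrix[j][i]) > 1e-8:
                raise ValueError("matrix is not symmetric")
    return ids, matrix


def kinship(ids, matrix, sample_a, sample_b):
    """Look up element ij by the two sample IDs."""
    index = {sample: i for i, sample in enumerate(ids)}
    return matrix[index[sample_a]][index[sample_b]]


# Toy gzip-compressed GRM with made-up IDs and values, built in memory.
toy = "S1 S2 S3\n1.0 0.25 0.0\n0.25 1.0 0.125\n0.0 0.125 1.0\n"
packed = gzip.compress(toy.encode("ascii"))

with gzip.open(io.BytesIO(packed), "rt") as handle:
    ids, matrix = read_grm(handle)

print(kinship(ids, matrix, "S1", "S2"))
```

For a real file, <code>gzip.open("yourprefix.Empirical.Kinship.gz", "rt")</code> would replace the in-memory object.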
Line 334: Line 332:     
   [http://genome.sph.umich.edu/wiki/Tutorial:_RareMETAL '''RAREMETAL and RAREMETALWORKER Tutorial''']
  −
== Change Log ==
  −
* Version 0.0.1 was released on 11/13/2012.
  −
* Modified Rare-Metal-Worker to let it output LD matrix by a sliding window. (11/14/2012)
  −
* Uploaded to public wiki. (11/16/2012)
  −
* Enabled writing log file by defalut. (11/18/2012)
  −
* Forced sample IDs to be matched when reading in kinship from a file. Perform a sanity check before reading in kinship file. If a sample of interest is not included in kinship file, then fatal error will occur. (11/19/2012)
  −
* Added HWE pvalue and call rate in summary statistics output. (11/27/2012)
  −
* Bugs fixed to solve compiling errors on some machines (Thank you Mary Kate!). Version 0.0.2 released. (11/30/2012)
  −
* Updated output format. Version 0.0.3 released. (12/3/2012)
  −
* More messages coded into log file. (12/4/2012)
  −
* Version 0.0.4 released. (12/5/2012)
  −
* Bug fixed for empirical kinship calculation when genotypes are read from VCF file. Version 0.0.5 released. (12/6/2012)
  −
* Version 0.0.6 released. (12/6/2012)
  −
* Updated output format for monomorphic sites. (12/7/2012)
  −
* Changed executable name into bin/raremetalworker. Version 0.0.7 released. (12/10/2012)
  −
* Fixed a bug when reading vcf file with ref or alt allele is missing. (2/5/2013)
  −
* Fixed a bug when there is missing genotype from VCF file. (2/2013)
  −
* Fixed a bug when handling chromosome X. Added sex labels option. (3/2/2013)
  −
* Optimized code to speed up the process of calculating empirical kinship. (3/3/2013)
  −
* Updated code to report allele frequencies calculated only from selected samples. (3/3/2013)
  −
* Fixed bug in handling chromosome X. Added sanity checking steps before analysis. Added graphic support by generating QQ and manhattan plots automatically. Upgraded tool to version 2.8. (till 8/12/2013)
  −
* Added support for analyzing dosages from VCF in version 2.9. (8/27/2013)
  −
* Fixed the bug which causes crash when writing PDF when all variants are monomorphic. (10/6/2013)
  −
* Fixed a few bugs handling chromosome X. Generated warning messages when male genotypes are coded wrong in VCF file. (11/25/2013)
  −
* Released version 0.3.6 and fixed a minor bug that caused by code upgrades from version 0.3.5. (12/4/2013)
  −
* Released version 0.3.7. Added dominant and recessive models as options. The default model is additive. (1/7/2014)
  −
* Released version 0.4.0. Added phone home function. Saved Recessive and dominant results in separate files.
  −
* Released version 0.4.1. Fixed a bug handling variants in nonPAR region on chromosome X when all samples are male.
  −
* Released version 0.4.2. Fixed a bug that could possibly cause compiling error in some Linux system. Also, in this version, male heterozygous genotypes on chromosome X are considered missing. (3/10/2014)
  −
* Released version 0.4.3. Fixed a few typo in messages. Added --noeof option for VCF files that does not have a BGZF EOF marker.
  −
* Released version 0.4.4. Fixed hwe=0.0 issue for monomorphic sites. (3/17/2014)
  −
* Released version 0.4.5. Fixed a bug when generating plots for recessive results. (3/18/2014)
  −
* Released version 0.4.6 binary. Fixed a bug in recessive and dominant results. (4/1/2014)
  −
* Released version 0.4.7 binary and source code. Optimized code to increase analysis efficiency and reduce memory use. Added --separateX option to provide another choice of analyzing chromosome X. Added --kinxFile option to allow kinship X matrix file with different prefix from the autosomal kinship.(4/2/2014)
 