Difference between revisions of "RAREMETALWORKER command reference"

From Genome Analysis Wiki

Revision as of 12:18, 14 April 2014

Useful Links

Here are some useful links to key pages:

List of Options

 Options:
       Input Files : --ped [], --dat [], --vcf [], --dosage, --noeof
      Output Files : --prefix [], --LDwindow [1000000], --zip, --thin,
                     --labelHits
        VC Options : --vcX, --separateX
     Trait Options : --makeResiduals, --inverseNormal, --traitName []
     Model Options : --recessive, --dominant
    Kinship Source : --kinPedigree, --kinGeno, --kinFile [], --kinxFile [],
                     --kinSave
   Kinship Options : --kinMaf [0.05], --kinMiss [0.05]
      Chromosome X : --xLabel [X], --xStart [2699520], --xEnd [154931044],
                     --maleLabel [1], --femaleLabel [2]
         PhoneHome : --noPhoneHome, --phoneHomeThinning [100]

Input Files

--ped

--dat

--vcf

  • --vcf takes the name of your VCF file.

--dosage

  • When --dosage is specified on the command line, RAREMETALWORKER reads dosages from your VCF file.
  • --dosage must be used with the --vcf option.
  • A description of the dosage format in a VCF file can be found in dosage.

--noeof

  • If your VCF file does not have the BGZF EOF marker, use the --noeof option to let RAREMETALWORKER skip the check for the BGZF EOF marker at the end of the file. Please see BGZF EOF for more details.

Output Files

  • --prefix is optional.
  • If --prefix is not specified, the output file names will be:
 traitname.singlevar.score.txt
 traitname.singlevar.cov.txt
  • Otherwise, the output file names are:
 prefix.traitname.singlevar.score.txt
 prefix.traitname.singlevar.cov.txt
  • --LDwindow specifies the size of the window used to generate the LD matrix for each variant. The default is 1 Mb (1000000).
  • --zip writes bgzip-compressed output files automatically, which is convenient for sharing.
  • --thin tells RMW to thin points when generating QQ and Manhattan plots, so that the plot files are smaller.
  • --labelHits tells RMW to label hits with gene names, using a p-value threshold of 0.05/(number of variants tested). Gene names are based on human genome build 19.
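
The file naming rule and the --labelHits threshold described above can be sketched in Python. This is only an illustration of the rules as stated on this page, not RMW code; the function names output_file_names and label_threshold are hypothetical.

```python
def output_file_names(trait_name, prefix=None):
    """Return the score and covariance file names for one trait.

    Follows the naming rule on this page: the prefix, if given,
    is prepended to the trait name with a dot.
    """
    stem = f"{prefix}.{trait_name}" if prefix else trait_name
    return [f"{stem}.singlevar.score.txt", f"{stem}.singlevar.cov.txt"]


def label_threshold(n_variants_tested):
    """P-value threshold used for labeling hits: 0.05 / (number of variants tested)."""
    if n_variants_tested <= 0:
        raise ValueError("n_variants_tested must be positive")
    return 0.05 / n_variants_tested
```

For example, output_file_names("LDL", prefix="study1") gives the two names study1.LDL.singlevar.score.txt and study1.LDL.singlevar.cov.txt.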

VC Options

  • When --vcShared and/or --vcX are specified, RMW fits a shared-environment and/or chromosome X variance component together with the genetic and non-shared environment components.
  • When --makeResiduals is specified, RMW reads covariates from the PED/DAT files. Covariates are modeled as fixed effects.

Trait Options

  • --makeResiduals tells RMW to adjust for covariates and analyze the residuals instead of the original phenotypes. If either the --kinGeno or --kinPedigree option is used, a variance component model is fit to the residuals. If the --inverseNormal option is also used, the residuals are quantile normalized before the variance component model is fit.
  • --traitName is intended for situations where your PED and DAT files contain many traits but you are interested in only one or a few of them. It accepts either a file ending in .txt that lists each trait of interest on a separate line, or trait names separated by "/". For example:
  --traitName LDL
  --traitName LDL/HDL/TG
  --traitName traitsOfInterest.txt
  • If --traitName is not used, all traits in the PED/DAT files will be analyzed.
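
The two forms of the --traitName argument can be told apart as in the following Python sketch. It is an illustration of the rule described above, not RMW's actual parser; the function name parse_trait_names is hypothetical, and skipping blank lines in the trait file is an assumption.

```python
from pathlib import Path


def parse_trait_names(value):
    """Resolve a --traitName argument into a list of trait names."""
    if value.endswith(".txt"):
        # A .txt file lists one trait per line.
        # Assumption: blank lines and surrounding whitespace are ignored.
        lines = Path(value).read_text().splitlines()
        return [line.strip() for line in lines if line.strip()]
    # Otherwise the value holds one or more trait names separated by "/".
    return [name for name in value.split("/") if name]
```

With this sketch, "LDL" yields a single trait and "LDL/HDL/TG" yields three, matching the examples above.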

Model Options

  • The additive model is used by default in RMW.
  • --recessive generates additional association results (p-value, effect size, and standard error) under a recessive model. If a VCF file is used, the non-reference allele is treated as the recessive allele. If PED/DAT files are used for genotypes, the minor allele is treated as the recessive allele.
  • --dominant generates additional association results (p-value, effect size, and standard error) under a dominant model. If a VCF file is used, the non-reference allele is treated as the dominant allele. If PED/DAT files are used for genotypes, the minor allele is treated as the dominant allele.
  • The --recessive and --dominant options can be used together.
  • Recessive and dominant results are stored in separate files.
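
The difference between the three models can be illustrated by the conventional way genotypes are coded under each of them. The Python sketch below uses the textbook coding; this page does not describe how RMW codes genotypes internally, so treat the coding and the function name code_genotype as assumptions.

```python
def code_genotype(allele_count, model="additive"):
    """Code a genotype under a genetic model.

    allele_count is the number of copies (0, 1 or 2) of the tested allele:
    the non-reference allele for VCF input, the minor allele for PED/DAT input.
    """
    if allele_count not in (0, 1, 2):
        raise ValueError("allele count must be 0, 1 or 2")
    if model == "additive":
        # Each copy of the allele contributes equally.
        return allele_count
    if model == "recessive":
        # Only carriers of two copies are coded as 1.
        return 1 if allele_count == 2 else 0
    if model == "dominant":
        # Carriers of one or two copies are coded as 1.
        return 1 if allele_count >= 1 else 0
    raise ValueError(f"unknown model: {model}")
```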

Kinship Source

  • --kinPedigree tells RMW to generate the kinship matrix from the pedigree, when pedigree information is available.
  • --kinGeno tells RMW to generate the kinship matrix from all available variants that pass the criteria specified by the --kinMaf and --kinMiss options. By default, variants with MAF > 0.05 and genotype missing rate < 0.05 are used.
  • --kinGeno cannot be used with --kinPedigree or --kinFile. At most one of these three options can be used in the same run.
  • --kinFile lets RMW read a kinship matrix from a file. The first row of the kinship file must contain the sample IDs included in the file. If a sample of interest is not included in the kinship file, a fatal error occurs and the program terminates. A sample of interest is a sample that is phenotyped and, when --makeResiduals is specified, has all covariates measured.
  • --kinSave allows you to save the kinship matrix.
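
The two checks described above (at most one kinship source, and every sample of interest present in the kinship file) can be sketched in Python. This is an illustration only, not RMW code; the function names check_kinship_source and missing_from_kinship are hypothetical.

```python
def check_kinship_source(kin_pedigree=False, kin_geno=False, kin_file=None):
    """Return the selected kinship source, or None if none was requested.

    Raises ValueError when more than one source is selected.
    """
    selected = []
    if kin_pedigree:
        selected.append("--kinPedigree")
    if kin_geno:
        selected.append("--kinGeno")
    if kin_file:
        selected.append("--kinFile")
    if len(selected) > 1:
        raise ValueError("use at most one of: " + ", ".join(selected))
    return selected[0] if selected else None


def missing_from_kinship(samples_of_interest, kinship_ids):
    """List samples of interest that are absent from the kinship file IDs."""
    known = set(kinship_ids)
    return [s for s in samples_of_interest if s not in known]
```

A non-empty result from missing_from_kinship corresponds to the situation in which RMW terminates with a fatal error.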

Kinship Options

  • --kinMiss and --kinMaf should be used together with --kinGeno.
  • --kinMiss specifies the maximum genotype missing rate when calculating kinship from genotypes. The default is 0.05.
  • --kinMaf specifies the minimum minor allele frequency used when calculating kinship from genotypes. The default is 0.05.
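
The variant filter implied by --kinMaf and --kinMiss can be sketched in Python as follows. This is an illustration, not RMW code: the function names are hypothetical, genotypes are assumed to be given as allele counts (0, 1 or 2) with None for missing calls, and the strict inequalities follow the "MAF > 0.05 and genotype missing rate < 0.05" wording on this page.

```python
def maf_and_missing_rate(genotypes):
    """Compute (minor allele frequency, missing rate) for one variant.

    genotypes: allele counts 0/1/2 per sample, None for a missing call.
    """
    if not genotypes:
        raise ValueError("genotypes must not be empty")
    called = [g for g in genotypes if g is not None]
    missing_rate = 1 - len(called) / len(genotypes)
    if not called:
        return 0.0, missing_rate
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq), missing_rate


def passes_kinship_filter(maf, missing_rate, kin_maf=0.05, kin_miss=0.05):
    """True if a variant may be used to estimate kinship from genotypes."""
    # Assumption: both comparisons are strict, as in the default description.
    return maf > kin_maf and missing_rate < kin_miss
```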

Chromosome X

  • --xLabel takes a string that specifies how chromosome X is coded for variants. The default is "X".
  • --xStart and --xEnd specify the start and end of the non-pseudo-autosomal region on chromosome X. These options should be specified when --vcX is used.
  • The default for --xStart is 2699520 and the default for --xEnd is 154931044, based on NCBI genome build 37.
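
How these three options identify a variant in the non-pseudo-autosomal region can be sketched in Python. This is an illustration, not RMW code; the function name in_non_par_region is hypothetical, and treating both boundaries as inclusive is an assumption.

```python
def in_non_par_region(chrom, pos, x_label="X", x_start=2699520, x_end=154931044):
    """True if a variant lies in the non-pseudo-autosomal region of chromosome X.

    Defaults mirror --xLabel, --xStart and --xEnd (NCBI genome build 37).
    Assumption: x_start and x_end are both inclusive.
    """
    return chrom == x_label and x_start <= pos <= x_end
```

If chromosome X is coded as "23" in your data, the same check applies with x_label="23".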

For the analysis of X-linked variants, please refer to ANALYZING CHROMOSOME X.

PhoneHome Parameters

See PhoneHome for more information on how PhoneHome works and what it does.

  • --noPhoneHome disables PhoneHome. PhoneHome is enabled by default based on the thinning parameter.
  • --phoneHomeThinning (0-100) adjusts the frequency of PhoneHome.
    • By default, --phoneHomeThinning is set to 50, running 50% of the time.
    • PhoneHome will only occur if the run's random number modulo 100 is less than the --phoneHomeThinning value.
    • This option has no effect if --noPhoneHome is set.
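
The thinning rule described above can be sketched in Python. This is an illustration of the stated rule, not the PhoneHome implementation; the function name phone_home_runs is hypothetical, and the thinning value is passed explicitly rather than given a default.

```python
def phone_home_runs(random_number, thinning, no_phone_home=False):
    """Decide whether PhoneHome occurs for this run.

    random_number: the run's random number (any non-negative integer).
    thinning: the --phoneHomeThinning value, from 0 to 100.
    """
    if not 0 <= thinning <= 100:
        raise ValueError("thinning must be between 0 and 100")
    if no_phone_home:
        # --noPhoneHome disables PhoneHome regardless of thinning.
        return False
    # PhoneHome occurs only if the random number modulo 100 is below the threshold.
    return random_number % 100 < thinning
```

A thinning value of 0 therefore never triggers PhoneHome, and a value of 100 always does unless --noPhoneHome is set.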