Difference between revisions of "RAREMETALWORKER command reference"

From Genome Analysis Wiki

Revision as of 14:03, 9 April 2014

Output Files

  • --prefix is optional.
  • If --prefix is not specified, the output file names will be:
 traitname.singlevar.score.txt
 traitname.singlevar.cov.txt
  • Otherwise, the output file names are:
 prefix.traitname.singlevar.score.txt
 prefix.traitname.singlevar.cov.txt
  • --LDwindow specifies the length of the window over which the LD matrix is generated for each variant. The default is 1 Mb.
  • --zip tells RMW to write bgzip-compressed output files automatically, which is convenient for sharing.
  • --thin tells RMW to thin points when generating QQ and Manhattan plots, so that the plot files are smaller.
  • --labelHits tells RMW to label hits that pass the p-value threshold 0.05/(number of variants tested) with gene names, based on human genome build 19.
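
The file naming rule and the --labelHits threshold described above can be sketched in Python as follows. This is only an illustration of the rules as stated on this page, not RMW code; the function names output_file_names and label_hits_threshold and the prefix "study1" are hypothetical.

```python
def output_file_names(trait, prefix=None):
    """Return the score and covariance file names for one trait.

    Follows the rule above: "prefix." is prepended only when --prefix is given.
    """
    stem = f"{prefix}.{trait}" if prefix else trait
    return [f"{stem}.singlevar.score.txt", f"{stem}.singlevar.cov.txt"]


def label_hits_threshold(n_variants_tested, alpha=0.05):
    """P-value cutoff used for labeling hits: 0.05 / (number of variants tested)."""
    if n_variants_tested <= 0:
        raise ValueError("n_variants_tested must be positive")
    return alpha / n_variants_tested


# Without --prefix
print(output_file_names("LDL"))
# With a hypothetical --prefix study1
print(output_file_names("LDL", prefix="study1"))
# Cutoff when one million variants are tested
print(label_hits_threshold(1_000_000))
```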

VC Options

  • When --vcShared and/or --vcX are specified, RMW fits a shared environment and/or chromosome X variance component in addition to the genetic and non-shared environment components.
  • When --makeResiduals is specified, RMW reads covariates from the PED/DAT files. Covariates are modeled as fixed effects.

Trait Options

  • --makeResiduals tells RMW to adjust for covariates and analyze residuals instead of the original phenotypes. If either the --kinGeno or the --kinPedigree option is used, a variance component model is then fit to the residuals. If the --inverseNormal option is also used, the residuals are quantile normalized before the variance component model is fit.
  • --traitName is intended for situations where your PED and DAT files contain many traits but you are interested in only one or a few of them. It accepts either a file ending in .txt that lists each trait of interest on a separate line, or trait names separated by "/". The following examples select one trait, several traits, and traits listed in a file:
  --traitName LDL
  --traitName LDL/HDL/TG
  --traitName traitsOfInterest.txt
  • If --traitName is not used, all traits in the PED/DAT files are analyzed.
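
The two accepted forms of the --traitName value can be illustrated with a small Python sketch. The helper parse_trait_names is hypothetical and only mirrors the rule described above (a value ending in .txt is read as a file with one trait per line, anything else is split on "/"); how RMW handles blank lines or surrounding whitespace is an assumption here.

```python
import os
import tempfile


def parse_trait_names(value):
    """Expand a --traitName value into a list of trait names."""
    if value.endswith(".txt"):
        # File form: one trait per line (blank lines skipped, an assumption).
        with open(value) as handle:
            return [line.strip() for line in handle if line.strip()]
    # Inline form: one trait, or several separated by "/".
    return [name for name in value.split("/") if name]


print(parse_trait_names("LDL"))
print(parse_trait_names("LDL/HDL/TG"))

# File form, demonstrated with a temporary traitsOfInterest.txt
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "traitsOfInterest.txt")
    with open(path, "w") as handle:
        handle.write("LDL\nHDL\nTG\n")
    print(parse_trait_names(path))
```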

Model Options

  • The additive model is used by default.
  • --recessive generates additional association results (p-value, effect size, and standard error) using a recessive model. If a VCF file is used, the non-reference allele is considered the recessive allele. If PED/DAT files are used for genotypes, the minor allele is considered the recessive allele.
  • --dominant generates additional association results (p-value, effect size, and standard error) using a dominant model. If a VCF file is used, the non-reference allele is considered the dominant allele. If PED/DAT files are used for genotypes, the minor allele is considered the dominant allele.
  • --recessive and --dominant options can be used together.
  • Recessive and dominant results are stored in separate files.
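
As an illustration of how the three models differ, the sketch below shows the conventional genotype coding for each. Here allele_count is the number of copies (0, 1, or 2) of the non-reference allele (VCF input) or the minor allele (PED/DAT input). The function code_genotype is hypothetical, and the coding is the textbook convention; this page does not state how RMW codes genotypes internally.

```python
def code_genotype(allele_count, model="additive"):
    """Code a genotype (0, 1, or 2 copies of the allele) under a genetic model."""
    if allele_count not in (0, 1, 2):
        raise ValueError("allele_count must be 0, 1, or 2")
    if model == "additive":
        # Each copy of the allele contributes equally.
        return allele_count
    if model == "recessive":
        # Only carriers of two copies are coded as exposed.
        return 1 if allele_count == 2 else 0
    if model == "dominant":
        # Carriers of at least one copy are coded as exposed.
        return 1 if allele_count >= 1 else 0
    raise ValueError(f"unknown model: {model}")


for model in ("additive", "recessive", "dominant"):
    print(model, [code_genotype(count, model) for count in (0, 1, 2)])
```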

Kinship Source

  • --kinPedigree tells RMW to generate the kinship matrix from the pedigree, when pedigree information is available.
  • --kinGeno tells RMW to generate the kinship matrix from all available variants that pass the criteria specified by the --kinMaf and --kinMiss options. By default, variants with MAF > 0.05 and genotype missing rate < 0.05 are used.
  • --kinGeno can NOT be used together with --kinPedigree or --kinFile. At most one of these three options can be used in the same run.
  • --kinFile lets RMW read a kinship matrix from a file. The first row of the kinship file must list the sample IDs included in the file. If a sample of interest is not included in the kinship file, a fatal error occurs and the program terminates. A sample of interest is a sample that is phenotyped and, when --makeResiduals is specified, has all covariates measured.
  • --kinSave allows you to save the kinship matrix.
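
The rule that at most one kinship source may be chosen per run can be expressed as a short check. This is a sketch of the rule as stated above, not RMW's own argument handling; check_kinship_source and the file name "kinship.txt" are hypothetical.

```python
def check_kinship_source(kin_geno=False, kin_pedigree=False, kin_file=None):
    """Return the selected kinship source, or None if no source was chosen.

    Raises ValueError when more than one of the three options is given.
    """
    candidates = (
        ("--kinGeno", kin_geno),
        ("--kinPedigree", kin_pedigree),
        ("--kinFile", kin_file is not None),
    )
    chosen = [name for name, used in candidates if used]
    if len(chosen) > 1:
        raise ValueError("only one kinship source allowed, got: " + ", ".join(chosen))
    return chosen[0] if chosen else None


print(check_kinship_source())               # no kinship source
print(check_kinship_source(kin_geno=True))  # a single source is fine
try:
    check_kinship_source(kin_geno=True, kin_file="kinship.txt")
except ValueError as error:
    print("rejected:", error)
```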

Kinship Options

  • --kinMiss and --kinMaf should be used together with --kinGeno.
  • --kinMiss specifies the maximum genotype missing rate when calculating kinship from genotypes. The default is 0.05.
  • --kinMaf specifies the minimum minor allele frequency used when calculating kinship from genotypes. The default is 0.05.
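
The variant filter implied by --kinMaf and --kinMiss can be sketched as below. The helper passes_kinship_filter is hypothetical. It assumes diploid genotypes coded as allele counts (0, 1, 2) with None for a missing call, and it uses strict inequalities (MAF > 0.05, missing rate < 0.05) as in the description of --kinGeno; whether RMW treats the boundary values as passing is not stated here.

```python
def passes_kinship_filter(genotypes, kin_maf=0.05, kin_miss=0.05):
    """Decide whether a variant would be used for genotype-based kinship.

    genotypes: list of allele counts (0, 1, 2); None marks a missing call.
    """
    if not genotypes:
        return False
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    missing_rate = 1 - len(called) / len(genotypes)
    # Allele frequency among called diploid genotypes.
    freq = sum(called) / (2 * len(called))
    maf = min(freq, 1 - freq)
    return maf > kin_maf and missing_rate < kin_miss


common = [0] * 80 + [1] * 18 + [2] * 2   # MAF = 0.11, no missing calls
rare = [0] * 99 + [1]                    # MAF = 0.005
sparse = [0, 1, 2, None] * 25            # 25% of calls missing

print(passes_kinship_filter(common))
print(passes_kinship_filter(rare))
print(passes_kinship_filter(sparse))
```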

Chromosome X

  • --xLabel takes a string that specifies how chromosome X is coded for variants. The default is "X".
  • --xStart and --xEnd specify the start and end of the non-pseudo-autosomal region on chromosome X. These options should be specified when --vcX is used.
  • The default for --xStart is 2699520 and the default for --xEnd is 154931044, according to NCBI genome build 37.
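
How the three options combine can be illustrated with a small check for whether a variant lies in the non-pseudo-autosomal region of chromosome X. The function in_non_par_x is hypothetical; its defaults are the option defaults listed above, and treating both boundaries as inclusive is an assumption.

```python
def in_non_par_x(chrom, pos, x_label="X", x_start=2699520, x_end=154931044):
    """True if (chrom, pos) falls in the non-pseudo-autosomal region of X.

    Defaults follow --xLabel, --xStart and --xEnd (NCBI build 37);
    both boundaries are treated as inclusive, which is an assumption.
    """
    return chrom == x_label and x_start <= pos <= x_end


print(in_non_par_x("X", 5_000_000))                 # inside the region
print(in_non_par_x("X", 100_000))                   # before --xStart
print(in_non_par_x("23", 5_000_000, x_label="23"))  # X coded as "23"
print(in_non_par_x("1", 5_000_000))                 # autosomal variant
```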

For the analysis of X-linked variants, please refer to ANALYZING CHROMOSOME X.

PhoneHome Parameters

See PhoneHome for more information on how PhoneHome works and what it does.

  • --noPhoneHome disables PhoneHome. By default PhoneHome is enabled, subject to the thinning parameter.
  • --phoneHomeThinning (0-100) adjusts the frequency of PhoneHome.
    • By default, --phoneHomeThinning is set to 50, running 50% of the time.
    • PhoneHome will only occur if the run's random number modulo 100 is less than the --phoneHomeThinning value.
    • Not applicable if --noPhoneHome is set.
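
The thinning rule above amounts to a single comparison, sketched here in Python. The function should_phone_home is hypothetical, and how the run's random number is drawn is not described on this page, so it is passed in as an argument.

```python
def should_phone_home(random_number, thinning=50, no_phone_home=False):
    """Apply the rule above: contact happens only if random_number % 100 < thinning."""
    if no_phone_home:
        # --noPhoneHome overrides the thinning value.
        return False
    if not 0 <= thinning <= 100:
        raise ValueError("thinning must be between 0 and 100")
    return random_number % 100 < thinning


# With the default thinning of 50, half of the possible remainders qualify.
print(sum(should_phone_home(n) for n in range(100)))
# Thinning 0 never contacts, thinning 100 always does.
print(should_phone_home(7, thinning=0), should_phone_home(7, thinning=100))
```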