Difference between revisions of "RAREMETALWORKER command reference"
From Genome Analysis Wiki
Revision as of 12:47, 14 April 2014
- 1 Useful Links
- 2 List of Options
- 3 Input Files
- 4 Output Files
- 5 VC Options
- 6 Trait Options
- 7 Model Options
- 8 Kinship Source
- 9 Kinship Options
- 10 Chromosome X
- 11 PhoneHome
Here are some useful links to key pages:
List of Options
Options:
  Input Files     : --ped, --dat, --vcf, --dosage, --noeof
  Output Files    : --prefix, --LDwindow, --zip, --thin, --labelHits
  VC Options      : --vcX, --separateX
  Trait Options   : --makeResiduals, --inverseNormal, --traitName
  Model Options   : --recessive, --dominant
  Kinship Source  : --kinPedigree, --kinGeno, --kinFile, --kinxFile, --kinSave
  Kinship Options : --kinMaf [0.05], --kinMiss [0.05]
  Chromosome X    : --xLabel [X], --xStart, --xEnd, --maleLabel, --femaleLabel
  PhoneHome       : --noPhoneHome, --phoneHomeThinning
- --ped takes the name of your MERLIN-format PED file.
- --dat takes the name of your MERLIN-format DAT file.
- --vcf takes the name of your VCF file.
- When --dosage is issued on the command line, RAREMETALWORKER reads dosages from your VCF file.
- --dosage must be used with the --vcf option.
- A description of the dosage format in a VCF file can be found in dosage.
- If your VCF file does not have the BGZF EOF markers, use the --noeof option so that RAREMETALWORKER skips checking for them at the end of the file.
- Please see BGZF EOF for more details.
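The input-file rules above can be checked before a run is launched. The following is a minimal sketch only: the helper name `build_input_args` and the file names are hypothetical, and the function merely assembles the options listed above without invoking RAREMETALWORKER.

```python
from typing import List, Optional


def build_input_args(ped: Optional[str] = None,
                     dat: Optional[str] = None,
                     vcf: Optional[str] = None,
                     dosage: bool = False,
                     noeof: bool = False) -> List[str]:
    """Assemble input-file options (hypothetical helper, not part of RAREMETALWORKER)."""
    if dosage and vcf is None:
        # The reference states that --dosage must be used with --vcf.
        raise ValueError("--dosage must be used with --vcf")
    args: List[str] = []
    if ped is not None:
        args += ["--ped", ped]
    if dat is not None:
        args += ["--dat", dat]
    if vcf is not None:
        args += ["--vcf", vcf]
    if dosage:
        args.append("--dosage")
    if noeof:
        args.append("--noeof")
    return args


# Placeholder file names, for illustration only.
print(build_input_args(ped="study.ped", dat="study.dat",
                       vcf="study.vcf.gz", dosage=True))
```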
- --prefix takes a value of a string as the prefix of your output files.
- For a full list of output files generated by RAREMETALWORKER, please refer to output.
- --LDwindow takes an integer value as the size of the moving window.
- RAREMETALWORKER generates LD matrices between a current marker that it is working on and all markers within this window.
- The default size is 1 million bases.
- For more information about the LD matrix, please refer to LD matrix.
- When --zip is issued, RAREMETALWORKER automatically compresses the generated summary statistics and LD matrices using gzip.
- If --thin is issued, then RAREMETALWORKER generates QQ plots and Manhattan plots with less resolution (points), to make the pdf files smaller in size.
- If --labelHits is issued, then RAREMETALWORKER automatically labels the loci that are above a threshold.
- The threshold is calculated using Bonferroni correction (0.05/N, where N is the total number of polymorphic markers).
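The Bonferroni threshold described above is a one-line calculation. The sketch below is illustrative; the function name `bonferroni_threshold` is an assumption and is not taken from RAREMETALWORKER itself.

```python
def bonferroni_threshold(n_polymorphic: int, alpha: float = 0.05) -> float:
    """Return alpha / N, where N is the number of polymorphic markers."""
    if n_polymorphic <= 0:
        raise ValueError("the number of polymorphic markers must be positive")
    return alpha / n_polymorphic


# Example: with one million polymorphic markers the threshold is about 5e-8.
print(bonferroni_threshold(1_000_000))
```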
- The --vcX option has to be used with --kinPedigree (when pedigree kinship is used), --kinGeno (when the genomic relationship matrix is estimated), or --kinFile (when the GRM is read from a file).
- The --vcX option lets RAREMETALWORKER fit a linear mixed model to analyze chromosome X, using both autosomal kinship and chromosome X kinship.
- --separateX option must be used with --vcX option.
- Using --separateX option requests RAREMETALWORKER to fit a linear mixed model using only chromosome X kinship for analyses of chromosome X markers.
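The dependencies between the VC options and the kinship options can be expressed as a small pre-flight check. This is a sketch under the rules stated above; `check_vc_options` is a hypothetical helper, not part of the tool.

```python
from typing import Optional


def check_vc_options(vc_x: bool = False,
                     separate_x: bool = False,
                     kin_pedigree: bool = False,
                     kin_geno: bool = False,
                     kin_file: Optional[str] = None) -> None:
    """Raise ValueError if the VC options violate the rules stated above."""
    has_kinship = kin_pedigree or kin_geno or kin_file is not None
    if vc_x and not has_kinship:
        raise ValueError("--vcX requires --kinPedigree, --kinGeno, or --kinFile")
    if separate_x and not vc_x:
        raise ValueError("--separateX must be used with --vcX")


# A valid combination passes silently.
check_vc_options(vc_x=True, separate_x=True, kin_geno=True)
```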
- --prefix is optional.
- If --prefix is not specified, the output file names will be:
- Otherwise, the output file names are:
- --LDwindow specifies the length of the window, around each variant, within which the LD matrix is generated. The default is 1 Mb.
- --zip gives users the option of writing compressed files (bgzip compressed) automatically for convenient sharing.
- --thin tells RMW to thin points when generating QQ plot and Manhattan plots, so the file size is smaller.
- --labelHits tells RMW to label hits with gene names, based on human genome build 19, using the p-value threshold 0.05/(number of variants tested).
- When --vcShared and --vcX are specified, RMW knows that you want to fit shared environment and/or chromosome X variance component together with genetic component and non-shared environment.
- When --makeResiduals is specified, RMW understands covariates should be read from PED/DAT file. Covariates are modeled as fixed effects.
- --makeResiduals tells RMW to adjust the covariates and analyze residuals instead of the original phenotypes. If either --kinGeno or --kinPedigree option is used, then a variance component model will be fit based on residuals. If the --inverseNormal option is also used, then the residuals will be quantile normalized before fitting variance component model.
- --traitName is intended for situations where your PED and DAT files contain many traits but you are interested in only one or a few of them. It accepts either a file ending in .txt that lists each trait of interest on a separate line, or trait names separated by "/". Examples for one trait, several traits, and a trait file:
--traitName LDL
--traitName LDL/HDL/TG
--traitName traitsOfInterest.txt
- If --traitName is not used, all traits in PED/DAT file will be analyzed.
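The accepted forms of the --traitName value can be illustrated with a small parser. This is only a sketch of the behaviour described above, not RMW's actual parsing code; the function name `parse_trait_name` is hypothetical, and skipping blank lines in the trait file is an assumption.

```python
from pathlib import Path
from typing import List


def parse_trait_name(value: str) -> List[str]:
    """Expand a --traitName value into a list of trait names."""
    if value.endswith(".txt"):
        # A .txt file lists one trait of interest per line.
        lines = Path(value).read_text().splitlines()
        return [line.strip() for line in lines if line.strip()]
    # Otherwise the value is one trait, or several separated by "/".
    return [name for name in value.split("/") if name]


print(parse_trait_name("LDL"))         # ['LDL']
print(parse_trait_name("LDL/HDL/TG"))  # ['LDL', 'HDL', 'TG']
```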
- The additive model is used by default in RMW.
- --recessive requests additional association results (p-value, effect size, and standard error) generated using a recessive model. If a VCF file is used, the non-reference allele is treated as the recessive allele. If PED/DAT files are used for genotypes, the minor allele is treated as the recessive allele.
- --dominant requests additional association results (p-value, effect size, and standard error) generated using a dominant model. If a VCF file is used, the non-reference allele is treated as the dominant allele. If PED/DAT files are used for genotypes, the minor allele is treated as the dominant allele.
- --recessive and --dominant options can be used together.
- Recessive and dominant results are stored in separate files.
- --kinPedigree allows RMW to generate kinship matrix from pedigree, when pedigree information is available.
- --kinGeno tells RMW to generate the kinship matrix from all available variants that pass the criteria specified by the --kinMaf and --kinMiss options. By default, variants with MAF > 0.05 and genotype missing rate < 0.05 are used.
- --kinGeno can NOT be used with --kinPedigree or --kinFile. At most one of these three options can be used in the same run.
- --kinFile lets RMW read a kinship matrix from a file. The first row of the kinship file must contain the sample IDs included in the file. If a sample of interest is not included in the kinship file, a fatal error occurs and the program terminates. A sample of interest is a sample that is phenotyped and, when --makeResiduals is specified, has all covariates measured.
- --kinSave allows you to save the kinship matrix.
- --kinMiss and --kinMaf should be used together with --kinGeno.
- --kinMiss specifies the maximum genotype missing rate when calculating kinship from genotypes. The default is 0.05.
- --kinMaf specifies the minimum minor allele frequency used when calculating kinship from genotypes. The default is 0.05.
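The kinship rules above amount to two checks: at most one kinship source per run, and a per-variant filter on MAF and missing rate when --kinGeno is used. The sketch below is illustrative; both function names are hypothetical, and the strict inequalities follow the "MAF > 0.05 and missing rate < 0.05" wording above rather than any inspection of RMW's source.

```python
from typing import Optional


def check_kinship_source(kin_pedigree: bool = False,
                         kin_geno: bool = False,
                         kin_file: Optional[str] = None) -> None:
    """Raise ValueError if more than one kinship source is requested."""
    chosen = sum([kin_pedigree, kin_geno, kin_file is not None])
    if chosen > 1:
        raise ValueError(
            "use at most one of --kinPedigree, --kinGeno, --kinFile")


def passes_kinship_filter(maf: float, missing_rate: float,
                          kin_maf: float = 0.05,
                          kin_miss: float = 0.05) -> bool:
    """Return True if a variant would be used for genotype-based kinship."""
    return maf > kin_maf and missing_rate < kin_miss


check_kinship_source(kin_geno=True)       # a single source is accepted
print(passes_kinship_filter(0.20, 0.01))  # True
print(passes_kinship_filter(0.01, 0.01))  # False: MAF too low
```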
- --xLabel takes a string that specifies how chromosome X is labeled for variants. The default is "X".
- --xStart and --xEnd specify the start and end of the non-pseudo-autosomal region on chromosome X. These options should be specified when --vcX is used.
- The default for --xStart is 2699520 and default for --xEnd is 154931044, according to NCBI genome build 37.
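To make the role of --xLabel, --xStart, and --xEnd concrete, the sketch below tests whether a variant falls in the non-pseudo-autosomal region using the defaults given above. The function name `in_non_par` is hypothetical, and treating both boundaries as inclusive is an assumption, since the reference does not say how RMW handles the end points.

```python
def in_non_par(chrom: str, pos: int,
               x_label: str = "X",
               x_start: int = 2699520,
               x_end: int = 154931044) -> bool:
    """Return True if (chrom, pos) lies in the non-pseudo-autosomal X region.

    Assumption: both boundaries are treated as inclusive.
    """
    return chrom == x_label and x_start <= pos <= x_end


print(in_non_par("X", 5_000_000))  # True
print(in_non_par("X", 1_000_000))  # False: before --xStart
print(in_non_par("1", 5_000_000))  # False: not chromosome X
```

If chromosome X is coded differently in your files, for example as "23", pass that label as `x_label`.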
Please refer to the following for the analysis of X-linked variants ANALYZING CHROMOSOME X.
See PhoneHome for more information on how PhoneHome works and what it does.
--noPhoneHome disables PhoneHome. PhoneHome is enabled by default based on the thinning parameter.
--phoneHomeThinning (0-100) adjusts the frequency of PhoneHome.
- By default, --phoneHomeThinning is set to 50, running 50% of the time.
- PhoneHome will only occur if the run's random number modulo 100 is less than the --phoneHomeThinning value.
- N/A if
- By default,
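The thinning rule described above (PhoneHome occurs only if the run's random number modulo 100 is less than the --phoneHomeThinning value) can be sketched as follows. This illustrates the stated rule only and is not the PhoneHome implementation; the function name `should_phone_home` and the way the random number is drawn are assumptions.

```python
import random
from typing import Optional


def should_phone_home(thinning: int = 50,
                      no_phone_home: bool = False,
                      rng: Optional[random.Random] = None) -> bool:
    """Decide whether a run would phone home under the stated rule."""
    if not 0 <= thinning <= 100:
        raise ValueError("--phoneHomeThinning must be between 0 and 100")
    if no_phone_home:
        # --noPhoneHome disables PhoneHome entirely.
        return False
    rng = rng or random.Random()
    run_number = rng.randrange(2**31)  # stand-in for the run's random number
    return run_number % 100 < thinning


# With the default of 50, roughly half of all runs phone home.
hits = sum(should_phone_home(rng=random.Random(seed)) for seed in range(1000))
print(hits)
```

A value of 0 never triggers PhoneHome and a value of 100 always does, because the remainder modulo 100 lies between 0 and 99.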