RAREMETALWORKER command reference
From Genome Analysis Wiki
List of Options
Options:
Input Files : --ped [], --dat [], --vcf [], --dosage, --noeof
Output Files : --prefix [], --LDwindow [1000000], --zip, --thin, --labelHits
VC Options : --vcX, --separateX
Trait Options : --makeResiduals, --inverseNormal, --traitName []
Model Options : --recessive, --dominant
Kinship Source : --kinPedigree, --kinGeno, --kinFile [], --kinxFile [], --kinSave
Kinship Options : --kinMaf [0.05], --kinMiss [0.05]
Chromosome X : --xLabel [X], --xStart [2699520], --xEnd [154931044], --maleLabel [1], --femaleLabel [2]
PhoneHome : --noPhoneHome, --phoneHomeThinning [100]
Input Files
--ped
- --ped takes the name of your MERLIN-format PED file.
--dat
- --dat takes the name of your MERLIN-format DAT file.
--vcf
- --vcf takes the name of your VCF file.
--dosage
- When --dosage is given on the command line, RAREMETALWORKER reads dosages from your VCF file.
- --dosage must be used with --vcf option.
- Description of dosage format in a VCF file can be found in dosage.
--noeof
- If your VCF file does not have the BGZF EOF marker, use the --noeof option so that RAREMETALWORKER skips the check for the BGZF EOF marker at the end of the file.
- Please see BGZF EOF for more details.
Output Files
--prefix
- --prefix takes a value of a string as the prefix of your output files.
- For a full list of output files generated by RAREMETALWORKER, please refer to output.
--LDwindow
- --LDwindow takes an integer value, in bases, as the size of the moving window.
- RAREMETALWORKER generates LD matrices between a current marker that it is working on and all markers within this window.
- The default size is 1 million bases.
- For more information about the LD matrix, please refer to LD matrix.
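As a rough illustration of the moving window, the sketch below selects the markers that would be paired with the current marker. It is not RAREMETALWORKER's implementation: the function name `markers_in_window` is hypothetical, and it assumes that positions are sorted on a single chromosome and that the window extends only downstream of the current marker, which the text above does not state.

```python
from bisect import bisect_right


def markers_in_window(positions, index, ld_window=1_000_000):
    """Return indices of markers after `index` that lie within `ld_window` bases.

    `positions` must be sorted in ascending order and come from one chromosome.
    Assumption: the window covers only markers downstream of the current one.
    """
    limit = positions[index] + ld_window
    # First index whose position is strictly greater than the window limit.
    end = bisect_right(positions, limit)
    return list(range(index + 1, end))
```

With the default window, a marker at position 100 would be paired with markers up to position 1,000,100.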
--zip
- When --zip is issued, RAREMETALWORKER automatically compresses the summary statistics and LD matrices it generates, using gzip.
--thin
- If --thin is issued, RAREMETALWORKER generates QQ plots and Manhattan plots with fewer points, which makes the PDF files smaller.
--labelHits
- If --labelHits is issued, RAREMETALWORKER automatically labels the loci that are above a threshold.
- The threshold is calculated using Bonferroni correction (0.05/N, where N is the total number of polymorphic markers).
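The Bonferroni threshold described above is simple arithmetic. The sketch below shows the calculation only; the function names are hypothetical and this is not RAREMETALWORKER code. It treats "above the threshold" as a p-value smaller than 0.05/N.

```python
def bonferroni_threshold(n_polymorphic_markers, alpha=0.05):
    """Return the p-value threshold alpha / N used to decide which hits to label."""
    if n_polymorphic_markers <= 0:
        raise ValueError("need at least one polymorphic marker")
    return alpha / n_polymorphic_markers


def hits_to_label(pvalues, alpha=0.05):
    """Return indices of markers whose p-value is below the Bonferroni threshold.

    N is taken to be the number of p-values supplied (one per polymorphic marker).
    """
    threshold = bonferroni_threshold(len(pvalues), alpha)
    return [i for i, p in enumerate(pvalues) if p < threshold]
```

For example, with one million polymorphic markers the threshold is 0.05 / 1,000,000, that is 5e-8.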
VC Options
--vcX
- The --vcX option has to be used with --kinPedigree (when pedigree kinship is used), --kinGeno (when the genomic relationship matrix is estimated), or --kinFile (when the GRM is read from a file).
- Using the --vcX option lets RAREMETALWORKER fit a linear mixed model to analyze chromosome X, using both autosomal kinship and chromosome X kinship.
--separateX
- --separateX option must be used with --vcX option.
- Using --separateX option requests RAREMETALWORKER to fit a linear mixed model using only chromosome X kinship for analyses of chromosome X markers.
Please refer to method and technical details for more explanation.
Trait Options
--makeResiduals
- If --makeResiduals is used, the trait is first adjusted for covariates, and linear models are then fit to the residuals.
--inverseNormal
- If --inverseNormal is used, but not with --makeResiduals, then trait values are inverse normalized before fitting linear models.
- If --inverseNormal and --makeResiduals are used together, then covariates are adjusted and inverse normalized residuals are used to fit linear models.
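To make the inverse normalization step concrete, the sketch below applies a rank-based inverse normal transform to a list of trait values (or residuals). It is an illustration under assumptions, not RAREMETALWORKER's code: the function name is hypothetical, and both the (rank - 0.5) / n offset and the averaging of tied ranks are common conventions chosen here because this page does not state which formula the program uses.

```python
from statistics import NormalDist


def inverse_normal_transform(values):
    """Map values to standard-normal quantiles based on their ranks.

    Assumptions: ties share their average rank, and rank r (1-based) of n
    values is mapped to the quantile (r - 0.5) / n.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the group of tied values starting at sorted index i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - 0.5) / n) for r in ranks]
```

When --makeResiduals is also used, the same transform would be applied to the covariate-adjusted residuals rather than to the raw trait values.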
--traitName
- --traitName takes the name of the trait that you want to analyze.
- If this option is not used, then all traits included in PED/DAT files are analyzed.
Output Files
- --prefix is optional.
- If --prefix is not specified, the output file names will be:
traitname.singlevar.score.txt traitname.singlevar.cov.txt
- Otherwise, the output file names are:
prefix.traitname.singlevar.score.txt prefix.traitname.singlevar.cov.txt
- --LDwindow specifies the length of the window around each variant within which the LD matrix is generated. The default is 1,000,000 bases (1 Mb).
- --zip gives users the option of writing compressed files (bgzip compressed) automatically for convenient sharing.
- --thin tells RMW to thin points when generating QQ plot and Manhattan plots, so the file size is smaller.
- --labelHits tells RMW to label the hits that pass the p-value threshold 0.05/(# of variants tested) with gene names, based on human genome build 19.
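The naming rule given for --prefix above can be written out as a small helper. This is only a sketch of the documented pattern; the function name `output_file_names` is hypothetical, and it covers just the two single-variant files named on this page.

```python
def output_file_names(trait_name, prefix=None):
    """Return the score and covariance file names for one trait.

    Without a prefix the names start with the trait name; with a prefix
    they start with "prefix.traitname".
    """
    stem = f"{prefix}.{trait_name}" if prefix else trait_name
    return [f"{stem}.singlevar.score.txt", f"{stem}.singlevar.cov.txt"]
```

For instance, `output_file_names("LDL", prefix="study1")` gives `study1.LDL.singlevar.score.txt` and `study1.LDL.singlevar.cov.txt`.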
VC Options
- When --vcShared and --vcX are specified, RMW knows that you want to fit shared environment and/or chromosome X variance component together with genetic component and non-shared environment.
- When --makeResiduals is specified, RMW understands covariates should be read from PED/DAT file. Covariates are modeled as fixed effects.
Trait Options
- --makeResiduals tells RMW to adjust the covariates and analyze residuals instead of the original phenotypes. If either --kinGeno or --kinPedigree option is used, then a variance component model will be fit based on residuals. If the --inverseNormal option is also used, then the residuals will be quantile normalized before fitting variance component model.
- --traitName is intended for situations where your PED and DAT files contain many traits but you are interested in only one or a few of them. It accepts a single trait name, several trait names separated with "/", or the name of a file ending in .txt that lists one trait of interest per line. For example:
--traitName LDL
--traitName LDL/HDL/TG
--traitName traitsOfInterest.txt
- If --traitName is not used, all traits in PED/DAT file will be analyzed.
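The three accepted forms of the --traitName value can be told apart with a few lines of code. The sketch below mirrors the rules described above; it is not taken from RAREMETALWORKER, the function name is hypothetical, and skipping blank lines in the trait file is an assumption.

```python
from pathlib import Path


def parse_trait_names(value):
    """Turn a --traitName value into a list of trait names.

    A value ending in ".txt" is read as a file with one trait per line;
    anything else is split on "/".
    """
    if value.endswith(".txt"):
        lines = Path(value).read_text().splitlines()
        # Assumption: blank lines in the file are ignored.
        return [line.strip() for line in lines if line.strip()]
    return [name for name in value.split("/") if name]
```

Here `parse_trait_names("LDL/HDL/TG")` yields the three names LDL, HDL, and TG.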
Model Options
- The additive model is used in RMW by default.
- --recessive requests additional association results (p-value, effect size, and standard error) generated using a recessive model. If a VCF file is used, the non-reference allele is considered the recessive allele. If PED/DAT files are used for genotypes, the minor allele is considered the recessive allele.
- --dominant requests additional association results (p-value, effect size, and standard error) generated using a dominant model. If a VCF file is used, the non-reference allele is considered the dominant allele. If PED/DAT files are used for genotypes, the minor allele is considered the dominant allele.
- --recessive and --dominant options can be used together.
- Recessive and dominant results are stored in separate files.
Kinship Source
- --kinPedigree allows RMW to generate kinship matrix from pedigree, when pedigree information is available.
- --kinGeno informs RMW to generate kinship matrix from all available variants that pass the criteria, specified in --kinMaf and --kinMiss options. The default will take variants with MAF>0.05 and genotype missing rate <0.05.
- --kinGeno option can NOT be used with --kinPedigree or --kinFile option. Only one of three options or none of them can be used in the same run.
- --kinFile lets RMW read in a kinship matrix from a file. The first row of the kinship file has to contain the sample IDs included in the kinship file. If a sample of interest is not included in the kinship file, a fatal error occurs and the program terminates. A sample of interest is a sample that is phenotyped and, when --makeResiduals is specified, has all covariates measured.
- --kinSave allows you to save the kinship matrix.
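Several options in this reference depend on, or exclude, one another. The sketch below gathers the constraints stated on this page into one check. It is a hypothetical helper for validating a command line before launching a run, not part of RAREMETALWORKER, and it covers only the rules written here.

```python
def check_option_constraints(options):
    """Return a list of problems for a set of flags such as {"--vcf", "--dosage"}."""
    problems = []
    kin_sources = {"--kinPedigree", "--kinGeno", "--kinFile"} & set(options)
    # At most one kinship source may be used in a run.
    if len(kin_sources) > 1:
        problems.append("use only one of: " + ", ".join(sorted(kin_sources)))
    # --vcX needs a kinship source.
    if "--vcX" in options and not kin_sources:
        problems.append("--vcX requires --kinPedigree, --kinGeno, or --kinFile")
    # --separateX only makes sense together with --vcX.
    if "--separateX" in options and "--vcX" not in options:
        problems.append("--separateX requires --vcX")
    # Dosages are read from the VCF file.
    if "--dosage" in options and "--vcf" not in options:
        problems.append("--dosage requires --vcf")
    return problems
```

An empty list means none of the documented constraints is violated.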
Kinship Options
- --kinMiss and --kinMaf should be used together with --kinGeno.
- --kinMiss specifies the maximum genotype missing rate when calculating kinship from genotypes. The default is 0.05.
- --kinMaf specifies the minimum minor allele frequency used when calculating kinship from genotypes. The default is 0.05.
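The two filters can be illustrated for a single variant as follows. This is a sketch under assumptions rather than RMW's own code: the function name is hypothetical, genotypes are assumed to be diploid alternate-allele counts (0, 1, or 2) with `None` marking a missing call, and the strict inequalities follow the "MAF>0.05 and genotype missing rate <0.05" wording used for --kinGeno.

```python
def passes_kinship_filters(genotypes, kin_maf=0.05, kin_miss=0.05):
    """Decide whether a variant would be used for kinship estimation.

    `genotypes` holds alternate-allele counts (0/1/2) or None for missing calls.
    """
    n_total = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if n_total == 0 or not called:
        return False
    missing_rate = 1 - len(called) / n_total
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    return maf > kin_maf and missing_rate < kin_miss
```

A variant with 10% missing calls, for example, is rejected under the default --kinMiss of 0.05 regardless of its allele frequency.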
Chromosome X
- --xLabel should have a value of a string which specifies how variants on chromosome X are coded. The default is "X".
- --xStart and --xEnd specify the start and end of the non-pseudo-autosomal region on chromosome X. These options should be specified when --vcX is used.
- The default for --xStart is 2699520 and default for --xEnd is 154931044, according to NCBI genome build 37.
For the analysis of X-linked variants, please refer to ANALYZING CHROMOSOME X.
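The role of --xLabel, --xStart, and --xEnd can be illustrated with a small membership test. The function name is hypothetical and the inclusive treatment of both boundaries is an assumption; the defaults are the build 37 values listed above.

```python
def in_non_par_region(chrom, pos, x_label="X", x_start=2699520, x_end=154931044):
    """Return True if a variant lies in the non-pseudo-autosomal part of chromosome X.

    Assumption: both x_start and x_end are treated as inclusive boundaries.
    """
    return chrom == x_label and x_start <= pos <= x_end
```

If chromosome X is coded as "23" in your data, pass `x_label="23"`, which corresponds to setting --xLabel 23.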
PhoneHome Parameters
See PhoneHome for more information on how PhoneHome works and what it does.
--noPhoneHome
- Disables PhoneHome. PhoneHome is enabled by default, based on the thinning parameter.
--phoneHomeThinning
- (0-100) Adjusts the frequency of PhoneHome.
- By default, --phoneHomeThinning is set to 50, running 50% of the time.
- PhoneHome will only occur if the run's random number modulo 100 is less than the --phoneHomeThinning value.
- N/A if --noPhoneHome is set.
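The thinning rule described above amounts to a single comparison. The sketch below restates it; the function name is hypothetical, this is not the PhoneHome source code, and the default of 50 is taken from the description in this section.

```python
def should_phone_home(random_number, thinning=50, no_phone_home=False):
    """Apply the documented rule: phone home if random_number % 100 < thinning.

    `thinning` must lie in the range 0-100; --noPhoneHome overrides it.
    """
    if not 0 <= thinning <= 100:
        raise ValueError("thinning must be between 0 and 100")
    if no_phone_home:
        return False
    return random_number % 100 < thinning
```

A thinning value of 0 therefore never triggers PhoneHome, and a value of 100 always does.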