Line 83: |
Line 83: |
| ===INTERFACE=== | | ===INTERFACE=== |
| | | |
− | RAREMETALWORKER is a command line tool. Once you execute, you will see a full list of options printed on the screen. For detailed description of command options, please go to [[RAREMETALWORKER_command_reference | '''COMMAND REFERENCE''']] | + | RAREMETALWORKER is a command line tool. Once you execute, you will see a full list of options printed on the screen. |
| + | |
| + | For a detailed description of the command options, see [[RAREMETALWORKER_command_reference | '''COMMAND REFERENCE''']] |
| + | |
| + | |
| RAREMETALWORKER 0.4.8 -- A Forerunner of RareMetal | | RAREMETALWORKER 0.4.8 -- A Forerunner of RareMetal |
| (c) 2012-2014 Shuang Feng, Dajiang Liu, Goncalo Abecasis | | (c) 2012-2014 Shuang Feng, Dajiang Liu, Goncalo Abecasis |
Line 102: |
Line 106: |
| --maleLabel [1], --femaleLabel [2] | | --maleLabel [1], --femaleLabel [2] |
| PhoneHome : --noPhoneHome, --phoneHomeThinning [100] | | PhoneHome : --noPhoneHome, --phoneHomeThinning [100] |
| + | |
| | | |
| ===INPUT FILE FORMAT=== | | ===INPUT FILE FORMAT=== |
| | | |
− |
| |
− | ===OUTPUT FILE FORMAT===
| |
− |
| |
− |
| |
− | === Input Files ===
| |
| RMW needs the following files as input: PED and DAT files in Merlin format, '''AND/OR''' a VCF file. When genotypes are stored in the PED and DAT files, the VCF file is not needed. However, even if genotypes are saved in a VCF file, PED and DAT files are still needed to carry covariate and trait information. | | RMW needs the following files as input: PED and DAT files in Merlin format, '''AND/OR''' a VCF file. When genotypes are stored in the PED and DAT files, the VCF file is not needed. However, even if genotypes are saved in a VCF file, PED and DAT files are still needed to carry covariate and trait information. |
| | | |
Line 144: |
Line 144: |
| | | |
| | | |
− | ==== Input Files ====
| |
| * When genotypes are saved in a VCF file, PED and DAT files are used for specifying pedigree structure, covariate and trait information. An example command line might look like this: | | * When genotypes are saved in a VCF file, PED and DAT files are used for specifying pedigree structure, covariate and trait information. An example command line might look like this: |
| --ped input.ped --dat input.dat --vcf input.vcf.gz | | --ped input.ped --dat input.dat --vcf input.vcf.gz |
Line 157: |
Line 156: |
| * --noeof allows using a VCF file without BGZF EOF markers. This option is rarely needed. If your run is terminated with the error message: "", then you might want to check out this option. | | * --noeof allows using a VCF file without BGZF EOF markers. This option is rarely needed. If your run is terminated with the error message: "", then you might want to check out this option. |
| | | |
− | ==== Output Files ==== | + | ===OUTPUT FILE FORMAT=== |
− | * --prefix is optional.
| |
− | * If --prefix is not specified, the output file names will be:
| |
− | traitname.singlevar.score.txt
| |
− | traitname.singlevar.cov.txt
| |
− | * Otherwise, the output file names are:
| |
− | prefix.traitname.singlevar.score.txt
| |
− | prefix.traitname.singlevar.cov.txt
| |
− | * --LDwindow specifies the length of the window that LD Matrix should be generated upon each variant. The default is 1MB.
| |
− | * --zip gives users the option of writing compressed files (bgzip compressed) automatically for convenient sharing.
| |
− | * --thin tells RMW to thin points when generating QQ plot and Manhattan plots, so the file size is smaller.
| |
− | * --labelHits tells RMW to to label the hits using pvalue threshold 0.05/(#of variants tested) with gene name, based on human genome build 19.
| |
| | | |
− | ==== VC Options ====
| |
− | * When --vcShared and --vcX are specified, RMW knows that you want to fit shared environment and/or chromosome X variance component together with genetic component and non-shared environment.
| |
− | * When --makeResiduals is specified, RMW understands covariates should be read from PED/DAT file. Covariates are modeled as fixed effects.
| |
| | | |
− | ==== Trait Options ====
| |
− | * --makeResiduals tells RMW to adjust the covariates and analyze residuals instead of the original phenotypes. If either --kinGeno or --kinPedigree option is used, then a variance component model will be fit based on residuals. If the --inverseNormal option is also used, then the residuals will be quantile normalized before fitting variance component model.
| |
− | * --traitName is created for situations when you have many traits saved in your PED and DAT file, but you are interested in one or a few of them. It can read a file ending with .txt with each trait of interest in a separate line, or trait names separated with "/". An example to handle one trait or multiple traits is in the following:
| |
− | --traitName LDL
| |
− | --traitName LDL/HDL/TG
| |
− | --traitName traitsOfInterest.txt
| |
− | * If --traitName is not used, all traits in PED/DAT file will be analyzed.
| |
− |
| |
− | ==== Model Options ====
| |
− | * additive model is used in RMW as default.
| |
− | * --recessive allows additional association results (pvalue, effect size, and standard error) generated using recessive model. If VCF file is used, then non-reference allele is considered the recessive allele. If PED/DAT files are used for genotype, then minor allele is considered the recessive allele.
| |
− | * --dominant allows additional association results (pvalue, effect size, and standard error) generated using dominant model. If VCF file is used, then non-reference allele is considered the dominant allele. If PED/DAT files are used for genotype, then minor allele is considered the dominant allele.
| |
− | * --recessive and --dominant options can be used together.
| |
− | * Recessive and dominant results are stored in separate files.
| |
− |
| |
− | ==== Kinship Source ====
| |
− | * --kinPedigree allows RMW to generate kinship matrix from pedigree, when pedigree information is available.
| |
− | * --kinGeno informs RMW to generate kinship matrix from all available variants that pass the criteria, specified in --kinMaf and --kinMiss options. The default will take variants with MAF>0.05 and genotype missing rate <0.05.
| |
− | * --kinGeno option can NOT be used with --kinPedigree or --kinFile option. Only one of three options or none of them can be used in the same run.
| |
− | * --kinFile let RMW read in a kinship matrix from a file. The first row of the kinship file has to be the sample IDs included in the kinship file. If a sample of interest is not included in the kinship file, fatal error will occur and the program will be terminated. A sample of interest is a sample that is phenotyped and has all covariates measured when --makeResiduals is specified.
| |
− | * --kinSave allows you to save the kinship matrix.
| |
− |
| |
− | ==== Kinship Options ====
| |
− | * --kinMiss and --kinMaf should be used with --kinGeno together.
| |
− | * --kinMiss specifies the maximum genotype missing rate when calculating kinship from genotypes. The default is 0.05.
| |
− | * --kinMaf specifies the minimum minor allele frequency used when calculating kinship from genotypes. The default is 0.05.
| |
− |
| |
− | ==== Chromosome X ====
| |
− | * --xLabel should have a value of a string which specifies how variants on chromosome X are coded. The default is "X".
| |
− | * --xStart and --xEnd specifies the start and end of non-pseudo-autosomal regions on chromosome X. These options should be specified when --vcX is used.
| |
− | * The default for --xStart is 2699520 and default for --xEnd is 154931044, according to NCBI genome build 37.
| |
− |
| |
− | Please refer to the following for the analysis of X-linked variants [[RAREMETALWORKER_X|'''ANALYZING CHROMOSOME X''']].
| |
− |
| |
− | {{PhoneHomeParameters|hdr=====|bullet=1}}
| |
− |
| |
− | === Handling Unrelated Individuals ===
| |
− | * To let Rare-Metal-Worker handle unrelated individuals, we just have to code the individuals as unrelated in PED file, or each individual belongs to a unique family. Then Rare-Metal-Worker will take care of the rest.
| |
− | * However, when --kinGenotype is also used, Rare-Metal-Worker will consider them as related and generate kinship matrix from genotypes.
| |
− | * An example is shown as following (header is included for illustration purpose, not in real PED file):
| |
| | | |
− | famid pid fid mid sex age trait
| |
− | 1 1.1 0 0 1 10 -0.3
| |
− | 2 2.1 0 0 1 56 0.0
| |
− | 3 3.1 0 0 2 31 0.4
| |
− | 4 4.1 0 0 2 23 0.008
| |
− | 5 5.1 0 0 2 34 2.35
| |
| | | |
| == Output == | | == Output == |
Line 306: |
Line 245: |
| xEnd [154931044] | | xEnd [154931044] |
| | | |
| + | |
| + | === Handling Unrelated Individuals === |
| + | * To have Rare-Metal-Worker handle unrelated individuals, code the individuals as unrelated in the PED file, or place each individual in a unique family. Rare-Metal-Worker will then take care of the rest. |
| + | * However, when --kinGeno is also used, Rare-Metal-Worker will consider them as related and generate the kinship matrix from genotypes. |
| + | * An example follows (the header is included for illustration only and does not appear in a real PED file): |
| + | |
| + | famid pid fid mid sex age trait |
| + | 1 1.1 0 0 1 10 -0.3 |
| + | 2 2.1 0 0 1 56 0.0 |
| + | 3 3.1 0 0 2 31 0.4 |
| + | 4 4.1 0 0 2 23 0.008 |
| + | 5 5.1 0 0 2 34 2.35 |
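
As a rough illustration of the coding described above, the following Python sketch builds PED rows that place each individual in its own family, with both parent columns set to 0. The function name <code>unrelated_ped_rows</code> and the <code>N.1</code> person ID scheme are assumptions made for this example (the latter simply mirrors the sample rows above); neither is part of Rare-Metal-Worker.

```python
def unrelated_ped_rows(samples):
    """Return PED rows that give every sample its own family ID.

    `samples` is a list of (sex, age, trait) tuples; this column layout
    mirrors the illustrative example above and is an assumption.
    """
    rows = []
    for index, (sex, age, trait) in enumerate(samples, start=1):
        famid = str(index)      # unique family ID per individual
        pid = f"{index}.1"      # hypothetical person ID scheme
        # fid and mid are 0, so no parents are listed for anyone.
        fields = [famid, pid, "0", "0", str(sex), str(age), str(trait)]
        rows.append(" ".join(fields))
    return rows


samples = [(1, 10, -0.3), (1, 56, 0.0), (2, 31, 0.4), (2, 23, 0.008), (2, 34, 2.35)]
for row in unrelated_ped_rows(samples):
    print(row)
```

Because the family IDs are generated from the row index, no two individuals can end up in the same family, which is the property the PED coding relies on.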
| == Example Command Lines == | | == Example Command Lines == |
| | | |