Changes

From Genome Analysis Wiki
204 bytes added ,  19:35, 11 July 2017
Line 88:
The command options for DosageConvertor are explained below.  
 
{| class="wikitable"  style="text-align:center"  border="1" cellpadding="2"
|- bgcolor="white"
! Option
! Description
|-
| <code>--vcfDose</code>
|
mandatory parameter indicating the minimac3/4 VCF dosage file to be converted
|-
| <code>--info</code>
|
the info file generated by minimac3/4 at the same time as the VCF dosage file

(This parameter is optional, but if NO info file is provided, the output MaCH info file will have missing columns.)
|-
| <code>--prefix</code>
|
sets the prefix for output files (default value: <code>Converted.Dosage</code>)
|-
| <code>--type</code>
|
sets the output file format (available options: <code>plink</code> (default) or <code>mach</code>)
|-
| <code>--tag</code>
|
indicates whether to import imputed values from dosages (<code>DS</code>: default), genotype probabilities (<code>GP</code>), or hard genotype calls (<code>GT</code>) from the input VCF file
|-
| <code>--format</code>
|
sets the format of the converted output file:

*If <code>--type plink</code> is used, <code>--format</code> can take values 1, 2, or 3. These values correspond to the three formats available for PLINK dosage files (details given [http://www.cog-genomics.org/plink/1.9/assoc#dosage here])
*If <code>--type mach</code> is used, <code>--format</code> can take values 1 or 2. Details are given in [[#Convert to MaCH Files| Convert to MaCH Files]]
|-
| <code>--buffer</code>
|
sets the number of markers to import at a time (MaCH format only) (default value: <code>10000</code>)
|-
| <code>--idDelimiter</code>
|
indicates the delimiter character used to split '''VCF Sample ID''' into '''FID''' and '''IID''' for PLINK format
|-
| <code>--allDiploid</code>
|
indicates whether to assume all samples are diploid (necessary for chromosome X). If this option is active, the output PLINK <code>.fam</code> will NOT contain any sex information
|-
| <code>--sexFile</code>
|
indicates a file containing sample sex information, which requires two columns: the first column contains the sample names as found in the VCF file, and the second column contains either M (for males) or F (for females)
|-
| <code>--TrimAlleles</code>
|
indicates whether to trim alleles and variant IDs to 100 characters. Since PLINK does not allow variant IDs longer than 16,000 characters, this option can be used if variant names are too long
|}
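
As a minimal sketch (not part of DosageConvertor itself), the option rules listed above can be checked in Python before a run is launched. The helper names <code>build_command</code> and <code>format_sex_file</code>, the tab separator in the sex file, the placeholder file names, and the assumption that the executable is reachable as <code>DosageConvertor</code> are all illustrative choices rather than documented behaviour. No default is assumed for <code>--format</code>, so the sketch requires it explicitly.

```python
# Allowed --format values for each --type, as listed in the option table.
ALLOWED_FORMATS = {"plink": (1, 2, 3), "mach": (1, 2)}
# Allowed --tag values, as listed in the option table.
ALLOWED_TAGS = ("DS", "GP", "GT")


def build_command(vcf_dose, fmt, out_type="plink", tag="DS",
                  prefix="Converted.Dosage", info=None,
                  executable="DosageConvertor"):
    """Build an argument list for one run (hypothetical helper).

    `executable` is an assumed name/path for the compiled binary.
    """
    if out_type not in ALLOWED_FORMATS:
        raise ValueError(f"--type must be one of {sorted(ALLOWED_FORMATS)}")
    if fmt not in ALLOWED_FORMATS[out_type]:
        raise ValueError(f"--format {fmt} is not valid with --type {out_type}")
    if tag not in ALLOWED_TAGS:
        raise ValueError(f"--tag must be one of {ALLOWED_TAGS}")

    args = [executable, "--vcfDose", vcf_dose]
    if info is not None:
        # --info is optional; without it the MaCH info output has missing columns.
        args += ["--info", info]
    args += ["--prefix", prefix, "--type", out_type,
             "--tag", tag, "--format", str(fmt)]
    return args


def format_sex_file(sexes):
    """Return --sexFile text: sample name, then M or F, one sample per line.

    The tab separator is an assumption; the table only says "two columns".
    """
    lines = []
    for sample, sex in sexes.items():
        if sex not in ("M", "F"):
            raise ValueError(f"sex for {sample} must be 'M' or 'F'")
        lines.append(f"{sample}\t{sex}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # File and sample names below are placeholders.
    print(" ".join(build_command("test.dose.vcf.gz", fmt=1, info="test.info")))
    print(format_sex_file({"SAMPLE1": "M", "SAMPLE2": "F"}), end="")
```

The returned list can be passed to <code>subprocess.run(args, check=True)</code>. Building a list rather than a single string avoids shell quoting problems when file names contain spaces.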
    
= Contact =
 
    
For queries or bug reports, please contact [mailto:sayantan@umich.edu Sayantan Das].