Changes

378 bytes removed, 18:54, 11 July 2017
Line 93: Line 93:  
*<code>--type</code> denotes the output file format (available handles: <code>plink</code> (default) and <code>mach</code>).  
 
 
*<code>--tag</code> decides whether to import imputed values from dosage (<code>DS</code>: default), or genotype probabilities (<code>GP</code>), or hard call genotypes (<code>GT</code>) of the input VCF file.
 
*<code>--format</code> decides the format of the output file. If <code>--type plink</code> is used, <code>--format</code> can take values 1, 2 and 3, which correspond to the three formats available for PLINK dosage files (details given [http://www.cog-genomics.org/plink/1.9/assoc#dosage here]). If <code>--type mach</code> is used, <code>--format</code> can only take values 1 and 2. Details are given in [[#Convert to MaCH Files]].
+
*<code>--format</code> decides the format of the output file. If <code>--type plink</code> is used, <code>--format</code> can take values 1, 2 and 3, which correspond to the three formats available for PLINK dosage files (details given [http://www.cog-genomics.org/plink/1.9/assoc#dosage here]). If <code>--type mach</code> is used, <code>--format</code> can only take values 1 and 2. Details are given in [[#Convert to MaCH Files| Convert to MaCH Files]].
 
   
*<code>--buffer</code> denotes the number of markers to import at a time (valid only for MaCH format) (default value <code>10000</code>).  
 
 
*<code>--idDelimiter</code> denotes the delimiter used to split the VCF sample ID into FID and IID for PLINK format (default value <code>_</code>).
 
 
+
*<code>--allDiploid</code> denotes whether to assume that all samples are diploid. If this handle is on, the output PLINK <code>.fam</code> file will NOT contain any sex information.
+
*<code>--sexFile</code> denotes the sex file, which should have two columns: the first column has the sample names as found in the VCF file, and the second column has M (for males) or F (for females).
+
*<code>--TrimAlleles</code> denotes whether to trim the length of alleles and variant IDs, since PLINK does NOT allow very long character sequences.
−
 Usage: ./DosageConvertor  --vcfDose      TestDataImputedVCF.dose.vcf.gz
−
                           --info        TestDataImputedVCF.info
−
                           --prefix      OutputFilePrefix
−
                           --type        plink OR mach  // depending on output format
−
                           --format      DS or GP        // based on if you want to output
−
                                                           // dosage (DS) or genotype prob (GP)
−
                           --buffer      10000          // Number of Markers to import and
−
                                                           // print at a time (valid only for
−
                                                           // MaCH format)
−
                           --idDelimiter  _              // Delimiter to Split VCF Sample ID into
−
                                                           // FID and IID for PLINK format
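
The option rules listed above can be sketched in a small Python helper. This is only an illustration and not part of DosageConvertor: the function names (<code>build_command</code>, <code>split_sample_id</code>, <code>sex_file_lines</code>) are hypothetical, the sketch assumes that <code>--vcfDose</code> and <code>--prefix</code> are still accepted as in the earlier usage text, that the sample ID is split at the first occurrence of the delimiter, and that the sex file is tab-separated without a header. None of these assumptions is confirmed by this page.

```python
# Allowed --format values for each --type, as stated in the option list.
ALLOWED_FORMATS = {"plink": {1, 2, 3}, "mach": {1, 2}}
# Allowed --tag values: dosage, genotype probabilities, hard-call genotypes.
ALLOWED_TAGS = {"DS", "GP", "GT"}


def build_command(vcf_dose, prefix, out_type, fmt, tag="DS",
                  id_delimiter=None, sex_file=None):
    """Check option values and return an argument list (nothing is executed)."""
    if out_type not in ALLOWED_FORMATS:
        raise ValueError(f"--type must be plink or mach, got {out_type!r}")
    if fmt not in ALLOWED_FORMATS[out_type]:
        raise ValueError(f"--format {fmt} is not valid with --type {out_type}")
    if tag not in ALLOWED_TAGS:
        raise ValueError(f"--tag must be DS, GP or GT, got {tag!r}")
    # Assumption: --vcfDose and --prefix are accepted as in the older usage text.
    argv = ["./DosageConvertor", "--vcfDose", vcf_dose, "--prefix", prefix,
            "--type", out_type, "--format", str(fmt), "--tag", tag]
    if id_delimiter is not None:
        argv += ["--idDelimiter", id_delimiter]
    if sex_file is not None:
        argv += ["--sexFile", sex_file]
    return argv


def split_sample_id(sample_id, delimiter="_"):
    """Illustrate splitting a VCF sample ID into (FID, IID).

    Assumption: the split happens at the first delimiter, and an ID without
    the delimiter is used for both fields. The tool itself may differ.
    """
    fid, sep, iid = sample_id.partition(delimiter)
    if not sep:
        return sample_id, sample_id
    return fid, iid


def sex_file_lines(sexes):
    """Return two-column lines (sample name, M or F) for a sex file.

    Assumption: columns are tab-separated and there is no header line.
    """
    lines = []
    for sample, sex in sexes.items():
        if sex not in ("M", "F"):
            raise ValueError(f"sex for {sample!r} must be M or F, got {sex!r}")
        lines.append(f"{sample}\t{sex}")
    return lines
```

For example, <code>build_command("TestDataImputedVCF.dose.vcf.gz", "OutputFilePrefix", "mach", 3)</code> raises a <code>ValueError</code>, because <code>--format 3</code> is only described for <code>--type plink</code>.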
      
= Contact =
 
    
For queries or bug reports, please contact [mailto:sayantan@umich.edu Sayantan Das].
 