Difference between revisions of "Minimac4 - Full List of Options"

From Genome Analysis Wiki
 
(5 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
= Introduction =
 
= Introduction =
  
'''Minimac4 ''' is a latest version in the series of genotype imputation software - preceded by [[Minimac3|Minimac3]] (2015), [[Minimac2|Minimac2]] (2014), [[Minimac|minimac]] (2012) and [[MaCH|MaCH]] (2010). '''Minimac4''' is a lower memory and more computationally efficient implementation of the original algorithms with negligible fall in imputation quality.
+
'''[[Minimac4|Minimac4]]''' is a latest version in the series of genotype imputation software - preceded by [[Minimac3|Minimac3]] (2015), [[Minimac2|Minimac2]] (2014), [[Minimac|minimac]] (2012) and [[MaCH|MaCH]] (2010). '''Minimac4''' is a lower memory and more computationally efficient implementation of the original algorithms with negligible fall in imputation quality.
  
 
This wiki page gives users '''a full list of all the available options on Minimac'''.
 
This wiki page gives users '''a full list of all the available options on Minimac'''.
Line 7: Line 7:
 
= Full List of Options =
 
= Full List of Options =
  
The following table gives a brief description of all the parameters of '''Minimac4'''. New handles added in Minimac4 are highlighted in bold.
+
The following table gives a brief description of all the parameters of Minimac4. '''New handles added in Minimac4 are highlighted in bold'''.
  
 
Users should see the wiki-page on [[Minimac4 Usage | Minimac4 Usage and Documentation]] and [[Minimac4 Imputation Cookbook]] for further help on how to use these options.
 
Users should see the wiki-page on [[Minimac4 Usage | Minimac4 Usage and Documentation]] and [[Minimac4 Imputation Cookbook]] for further help on how to use these options.
Line 35: Line 35:
 
| Prefix for all output files generated. By default: <font face=Courier>[Minimac3.Output]</font>
 
| Prefix for all output files generated. By default: <font face=Courier>[Minimac3.Output]</font>
 
|-   
 
|-   
| <font face=Courier>--<s>updateModel</s></font>
+
| <font face=Courier>'''--<s>updateModel</s>'''</font>
 
| This parameter has been disabled in Minimac4
 
| This parameter has been disabled in Minimac4
 
|-
 
|-
Line 41: Line 41:
 
| If ON, output files will be NOT bgzipped.  
 
| If ON, output files will be NOT bgzipped.  
 
|-
 
|-
| <font face=Courier>--vcfBuffer 200</font>
+
| <font face=Courier>--'''vcfBuffer 200'''</font>
 
| This is number of samples to be stored in the memory before writing a VCF file piece
 
| This is number of samples to be stored in the memory before writing a VCF file piece
 
|-
 
|-
 
| <font face=Courier>--vcfOutput</font>
 
| <font face=Courier>--vcfOutput</font>
| If ON, imputed data will NOT be output as VCF output file [Default: ON].
+
| If ON, imputed data will NOT be output as VCF output file [Default: ON]  
 
|-
 
|-
 
| <font face=Courier>--doseOutput</font>
 
| <font face=Courier>--doseOutput</font>
| If ON, imputed data will be output as dosage file as well [Default: OFF].
+
| If ON, imputed data will be output as dosage file as well [Default: OFF]
 
|-
 
|-
| <font face=Courier>--<s>hapOutput</s></font>
+
| <font face=Courier>'''--<s>hapOutput</s>'''</font>
 
| This parameter has been disabled in Minimac4
 
| This parameter has been disabled in Minimac4
 
|-
 
|-
 
| <font face=Courier>--format</font>
 
| <font face=Courier>--format</font>
| Specifies which fields to output for the FORMAT field in output VCF file. Available handles: <font face=Courier>GT,DS,GP,HDS </font>. <font face=Courier>GT</font> stands for Genotype, <font face=Courier>DS</font> stands for genotype dosage, <font face=Courier>HDS</font> stands for haplotype dosage and <font face=Courier>GP</font> stands for genotype probabilities [Default: <font face=Courier>GT,DS</font>]
+
| Specifies which fields to output for the FORMAT field in output VCF file. Available handles: <font face=Courier>GT,DS,GP,HDS</font>. <font face=Courier>GT</font> stands for Genotype, <font face=Courier>DS</font> stands for genotype dosage, <font face=Courier>HDS</font> stands for haplotype dosage and <font face=Courier>GP</font> stands for genotype probabilities [Default: <font face=Courier>GT,DS</font>]
 
|-
 
|-
 
| <font face=Courier>--allTypedSites</font>
 
| <font face=Courier>--allTypedSites</font>
| Also Includes variants that were genotyped but NOT in the reference panel in the output files (and imputes any missing data in such variants to the major allele frequency).
+
| Also Includes variants that were genotyped but NOT in the reference panel in the output files (and imputes any missing data in such variants to the major allele frequency). [Default: OFF]
 +
|-
 +
| <font face=Courier>'''--meta'''</font>
 +
| If this handle is ON, Minimac4 also outputs some diagnostic measures required for meta-imputation by MetaMinimac. Please turn this handle ON if you plan to use MetaMinimac later for meta-imputation of dosages from Minimac4. [Default: OFF]
 +
|-
 +
| <font face=Courier>'''--memUsage'''</font>
 +
| If this handle is ON, Minimac4 will NOT run the imputation, but instead report an estimated memory usage summary. It will also report some minor instructions on how to change the memory usage by tweaking parameters in the command line. This summary might enable users to get an idea of memory consumption and modify it, if need be, before starting the imputation experiment.  [Default: OFF]
 
|-
 
|-
 
| <font face=Courier>--chr 22</font>
 
| <font face=Courier>--chr 22</font>
Line 68: Line 74:
 
| End position for imputation by chunking. Would not work without <font face=Courier>--chr</font> option.
 
| End position for imputation by chunking. Would not work without <font face=Courier>--chr</font> option.
 
|-
 
|-
| <font face=Courier>--window 5000</font>
+
| <font face=Courier>--window 5000</font>
 
| Length of buffer region (in bp units) on either side of <font face=Courier>--start</font> and <font face=Courier>--end</font>. By default = 500000 (if chunking is done) and = 0 (if no chunking is being done).
 
| Length of buffer region (in bp units) on either side of <font face=Courier>--start</font> and <font face=Courier>--end</font>. By default = 500000 (if chunking is done) and = 0 (if no chunking is being done).
 
|-  
 
|-  
 +
| <font face=Courier>'''--ChunkLengthMb 20.00'''</font>
 +
| Minimac4 runs imputation on automated chunks. This parameters specifies the length of each chunk in Mbp.
 +
|-
 +
| <font face=Courier>'''--ChunkOverlapMb 3.00'''</font>
 +
|  This parameters specifies the length of the buffer region to be analyzed on each side of the chunk (in Mbp). Thus, if a user inputs <font face=Courier>--ChunkLengthMb 20.00</font> and <font face=Courier>--ChunkOverlapMb 3.00</font>, Minimac4 would analyze 26 Mbp chunks at a time.
 +
|-
 
| <font face=Courier>--rec</font>
 
| <font face=Courier>--rec</font>
 
| Recombination File from previous run of Minimac/Minimac3. (<font face=Courier>--err</font> parameter must also be provided, if using this handle)
 
| Recombination File from previous run of Minimac/Minimac3. (<font face=Courier>--err</font> parameter must also be provided, if using this handle)
Line 82: Line 94:
 
| <font face=Courier>--states 200</font>
 
| <font face=Courier>--states 200</font>
 
| Maximum number of reference (or target) haplotypes to be examined during parameter optimization. By default = 200.
 
| Maximum number of reference (or target) haplotypes to be examined during parameter optimization. By default = 200.
 +
|-
 +
| <font face=Courier>'''--probThreshold 0.01'''</font>
 +
| This parameter specifies the minimum posterior probability for a reference haplotype group to be included. If for some reason, the user believes that Minimac4 dropped the imputation accuracy significantly (compared to Minimac4), they should decrease the value to 0.0 (to be on the safe side). In general, reducing this value is not recommended as it will increase the compute time with no gain in accuracy
 
|-  
 
|-  
 
| <font face=Courier>--help</font>
 
| <font face=Courier>--help</font>
| A short help on options.
+
| A short help on options
 
|-  
 
|-  
| <font face=Courier>--lowMemory</font>
+
| <font face=Courier>'''--<s>lowMemory'''</s></font>
| If ON, a low memory version of Minimac3 will be run.
+
| This handles has been disabled in Minimac4
 
|-  
 
|-  
 
| <font face=Courier>--cpus 5</font>
 
| <font face=Courier>--cpus 5</font>
Line 101: Line 116:
 
= Download =
 
= Download =
  
'''Minimac3 ''' is available as an undocumented release version. The source files (and binary executable) are available for download in  [[Minimac3#Download | Source Files]] and commonly used reference panels in VCF and <font face=Courier>M3VCF</font> formats are available for download in [[Minimac3#Reference Panels for Download | Reference Panels]].
+
'''Minimac4 ''' is available for testing purposes only. The source files (and binary executable) are available for download in  [[Minimac4#Download | Source Files]] and commonly used reference panels in VCF and <font face=Courier>M3VCF</font> formats are available for download in [[Minimac4#Reference Panels for Download | Reference Panels]].
 
 
= Useful Wiki Pages =
 
 
 
There are a few pages in this Wiki that may be useful to for '''Minimac3''' users. Here are links to a few:
 
 
 
* [[Minimac3| Minimac3 Overview Page]]
 
 
 
* [[Minimac3 Usage | Minimac3 Usage and Documentation]]
 
 
 
* [[Minimac3 Imputation Cookbook]] ('''Recommended for New Users!!''')
 
 
 
* [[Minimac3 Examples| Minimac3 Examples]]
 
 
 
* [[M3VCF Files| M3VCF Files]]
 
  
 
= Contact =
 
= Contact =
  
 
In case of any queries and bugs please contact [mailto:sayantan@umich.edu Sayantan Das].
 
In case of any queries and bugs please contact [mailto:sayantan@umich.edu Sayantan Das].

Latest revision as of 21:28, 1 December 2016

Introduction

Minimac4 is the latest version in the series of genotype imputation software, preceded by Minimac3 (2015), Minimac2 (2014), minimac (2012) and MaCH (2010). Minimac4 is a lower-memory and more computationally efficient implementation of the original algorithms, with a negligible loss in imputation quality.

This wiki page gives users a full list of all the available options in Minimac4.

Full List of Options

The following table gives a brief description of all the parameters of Minimac4. New handles added in Minimac4 are highlighted in bold.

Users should see the wiki pages Minimac4 Usage and Documentation and Minimac4 Imputation Cookbook for further help on how to use these options.
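
As a rough illustration of how options from the table below combine on a command line, the following Python sketch assembles an argument list and checks three constraints stated in the table: --format accepts only GT, DS, GP and HDS; --start and --end need --chr; and --rec and --err must be given together. The executable name `minimac4`, the helper name `build_minimac4_command` and the file names are assumptions made for this example and do not come from this page.

```python
import shlex


def build_minimac4_command(ref_haps, haps, prefix, *, binary="minimac4",
                           fmt=("GT", "DS"), chrom=None, start=None,
                           end=None, window=None, rec=None, err=None):
    """Assemble an argument list from options listed in the table.

    `binary` is an assumed executable name; adjust it to your installation.
    """
    allowed = {"GT", "DS", "GP", "HDS"}
    unknown = [f for f in fmt if f not in allowed]
    if unknown:
        raise ValueError(f"unsupported FORMAT fields: {unknown}")
    # The table states that --start and --end do not work without --chr.
    if (start is not None or end is not None) and chrom is None:
        raise ValueError("--start/--end require --chr")
    # The table states that --rec and --err must be provided together.
    if (rec is None) != (err is None):
        raise ValueError("--rec and --err must be given together")

    args = [binary, "--refHaps", ref_haps, "--haps", haps,
            "--prefix", prefix, "--format", ",".join(fmt)]
    optional = (("--chr", chrom), ("--start", start), ("--end", end),
                ("--window", window), ("--rec", rec), ("--err", err))
    for flag, value in optional:
        if value is not None:
            args += [flag, str(value)]
    return args


# Hypothetical file names, for illustration only.
cmd = build_minimac4_command("ref.m3vcf.gz", "target.vcf.gz", "out",
                             chrom=22, start=100000, end=200000, window=5000)
print(shlex.join(cmd))
```

Returning a list rather than a single string lets the result be passed directly to `subprocess.run` without shell quoting concerns.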


Parameter Description
--refHaps filename VCF file or M3VCF file containing haplotype data for reference panel.
--rsid This option only imports RS ID of variants from ID column of reference file (if available).
--passOnly If ON, only variants with FILTER=PASS will be recorded from reference VCF file (does NOT work on M3VCF files yet).
--haps filename File containing haplotype data for target (gwas) samples. Must be a VCF file.
--processReference This option only converts an input VCF file to M3VCF format (for example, for a later run of imputation). If this option is ON, no imputation is performed and all other parameters are ignored, except for parameters on Reference Haplotypes and Subsetting Options. This option also estimates model parameters from the reference panel and saves them in the M3VCF file (the estimation can be skipped with --rounds 0).
--prefix output Prefix for all output files generated. By default: [Minimac3.Output]
--updateModel This parameter has been disabled in Minimac4
--nobgzip If ON, output files will NOT be bgzipped.
--vcfBuffer 200 Number of samples to be stored in memory before a piece of the VCF file is written.
--vcfOutput If ON, imputed data will be output as a VCF file [Default: ON]
--doseOutput If ON, imputed data will also be output as a dosage file [Default: OFF]
--hapOutput This parameter has been disabled in Minimac4
--format Specifies which fields to output for the FORMAT field in the output VCF file. Available handles: GT,DS,GP,HDS. GT stands for genotype, DS stands for genotype dosage, HDS stands for haplotype dosage and GP stands for genotype probabilities [Default: GT,DS]
--allTypedSites Also includes in the output files variants that were genotyped but are NOT in the reference panel (and imputes any missing data in such variants to the major allele frequency). [Default: OFF]
--meta If this handle is ON, Minimac4 also outputs some diagnostic measures required for meta-imputation by MetaMinimac. Please turn this handle ON if you plan to use MetaMinimac later for meta-imputation of dosages from Minimac4. [Default: OFF]
--memUsage If this handle is ON, Minimac4 will NOT run the imputation, but instead report an estimated memory usage summary. It will also report some minor instructions on how to change the memory usage by tweaking parameters in the command line. This summary might enable users to get an idea of memory consumption and modify it, if need be, before starting the imputation experiment. [Default: OFF]
--chr 22 Chromosome number for which we will carry out imputation.
--start 100000 Start position for imputation by chunking. Would not work without --chr option.
--end 200000 End position for imputation by chunking. Would not work without --chr option.
--window 5000 Length of buffer region (in bp units) on either side of --start and --end. By default = 500000 (if chunking is done) and = 0 (if no chunking is being done).
--ChunkLengthMb 20.00 Minimac4 runs imputation on automated chunks. This parameter specifies the length of each chunk in Mbp.
--ChunkOverlapMb 3.00 This parameter specifies the length of the buffer region to be analyzed on each side of the chunk (in Mbp). Thus, if a user inputs --ChunkLengthMb 20.00 and --ChunkOverlapMb 3.00, Minimac4 would analyze 26 Mbp (20 + 2 × 3) at a time.
--rec Recombination File from previous run of Minimac/Minimac3. (--err parameter must also be provided, if using this handle)
--err Error File from previous run of Minimac/Minimac3. (--rec parameter must also be provided, if using this handle)
--rounds 5 Rounds of optimization for model parameters, which describe population recombination rates and per SNP error rates. By default = 5.
--states 200 Maximum number of reference (or target) haplotypes to be examined during parameter optimization. By default = 200.
--probThreshold 0.01 This parameter specifies the minimum posterior probability for a reference haplotype group to be included. If the user believes that Minimac4 has lost significant imputation accuracy compared to Minimac3, they should decrease the value to 0.0 (to be on the safe side). In general, reducing this value is not recommended, as it will increase the compute time with no gain in accuracy.
--help A short help on options
--lowMemory This handle has been disabled in Minimac4
--cpus 5 Number of cpus for parallel computing. Would work only with Minimac3-omp.
--noPhoneHome If ON, code will NOT send a SUCCESS/FAILURE status of the execution to home server.
--phoneHomeThinning 50 Percentage probability of sending SUCCESS/FAILURE status of the execution to home server [Default: 50%]
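
The chunk arithmetic described for --ChunkLengthMb and --ChunkOverlapMb can be sketched in Python as follows. This is only an illustration of the stated sizes (a 20 Mbp chunk with a 3 Mbp buffer on each side gives 26 Mbp analyzed). The function names, the left-to-right tiling, and the clipping of buffers at the region boundaries are assumptions made for this example; this page does not describe how Minimac4 actually places its chunks.

```python
def analysed_span_mb(chunk_length_mb=20.0, chunk_overlap_mb=3.0):
    """Size analysed per chunk: the chunk plus a buffer on each side."""
    return chunk_length_mb + 2 * chunk_overlap_mb


def analysis_windows(region_start_mb, region_end_mb,
                     chunk_length_mb=20.0, chunk_overlap_mb=3.0):
    """Return (core_start, core_end, analysed_start, analysed_end) in Mbp.

    Assumption: chunks tile the region from left to right and the buffer
    is clipped where it would extend past the region.
    """
    if chunk_length_mb <= 0:
        raise ValueError("chunk_length_mb must be positive")
    windows = []
    core_start = region_start_mb
    while core_start < region_end_mb:
        core_end = min(core_start + chunk_length_mb, region_end_mb)
        analysed_start = max(region_start_mb, core_start - chunk_overlap_mb)
        analysed_end = min(region_end_mb, core_end + chunk_overlap_mb)
        windows.append((core_start, core_end, analysed_start, analysed_end))
        core_start = core_end
    return windows


# A hypothetical 60 Mbp region split with the values shown in the table.
for window in analysis_windows(0, 60):
    print(window)
```

Under these assumptions, only interior chunks reach the full 26 Mbp; the first and last chunks are shorter because their outer buffer is clipped.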

Download

Minimac4 is available for testing purposes only. The source files (and binary executable) are available for download in Source Files and commonly used reference panels in VCF and M3VCF formats are available for download in Reference Panels.

Contact

In case of any queries and bugs please contact Sayantan Das.