Minimac4 - Full List of Options

Introduction

Minimac4 is the latest version in a series of genotype imputation software, preceded by Minimac3 (2015), Minimac2 (2014), minimac (2012) and MaCH (2010). Minimac4 is a lower-memory and more computationally efficient implementation of the original algorithms, with a negligible loss in imputation quality.

This wiki page gives users a full list of the options available in Minimac4.

Full List of Options

The following table gives a brief description of all the parameters of Minimac4. New handles added in Minimac4 are highlighted in bold.

Users should see the wiki pages on Minimac4 Usage and Documentation and the Minimac4 Imputation Cookbook for further help on how to use these options.


Parameter Description
--refHaps filename VCF file or M3VCF file containing haplotype data for reference panel.
--rsid This option imports only the RS IDs of variants from the ID column of the reference file (if available).
--passOnly If ON, only variants with FILTER=PASS will be recorded from the reference VCF file (does NOT work on M3VCF files yet).
--haps filename File containing haplotype data for target (GWAS) samples. Must be a VCF file.
--processReference This option only converts an input VCF file to M3VCF format (for example, for a later imputation run). If this option is ON, no imputation is performed and all other parameters are ignored, except the parameters on Reference Haplotypes and Subsetting Options. This option also estimates model parameters from the reference panel and saves them in the M3VCF file (the estimation can be skipped with --rounds 0).
--prefix output Prefix for all output files generated. By default: [Minimac3.Output]
--updateModel This parameter has been disabled in Minimac4
--nobgzip If ON, output files will NOT be bgzipped.
--vcfBuffer 200 Number of samples to be held in memory before each piece of the VCF file is written.
--vcfOutput If ON, imputed data will be output as a VCF file [Default: ON]
--doseOutput If ON, imputed data will be output as dosage file as well [Default: OFF]
--hapOutput This parameter has been disabled in Minimac4
--format Specifies which fields to output for the FORMAT field in output VCF file. Available handles: GT,DS,GP,HDS. GT stands for Genotype, DS stands for genotype dosage, HDS stands for haplotype dosage and GP stands for genotype probabilities [Default: GT,DS]
--allTypedSites Also includes in the output files variants that were genotyped but are NOT in the reference panel (and imputes any missing data in such variants to the major allele frequency). [Default: OFF]
--meta If this handle is ON, Minimac4 also outputs some diagnostic measures required for meta-imputation by MetaMinimac. Please turn this handle ON if you plan to use MetaMinimac later for meta-imputation of dosages from Minimac4. [Default: OFF]
--memUsage If this handle is ON, Minimac4 will NOT run the imputation, but instead report an estimated memory usage summary. It will also report some minor instructions on how to change the memory usage by tweaking parameters in the command line. This summary might enable users to get an idea of memory consumption and modify it, if need be, before starting the imputation experiment. [Default: OFF]
--chr 22 Chromosome number for which we will carry out imputation.
--start 100000 Start position for imputation by chunking. Does not work without the --chr option.
--end 200000 End position for imputation by chunking. Does not work without the --chr option.
--window 5000 Length of the buffer region (in bp) on either side of --start and --end. Default: 500000 if chunking is done, 0 if no chunking is done.
--ChunkLengthMb 20.00 Minimac4 runs imputation on automated chunks. This parameter specifies the length of each chunk in Mbp.
--ChunkOverlapMb 3.00 This parameter specifies the length of the buffer region to be analyzed on each side of the chunk (in Mbp). Thus, if a user inputs --ChunkLengthMb 20.00 and --ChunkOverlapMb 3.00, Minimac4 would analyze 26 Mbp chunks at a time (20 Mbp plus a 3 Mbp buffer on each side).
--rec Recombination File from previous run of Minimac/Minimac3. (--err parameter must also be provided, if using this handle)
--err Error File from previous run of Minimac/Minimac3. (--rec parameter must also be provided, if using this handle)
--rounds 5 Rounds of optimization for model parameters, which describe population recombination rates and per SNP error rates. By default = 5.
--states 200 Maximum number of reference (or target) haplotypes to be examined during parameter optimization. By default = 200.
--probThreshold 0.01 This parameter specifies the minimum posterior probability for a reference haplotype group to be included. If, for some reason, the user believes that Minimac4 has reduced the imputation accuracy significantly (compared to Minimac3), they should decrease the value to 0.0 (to be on the safe side). In general, reducing this value is not recommended, as it will increase the compute time with no gain in accuracy.
--help A short help on options
--lowMemory This handle has been disabled in Minimac4
--cpus 5 Number of CPUs for parallel computing. Works only with Minimac3-omp.
--noPhoneHome If ON, code will NOT send a SUCCESS/FAILURE status of the execution to home server.
--phoneHomeThinning 50 Percentage probability of sending SUCCESS/FAILURE status of the execution to home server [Default: 50%]
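
Several rows in the table state constraints between options: --start and --end need --chr, --rec and --err must be supplied together, --format accepts only the handles GT, DS, GP and HDS, and the analyzed chunk span is the chunk length plus one overlap buffer on each side. The Python sketch below checks those rules before assembling an argument list. It is only an illustration and is not part of Minimac4: the helper names build_args and analyzed_span_mb, the executable name minimac4, and all file names are assumptions, and nothing is executed.

```python
import shlex

# Handles listed for --format in the table above.
FORMAT_HANDLES = {"GT", "DS", "GP", "HDS"}


def analyzed_span_mb(chunk_length_mb, chunk_overlap_mb):
    """Span analyzed per chunk: the chunk plus one overlap buffer on each side."""
    return chunk_length_mb + 2 * chunk_overlap_mb


def build_args(ref_haps, haps, prefix, chrom=None, start=None, end=None,
               fmt="GT,DS", rec=None, err=None):
    """Assemble an argument list, enforcing rules stated in the table."""
    # --start and --end do not work without --chr.
    if (start is not None or end is not None) and chrom is None:
        raise ValueError("--start/--end require --chr")
    # --rec and --err must be given together.
    if (rec is None) != (err is None):
        raise ValueError("--rec and --err must be provided together")
    # --format accepts only the four documented handles.
    unknown = set(fmt.split(",")) - FORMAT_HANDLES
    if unknown:
        raise ValueError(f"unknown --format handles: {sorted(unknown)}")

    # Assumption: the executable is called "minimac4"; adjust to your install.
    args = ["minimac4", "--refHaps", ref_haps, "--haps", haps,
            "--prefix", prefix, "--format", fmt]
    for flag, value in (("--chr", chrom), ("--start", start), ("--end", end),
                        ("--rec", rec), ("--err", err)):
        if value is not None:
            args += [flag, str(value)]
    return args


# Hypothetical file names, for illustration only.
cmd = build_args("ref.m3vcf.gz", "target.vcf.gz", "out",
                 chrom=22, start=100000, end=200000)
print(shlex.join(cmd))               # the command line as a single string
print(analyzed_span_mb(20.0, 3.0))   # 26.0, matching the --ChunkOverlapMb row
```

Checking the option pairs up front means a mistake such as passing --rec without --err is caught before a long imputation run is launched.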

Download

Minimac4 is available for testing purposes only. The source files (and binary executable) are available for download in Source Files. Commonly used reference panels in VCF and M3VCF formats are available for download in Reference Panels.

Contact

For any queries or bug reports, please contact Sayantan Das.