Minimac3 Usage

From Genome Analysis Wiki

Introduction

Minimac3 is a lower-memory and more computationally efficient implementation of minimac2. It is an algorithm for genotype imputation that works on phased genotypes (for example, from MaCH) and is designed to handle very large reference panels more efficiently with no loss of accuracy.

This wiki page gives users a detailed explanation of Minimac3 usage.

Command Line Options

A typical Minimac3 command line would have the following parameter options:

Command Line Options:
   Reference Haplotypes : --refHaps [], --passOnly
      Target Haplotypes : --haps []
      Output Parameters : --processReference, --prefix [Minimac3.Output],
                          --updateModel, --nobgzip, --doseOutput, --hapOutput,
                          --format [GT,DS]
      Subset Parameters : --chr [], --start, --end, --window
    Starting Parameters : --rec [], --err []
  Estimation Parameters : --rounds [5], --states [200]
       Other Parameters : --help, --cpus [1], --params
              PhoneHome : --noPhoneHome, --phoneHomeThinning [50]

Detailed Usage

The available options of Minimac3 are explained in detail below. See the wiki pages on Examples and Full list of Options for more details.

Reference Haplotypes

"--refHaps" denotes the main input reference file, which can be either a VCF file or an M3VCF file. No handle is necessary to specify the file type; the program detects it automatically.

M3VCF files are customized files created by Minimac3 (possibly in a previous run) that store large reference panels in a compact form, saving the memory and computation time involved in reading large files. See the wiki page on M3VCF files for further details. Users can download commonly used reference panels in both VCF and M3VCF format from Reference Panels.

Target Haplotypes

"--haps" denotes the main input GWAS file, which must be a VCF file (.vcf or .vcf.gz). The extensions are not mandatory.

Minimac3 accepts only VCF files as input for the target/GWAS data. Input VCF files are automatically assumed to be pre-phased. Markers that are in the target panel but NOT in the reference panel are excluded from the output files. Users who want to analyze these extra markers must merge them back in from the original data.
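As a minimal sketch of a basic imputation run using only the handles described above: the file names and output prefix below are hypothetical placeholders, and the executable is assumed to be installed on the PATH as Minimac3. The snippet assembles the command, prints it, and runs it only when the executable and both input files are present.

```shell
# Hypothetical file names; replace them with your own data.
REF="reference.vcf.gz"          # reference panel (VCF or M3VCF)
TARGET="target.phased.vcf.gz"   # pre-phased target/GWAS haplotypes (VCF)
PREFIX="imputed.chr20"          # prefix for the output files

# Assemble the command line from the documented handles.
CMD="Minimac3 --refHaps $REF --haps $TARGET --prefix $PREFIX"
echo "$CMD"

# Run only if the executable and both inputs exist.
# $CMD is left unquoted on purpose so the shell splits it into
# arguments; this requires paths without whitespace.
if command -v Minimac3 >/dev/null 2>&1 && [ -f "$REF" ] && [ -f "$TARGET" ]; then
  $CMD
fi
```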

Output Files

"--prefix" denotes the prefix for the output files (By default: Minimac3.Output)

Minimac3 can output files in both VCF format and .dose format (the usual minimac output format). By default, Minimac3 outputs only VCF format; users must use the handle --doseOutput to output in .dose format, or the handle --hapOutput to output dosage data in phased format. Output VCF files can store dosage data only in the following formats, which are managed by the handle --format (by default: --format DS,GT):

  • DS : Estimated alternate allele dosage (default).
  • GT : Estimated most likely genotype (default).
  • GP : Estimated posterior genotype probabilities (use handle --format GP).
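The output handles above might be combined as in the following sketch. It assumes that --format accepts a comma-separated list of the three fields (as the default DS,GT suggests) and that the executable is on the PATH as Minimac3; the file names and prefix are hypothetical.

```shell
# Hypothetical file names; replace them with your own data.
REF="reference.vcf.gz"
TARGET="target.phased.vcf.gz"
PREFIX="imputed.with.gp"

# Request GT, DS and GP in the output VCF, and also write a .dose file.
CMD="Minimac3 --refHaps $REF --haps $TARGET --prefix $PREFIX --format GT,DS,GP --doseOutput"
echo "$CMD"

# Run only if the executable and both inputs exist
# ($CMD is unquoted on purpose; paths must not contain whitespace).
if command -v Minimac3 >/dev/null 2>&1 && [ -f "$REF" ] && [ -f "$TARGET" ]; then
  $CMD
fi
```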

The handle --processReference is used to ONLY convert reference panels from VCF format to M3VCF format (and save parameter estimates). NO imputation will be performed, and thus NO target/GWAS haplotypes are required. By default, however, parameter estimation is done using the reference panel and the estimates are saved in the M3VCF file. Users should use --rounds 0 to opt out of parameter estimation and only compress the reference panel and save it as an M3VCF file. See the wiki page on Examples for further details.
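A compression-only run as described here might look like the sketch below. The input file name and prefix are hypothetical, and the executable is assumed to be on the PATH as Minimac3; dropping --rounds 0 would keep the default parameter estimation.

```shell
# Hypothetical file names; replace them with your own data.
REF="reference.vcf.gz"      # reference panel in VCF format
PREFIX="reference.panel"    # prefix for the resulting M3VCF output

# Convert the panel to M3VCF only: no --haps is given, and
# --rounds 0 skips parameter estimation.
CMD="Minimac3 --refHaps $REF --processReference --rounds 0 --prefix $PREFIX"
echo "$CMD"

# Run only if the executable and the input exist
# ($CMD is unquoted on purpose; paths must not contain whitespace).
if command -v Minimac3 >/dev/null 2>&1 && [ -f "$REF" ]; then
  $CMD
fi
```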

[NOTE: While doing imputation, if parameter estimates are found in M3VCF files, Minimac3 will automatically use them for imputation. Users should use handle --updateModel in order to update the parameter estimates using the target/gwas panel as well. However, this is NOT necessary in most cases, unless the user has strong reasons to believe that this might increase the imputation accuracy.]
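For the less common case in the note, a sketch of imputing from an existing M3VCF file while re-estimating the parameters with the target panel could look as follows. The name reference.m3vcf.gz is a hypothetical placeholder for a previously created M3VCF file, and the executable is assumed to be on the PATH as Minimac3. Omitting --updateModel would reuse the estimates stored in the M3VCF file.

```shell
# Hypothetical file names; replace them with your own data.
REF="reference.m3vcf.gz"        # M3VCF file from a previous run
TARGET="target.phased.vcf.gz"   # pre-phased target/GWAS haplotypes
PREFIX="imputed.updated"

# --updateModel asks Minimac3 to update the stored parameter
# estimates using the target panel as well.
CMD="Minimac3 --refHaps $REF --haps $TARGET --prefix $PREFIX --updateModel"
echo "$CMD"

# Run only if the executable and both inputs exist
# ($CMD is unquoted on purpose; paths must not contain whitespace).
if command -v Minimac3 >/dev/null 2>&1 && [ -f "$REF" ] && [ -f "$TARGET" ]; then
  $CMD
fi
```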

Subset Parameters

Starting Parameters

Estimation Parameters

Other Parameters

PhoneHome

Download

Minimac3 is available as an undocumented release version. The source files (and binary executable) are available for download from Source Files, and commonly used reference panels in VCF and M3VCF formats are available for download from Reference Panels.

Contact

For any queries or bug reports, please contact Sayantan Das.