Minimac3


Introduction

Minimac3 is a lower-memory, more computationally efficient implementation of minimac2. It is an algorithm for genotype imputation that works on phased genotypes (for example, from MaCH). Minimac3 is designed to handle very large reference panels with no loss of accuracy. The algorithm analyzes only the unique haplotypes within small genomic segments, which reduces both run time and memory use without reducing accuracy.
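
To make the idea of unique haplotypes concrete, the toy sketch below counts the distinct haplotype patterns inside two short marker windows. The haplotype strings, the window boundaries and the function name count_unique_in_segment are all invented for illustration. This is not Minimac3's actual data structure or algorithm, only the intuition behind it.

```shell
#!/bin/sh
# Toy data (made up): six reference haplotypes, one per line, at 8 markers.
# 0 = reference allele, 1 = alternate allele.
haplotypes='00110101
00110101
00111101
00110101
10110100
00111101'

# count_unique_in_segment START END
# Cuts marker columns START..END out of every haplotype and counts how
# many distinct patterns remain in that window.
count_unique_in_segment() {
    printf '%s\n' "$haplotypes" | cut -c "$1-$2" | sort -u | wc -l | tr -d ' '
}

echo "haplotypes in panel:   $(printf '%s\n' "$haplotypes" | wc -l | tr -d ' ')"
echo "unique in markers 1-4: $(count_unique_in_segment 1 4)"
echo "unique in markers 5-8: $(count_unique_in_segment 5 8)"
```

On this toy panel the six haplotypes collapse to 2 distinct patterns in markers 1-4 and 3 in markers 5-8, so any per-haplotype computation restricted to a window only needs to be done for the distinct patterns.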

Apart from performing imputation, Minimac3 also creates M3VCF files (customized Minimac3 VCF files), which store reference panel information in a compact form and so reduce the memory and time needed to read large datasets. Users can run the binary either to convert VCF files to M3VCF files only, or to perform imputation as well. The program can also take a previously generated M3VCF file as the reference panel input. M3VCF files can additionally store pre-calculated estimates of recombination fraction and error, which can be reused in later imputation runs. The latest version of Minimac3 also supports output in VCF format for easier data manipulation in downstream analysis.
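
Because a previously generated M3VCF file can stand in for the VCF reference panel, a wrapper script might prefer it whenever it exists. The sketch below is one hedged way to do that. The helper pick_ref_panel, the .m3vcf extension and the file names are hypothetical, and passing the M3VCF file through --refHaps (the option shown in the Usage section) is an assumption that should be checked against the Detailed Usage page. The command is only printed, not executed.

```shell
#!/bin/sh
# pick_ref_panel BASENAME
# Prints BASENAME.m3vcf if that file exists, otherwise BASENAME.vcf.
# Both file names are hypothetical placeholders.
pick_ref_panel() {
    if [ -f "$1.m3vcf" ]; then
        printf '%s\n' "$1.m3vcf"
    else
        printf '%s\n' "$1.vcf"
    fi
}

ref=$(pick_ref_panel refPanel)

# Dry run: print the command instead of executing it.
# Assumption: --refHaps accepts either a VCF or an M3VCF reference panel.
echo "../bin/Minimac3 --refHaps $ref --haps targetStudy.vcf --prefix testRun"
```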

Download

Minimac3 is available as an undocumented release version. The source files are available for download below, and commonly used reference panels in M3VCF format are available for download in Reference Panels. The authors would appreciate it if users ran Minimac3 on their own data sets and reported any bugs they find.

  • To Download Minimac3
Description                                       Download Link
Minimac3 Executable                               UNIX Users
Minimac3-omp Executable (for parallel computing)  UNIX Users
Minimac3 Source Files                             UNIX Users

Usage

Users who downloaded the source files should follow the steps below to compile Minimac3. Users who downloaded the binary executable can skip them.

## EXTRACT MINIMAC3 AND COMPILE
 
tar -xzvf Minimac3.v1.0.0.tar.gz
cd Minimac3/
make

A typical Minimac3 command line for imputation is as follows:

../bin/Minimac3 --refHaps refPanel.vcf \
                --haps targetStudy.vcf \
                --prefix testRun

Here refPanel.vcf is the reference panel in VCF format (e.g. 1000 Genomes), targetStudy.vcf is the phased GWAS data in VCF format, and testRun is the prefix for the output files. Some commonly used reference panels are available for download in Reference Panels. See the wiki pages on Detailed Usage and the Imputation Cookbook for further details on using Minimac3 for imputation analysis.
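
Imputation is often run one chromosome at a time. The sketch below prints (without running) one such command per autosome, using only the three options from the example above. It assumes the reference panel and study data have been split into per-chromosome files. The chr-suffixed file names, the helper minimac3_cmd and the MINIMAC3 environment variable are hypothetical and are not part of Minimac3 itself.

```shell
#!/bin/sh
# minimac3_cmd CHR
# Prints an imputation command for one chromosome; nothing is executed.
# Only --refHaps, --haps and --prefix come from the example above.
# The per-chromosome file names are hypothetical placeholders.
minimac3_cmd() {
    printf '%s --refHaps refPanel.chr%s.vcf --haps targetStudy.chr%s.vcf --prefix testRun.chr%s\n' \
        "${MINIMAC3:-../bin/Minimac3}" "$1" "$1" "$1"
}

# Dry run over chromosomes 1..22.
chr=1
while [ "$chr" -le 22 ]; do
    minimac3_cmd "$chr"
    chr=$((chr + 1))
done
```

Printing the commands first makes it easy to inspect them, or to hand them to a job scheduler, before anything is run.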

Users can always run the following to see the built-in help:

 ../bin/Minimac3 --help

Reference Panels for Download

Some commonly used reference panels are available for download here. [NOTE: Chromosome X will be available soon]

Reference Panel       Format                                     Download Link  Internal CSG Copy Link
1000 Genomes Phase 3  VCF Files                                  Coming Soon    /net/fantasia/home/sayantan/DATABASE/1000G/PHASE_3/FOR_UPLOAD/G1K_P3/VCF_Files/
1000 Genomes Phase 3  M3VCF Files (With Parameter Estimates)     Coming Soon    /net/fantasia/home/sayantan/DATABASE/1000G/PHASE_3/FOR_UPLOAD/G1K_P3/M3VCF_Files_With_Estimates/
1000 Genomes Phase 3  M3VCF Files (Without Parameter Estimates)  Coming Soon    /net/fantasia/home/sayantan/DATABASE/1000G/PHASE_3/FOR_UPLOAD/G1K_P3/M3VCF_Files_No_Estimates/
1000 Genomes Phase 1  VCF Files                                  Coming Soon    /net/fantasia/home/sayantan/DATABASE/1000G/PHASE_1_V3/FOR_UPLOAD/G1K_P1/VCF_Files/
1000 Genomes Phase 1  M3VCF Files (With Parameter Estimates)     Coming Soon    /net/fantasia/home/sayantan/DATABASE/1000G/PHASE_1_V3/FOR_UPLOAD/G1K_P1/M3VCF_Files_With_Estimates/
1000 Genomes Phase 1  M3VCF Files (Without Parameter Estimates)  Coming Soon    /net/fantasia/home/sayantan/DATABASE/1000G/PHASE_1_V3/FOR_UPLOAD/G1K_P1/M3VCF_Files_No_Estimates/

Contact

For questions or bug reports, please contact Sayantan Das.