Difference between revisions of "Minimac4"
Revision as of 21:46, 29 June 2017
Introduction
Minimac4 is the latest version in the series of genotype imputation software, preceded by Minimac3 (2015), Minimac2 (2014), minimac (2012) and MaCH (2010). Minimac4 is a lower-memory, more computationally efficient implementation of the original algorithms, with a negligible loss in imputation quality.
The Minimac3 mailing list has been renamed the Minimac4 mailing list. If you were already a member, you do not need to re-join. If not, please join the mailing list to get updates about future releases. To report possible bugs, post to the list or email them to Sayantan Das.
Download
Minimac4 (version 1.0.2, updated 6.29.2017) is currently available for testing purposes only (while we still run more tests and wait on feedback about potential bugs). Commonly used reference panels in M3VCF format are available for download in Reference Panels.
GitHub Repo: Minimac4 GitHub
What's New
The input file format, output file formats and typical command lines are the same in Minimac4 as they were in Minimac3. Some of the main new features are summarized below:
- Automated Chunking - Minimac4 automatically splits the whole chromosome into overlapping chunks, analyzes each chunk and then concatenates the results. This caps memory usage across chromosomes: large chromosomes need the same amount of memory as smaller ones. The chunk length and the overlap can be controlled with the parameters --chunkLengthMb and --chunkLengthOverlapMb.
- Approximations - Minimac4 uses some approximations to speed up the imputation analyses. The level of approximation can be controlled by parameters. A higher level of approximation reduces compute time but also marginally reduces imputation accuracy. We recommend using the default values, which reduce accuracy only negligibly while still shortening the average compute time.
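The chunking idea described above can be illustrated with a small sketch. This is not Minimac4's actual code: the function name `overlapping_chunks`, the choice to pad each chunk on both sides, and the example values (a 50 Mb region, 20 Mb chunks, 3 Mb overlap) are all assumptions made for illustration. It only shows why peak memory depends on the chunk size rather than the chromosome length: each window is at most the chunk length plus two overlaps, regardless of how long the chromosome is.

```python
def overlapping_chunks(chrom_length_mb, chunk_length_mb, overlap_mb):
    """Split [0, chrom_length_mb] into windows that overlap their neighbours.

    Hypothetical helper for illustration only; Minimac4's real chunking
    logic may differ (for example in how chunk boundaries are chosen).
    """
    if chunk_length_mb <= 0 or overlap_mb < 0:
        raise ValueError("chunk length must be positive and overlap non-negative")

    chunks = []
    core_start = 0.0
    while core_start < chrom_length_mb:
        # The "core" of a chunk is the part whose results would be kept.
        core_end = min(core_start + chunk_length_mb, chrom_length_mb)
        # Pad the core on both sides (assumption), clipped to the chromosome.
        start = max(0.0, core_start - overlap_mb)
        end = min(chrom_length_mb, core_end + overlap_mb)
        chunks.append((start, end))
        core_start = core_end
    return chunks


if __name__ == "__main__":
    for start, end in overlapping_chunks(50, 20, 3):
        print(f"analyze {start}-{end} Mb")
```

In this sketch no window is longer than `chunk_length_mb + 2 * overlap_mb`, so a longer chromosome produces more windows rather than larger ones.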
Reference Panels for Download
Some commonly used reference panels are available for download here:
Reference Panel | Number of Samples | File Format | Parameter Estimates Available | Chromosomes | Link |
---|---|---|---|---|---|
1000 Genomes Phase 3 | 2,504 | VCF | - | 1-22,X | Download |
1000 Genomes Phase 3 | 2,504 | M3VCF | YES | 1-22,X | Download |
1000 Genomes Phase 3 | 2,504 | M3VCF | NO | 1-22,X | Download |
1000 Genomes Phase 3 | 2,504 | VCF,M3VCF | YES | X | Download |
1000 Genomes Phase 1 | 1,092 | VCF | - | 1-22,X | Download |
1000 Genomes Phase 1 | 1,092 | M3VCF | YES | 1-22,X | Download |
1000 Genomes Phase 1 | 1,092 | M3VCF | NO | 1-22,X | Download |
1000 Genomes Phase 1 | 1,092 | VCF,M3VCF | YES | X | Download |