Minimac3 Cookbook : Chromosome X Imputation

From Genome Analysis Wiki

Introduction

Minimac3 is a lower-memory, more computationally efficient implementation of minimac2. It is a genotype imputation algorithm that works on phased genotypes and is designed to handle very large reference panels efficiently with no loss of accuracy.

This wiki page gives users a detailed, step-by-step description of how to impute chromosome X.

Chromosome X Imputation

Chromosome X has a pseudo-autosomal region (PAR), which can be imputed for males and females together. Imputing the PAR on chromosome X is the same as ordinary autosomal imputation, since both males and females are diploid at these sites. The non-pseudo-autosomal region (non-PAR), however, must be imputed for males and females separately, because males are haploid there while females are diploid. The PAR and non-PAR must also be imputed separately from each other. The steps below give further details.
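As a quick sanity check on the ploidy distinction described above, the following shell sketch counts haploid and diploid genotype calls per sample in a small VCF. It assumes (this is not stated on this page) that haploid male calls in the non-PAR are written as a single allele such as 0 or 1, while diploid calls contain a / or | separator. The function name ploidy_summary and the toy file toy.chrX.vcf are illustrative only.

```shell
# Print "sample<TAB>haploid_calls<TAB>diploid_calls" for each sample in a VCF.
# Assumption: haploid genotypes are single alleles ("0", "1"); diploid ones
# contain "/" or "|". Input must be an uncompressed VCF.
ploidy_summary() {
  awk -F'\t' '
    /^##/     { next }                                   # skip meta lines
    /^#CHROM/ { for (i = 10; i <= NF; i++) name[i] = $i  # sample names
                n = NF; next }
    {
      for (i = 10; i <= NF; i++) {
        split($i, f, ":")                                # GT is the first sub-field
        if (index(f[1], "/") || index(f[1], "|")) dip[i]++
        else hap[i]++
      }
    }
    END { for (i = 10; i <= n; i++) printf "%s\t%d\t%d\n", name[i], hap[i], dip[i] }
  ' "$1"
}

# Toy non-PAR data: one male (M1, haploid) and one female (F1, diploid).
printf '%s\n' '##fileformat=VCFv4.1' > toy.chrX.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tM1\tF1\n' >> toy.chrX.vcf
printf 'X\t3000000\trs1\tA\tG\t.\tPASS\t.\tGT\t0\t0|1\n' >> toy.chrX.vcf
printf 'X\t3000100\trs2\tC\tT\t.\tPASS\t.\tGT\t1\t1|1\n' >> toy.chrX.vcf

ploidy_summary toy.chrX.vcf
```

A sample that shows both haploid and diploid calls within the non-PAR file is worth checking before imputation, since it may indicate a mislabelled sex or a file that still contains PAR sites.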


  • Convert files to VCF Format : Start by converting the unphased, quality controlled data set into VCF format. See our wiki page on .
  • Split the data by Sex : Next, split the unphased, quality-controlled data set by sex, giving one file for males and one for females.
  • Split the data into PAR and non-PAR: Separate the pseudo-autosomal part and the non-pseudo-autosomal part into separate files. The PAR is located at chrX:1-2709520 and chrX:154584238-154913754 on build hg18, and at chrX:60001-2699519 and chrX:154931044-155260560 on build hg19. For VCF files the split can be done as follows (build hg19, male samples shown; repeat with the female file):
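# vcftools accepts --from-bp/--to-bp only together with a single --chr;
# use the chromosome name that appears in your VCF (e.g. X, chrX or 23).
vcftools --gzvcf males.gwas.data.vcf.gz \
         --chr X \
         --from-bp 2699520 \
         --to-bp 154931043 \
         --recode \
         --out males.non.PAR.gwas.data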
 
vcftools --gzvcf males.gwas.data.vcf.gz \
         --exclude-positions males.non.PAR.gwas.data.recode.vcf \
         --recode \
         --out males.PAR.gwas.data
  • Impute each sex and the PAR/non-PAR separately: The following example illustrates how to do that (files are available in the Minimac3/test/ directory):
# Male Samples (Non-PAR)
 ../bin/Minimac3 --refHaps refPanelChrX.Non.Auto.vcf \
                 --haps targetStudyChrX.males.vcf \
                 --prefix testRun.males.Non.PAR
 
# Female Samples (Non-PAR)
 ../bin/Minimac3 --refHaps refPanelChrX.Non.Auto.vcf \
                 --haps targetStudyChrX.females.vcf \
                 --prefix testRun.females.Non.PAR
 
# Male Samples (PAR)
 ../bin/Minimac3 --refHaps refPanelChrX.Non.Auto.vcf \
                 --haps targetStudyChrX.males.vcf \
                 --prefix testRun.males.PAR
 
# Female Samples (PAR)
 ../bin/Minimac3 --refHaps refPanelChrX.Non.Auto.vcf \
                 --haps targetStudyChrX.females.vcf \
                 --prefix testRun.females.PAR
  • NOTE: When imputing the non-PAR of chromosome X, male and female samples must be analyzed separately; otherwise the program will crash. The reference panel must also contain only the PAR or only the non-PAR region of chromosome X; otherwise the program will crash.
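Because the PAR/non-PAR split has to be repeated for each sex, it can help to generate the vcftools commands from one template. The sketch below only prints the commands (a dry run), so they can be inspected first. It reuses the hg19 non-PAR bounds and the vcftools options from the steps above, with --chr X included because the vcftools manual documents --from-bp/--to-bp as usable only with a single --chr. The input name females.gwas.data.vcf.gz and the helper split_cmds are assumptions made for illustration, as is the chromosome label X.

```shell
# hg19 non-PAR bounds, as used in the vcftools example on this page
NONPAR_START=2699520
NONPAR_END=154931043
CHR_NAME=X   # assumption: adjust to the chromosome label in your VCF (X, chrX, 23)

# Print the two vcftools commands for one sex label ("males" or "females").
# Assumes the input is named <label>.gwas.data.vcf.gz (hypothetical for females).
split_cmds() {
  label="$1"
  input="${label}.gwas.data.vcf.gz"
  nonpar="${label}.non.PAR.gwas.data"
  # 1) keep only sites inside the non-PAR interval
  echo "vcftools --gzvcf ${input} --chr ${CHR_NAME} --from-bp ${NONPAR_START} --to-bp ${NONPAR_END} --recode --out ${nonpar}"
  # 2) PAR = everything that is not in the non-PAR output
  echo "vcftools --gzvcf ${input} --exclude-positions ${nonpar}.recode.vcf --recode --out ${label}.PAR.gwas.data"
}

for sex_label in males females; do
  split_cmds "$sex_label"
done
```

Once the printed commands look right, they can be executed by piping the output to sh, for example split_cmds females | sh, provided vcftools is installed and on the PATH.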

Download

Minimac3 is currently available as a pre-release. The source files (and a binary executable) can be downloaded from Source Files, and commonly used reference panels in VCF and M3VCF formats can be downloaded from Reference Panels.

Useful Wiki Pages

There are a few pages in this Wiki that may be useful for Minimac3 users. Here are links to a few:

Contact

For questions or bug reports, please contact Sayantan Das.