Minimac3 Cookbook : Converting Files to VCF
The imputation workflow therefore consists of the following steps.
Convert GWAS Panel Files into VCF
If pre-phased GWAS data is available in VCF format, users can skip this step. Otherwise, the following steps show how to convert other format files to VCF format.
- PLINK: Use PLINK2 (available here) as follows:
plink --bfile Gwas.Chr20.Phased.Output \
      --recode vcf \
      --out Gwas.Chr20.Phased.Output.VCF.format
- MaCH: Use Mach2VCF (coming soon) as follows:
mach2VCF --haps Gwas.Chr20.Phased.Output.hap \
         --snps Gwas.Chr20.Phased.Output.snps \
         --prefix Gwas.Chr20.Phased.Output.VCF.format
- SHAPEIT: Use SHAPEIT (available here) as follows:
shapeit -convert \
        --input-haps Gwas.Chr20.Phased.Output \
        --output-vcf Gwas.Chr20.Phased.Output.VCF.format.vcf
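Whichever converter is used, it can be worth confirming that the output looks like a VCF file before moving on. The following is a minimal sketch, not part of Minimac3 or any of the tools above: the function name check_vcf_header is hypothetical, and it assumes an uncompressed VCF whose first line is the ##fileformat declaration.

```shell
# Hypothetical helper (not shipped with Minimac3, PLINK, or SHAPEIT).
# Assumes an uncompressed VCF file is passed as the first argument.
check_vcf_header() {
    vcf="$1"
    # A VCF file starts with a ##fileformat=VCFv... declaration.
    if ! head -n 1 "$vcf" | grep -q '^##fileformat=VCF'; then
        echo "missing ##fileformat line: $vcf" >&2
        return 1
    fi
    # The column header line (#CHROM POS ID ...) must also be present.
    if ! grep -q '^#CHROM' "$vcf"; then
        echo "missing #CHROM header line: $vcf" >&2
        return 1
    fi
    return 0
}
```

For example, check_vcf_header Gwas.Chr20.Phased.Output.VCF.format.vcf returns a non-zero status if either header line is absent.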
Download Reference Panel
Commonly used reference panels include 1000 Genomes Phase 3 (2,535 samples), 1000 Genomes Phase 1 (1,094 samples), HapMap2 (269 samples), and the Haplotype Reference Consortium (32,914 samples). Users are advised to use either 1000 Genomes Phase 3 (available for download in Reference Panels) or the Haplotype Reference Consortium panel, which cannot be shared publicly because of data privacy restrictions but can be used for imputation remotely through the imputation server hosted at the University of Michigan. Reference panels for different versions of 1000 Genomes, in both VCF and M3VCF format, are available for download in Reference Panels.
Impute Samples
The final step is to run Minimac3 to perform the imputation. Now that we have the pre-phased GWAS panel (in VCF format) and the appropriate reference panel (in VCF or M3VCF format), we can run Minimac3 as follows. The first example uses a VCF file as the reference (obtained as explained above), and the second uses an M3VCF file (which might have been downloaded from the links below or created on a previous run of Minimac3).
../bin/Minimac3 --refHaps ReferencePanel.Chr20.1000Genomes.vcf \
                --haps Gwas.Chr20.Phased.Output.VCF.format.vcf \
                --prefix Gwas.Chr20.Imputed.Output

../bin/Minimac3 --refHaps ReferencePanel.Chr20.1000Genomes.m3vcf \
                --haps Gwas.Chr20.Phased.Output.VCF.format.vcf \
                --prefix Gwas.Chr20.Imputed.Output
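Because the commands above expect a pre-phased GWAS panel, it may help to confirm that the VCF passed to --haps contains no unphased genotypes. The sketch below is an illustration only: count_unphased is a hypothetical helper, not a Minimac3 feature, and it assumes an uncompressed, tab-delimited VCF in which GT is the first FORMAT subfield. It counts genotype entries that use the unphased separator "/" rather than the phased separator "|".

```shell
# Hypothetical helper (not part of Minimac3).
# Prints the number of sample genotypes written with "/" (unphased).
# Assumes an uncompressed VCF with GT as the first FORMAT subfield.
count_unphased() {
    awk -F'\t' '
        /^#/ { next }                      # skip meta and header lines
        {
            for (i = 10; i <= NF; i++) {   # sample columns start at field 10
                split($i, parts, ":")      # GT is the first colon-separated subfield
                if (index(parts[1], "/") > 0) n++
            }
        }
        END { print n + 0 }                # print 0 when nothing was counted
    ' "$1"
}
```

A result of 0 from count_unphased Gwas.Chr20.Phased.Output.VCF.format.vcf suggests every genotype is phased; any other value points to records that should be phased before imputation.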