Difference between revisions of "Minimac3 Cookbook : Converting Files to VCF"

From Genome Analysis Wiki
Revision as of 22:07, 29 January 2015

Introduction

Once pre-phasing is complete, the phased GWAS panel files (obtained in the pre-phasing step) must be converted to VCF format before imputation can begin, because Minimac3 can only use VCF format files. If the pre-phased GWAS data is already available in VCF format, users can skip this step. Otherwise, the following sections show how to convert files in other formats to VCF.
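
To decide whether the conversion step can be skipped, it may help to check whether a file is already in VCF format. The sketch below is only an illustration: the function name `is_vcf` and the example file name are hypothetical. It relies on the fact that a VCF file begins with a `##fileformat=VCF...` header line.

```shell
# Succeed (exit status 0) if the first line of the file is a VCF fileformat header.
# Works for plain-text and gzip-compressed files. is_vcf is a hypothetical helper name.
is_vcf() {
    vcf_candidate="$1"
    [ -r "$vcf_candidate" ] || return 1
    if gzip -t "$vcf_candidate" 2>/dev/null; then
        # File is gzip-compressed: decompress to stdout and read the first line.
        vcf_first_line=$(gzip -dc "$vcf_candidate" | head -n 1)
    else
        vcf_first_line=$(head -n 1 "$vcf_candidate")
    fi
    case "$vcf_first_line" in
        '##fileformat=VCF'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use; the file name is an assumption, not a file produced earlier in this cookbook.
if is_vcf Gwas.Chr20.Phased.Output.vcf; then
    echo "Already in VCF format: conversion can be skipped"
else
    echo "Not a readable VCF file: convert it first"
fi
```

This inspects only the header line, so it does not validate the rest of the file.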

Convert PLINK Files

Use PLINK2 (available at https://www.cog-genomics.org/plink2) as follows:

plink --bfile Gwas.Chr20.Phased.Output \
      --recode vcf \
      --out Gwas.Chr20.Phased.Output.VCF.format
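
When several chromosomes need converting, the same command can be generated in a loop. The following is a dry-run sketch that only prints the commands. The helper name `plink_vcf_cmd` is hypothetical, and the `Gwas.Chr<N>.Phased.Output` naming pattern is an assumption extrapolated from the chromosome 20 example above.

```shell
# Print (do not run) the PLINK conversion command for one chromosome.
# Only the flags shown in the example above are used.
plink_vcf_cmd() {
    # Assumed naming pattern: Gwas.Chr<N>.Phased.Output
    plink_prefix="Gwas.Chr$1.Phased.Output"
    printf 'plink --bfile %s --recode vcf --out %s.VCF.format\n' \
        "$plink_prefix" "$plink_prefix"
}

# Dry run over autosomes 1 to 22: review the printed commands before executing them.
chr=1
while [ "$chr" -le 22 ]; do
    plink_vcf_cmd "$chr"
    chr=$((chr + 1))
done
```

Printing the commands first makes it easy to check the file names before anything is run. The output can then be piped to `sh` if it looks right.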


Convert MaCH Files

Use Mach2VCF (coming soon) as follows:

mach2VCF --haps Gwas.Chr20.Phased.Output.hap \
         --snps Gwas.Chr20.Phased.Output.snps \
         --prefix Gwas.Chr20.Phased.Output.VCF.format


Convert SHAPEIT Files

Use SHAPEIT (available at https://mathgen.stats.ox.ac.uk/genetics_software/shapeit/shapeit.html#download) as follows:

shapeit -convert \
        --input-haps Gwas.Chr20.Phased.Output \
        --output-vcf Gwas.Chr20.Phased.Output.VCF.format.vcf
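
Whichever converter is used, it can be worth confirming that the resulting VCF still carries phase information, since in VCF phased genotypes are written with `|` and unphased genotypes with `/`. The sketch below counts unphased genotype fields. The function name `count_unphased` is hypothetical, and the sketch assumes an uncompressed, tab-delimited VCF in which GT is the first FORMAT subfield.

```shell
# Count sample genotype fields that use the unphased separator "/".
# count_unphased is a hypothetical helper; it expects an uncompressed VCF.
count_unphased() {
    awk -F '\t' '
        /^#/ { next }                          # skip meta-information and header lines
        {
            for (i = 10; i <= NF; i++) {       # sample columns start at column 10
                split($i, parts, ":")          # GT is the first colon-separated subfield
                if (index(parts[1], "/") > 0) n++
            }
        }
        END { print n + 0 }                    # print 0 when nothing was counted
    ' "$1"
}
```

For example, `count_unphased Gwas.Chr20.Phased.Output.VCF.format.vcf` prints the number of unphased genotype fields in the SHAPEIT output named above. A non-zero count suggests that phase was lost somewhere before or during conversion.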


Contact

For queries or bug reports, please contact Sayantan Das (sayantan@umich.edu).