Line 1: |
Line 1: |
| + | = Introduction = |
| | | |
− | Chromosome X has a pseudo-autosomal region (PAR) which can be imputed for males and females together. Imputing the PAR on chromosome X is same as usual imputation, since both males and females are diploids at these sites. However, the non pseudo-autosomal region needs to be imputed for males and females separately, as males are haploids while females are diploids. Of course, the PAR and non-PAR regions need to be imputed separately. See our wiki page on [[Minimac3 Cookbook : Chromosome X Imputation |Chromosome X Imputation]] for details on imputing chromosome X.
| + | [http://genome.sph.umich.edu/wiki/Minimac3 '''Minimac3 '''] is a lower-memory and more computationally efficient implementation of [http://genome.sph.umich.edu/wiki/Minimac2 minimac2]. It is an algorithm for genotype imputation that works on phased genotypes and is designed to handle very large reference panels with no loss of accuracy. |
| | | |
− | The following example illustrates imputation on the non-PAR of chromosome X for males and females separately (files available in <code>Minimac3/test/</code> directory)
| + | This wiki page is designed to give users a '''detailed step-by-step description of imputing chromosome X'''. |
| | | |
− | Male Samples (Non-PAR)
| + | = Chromosome X Imputation = |
− | ../bin/Minimac3 --refHaps refPanelChrX.Non.Auto.vcf --haps targetStudyChrX.males.vcf --prefix testRun
| |
| | | |
− | Female Samples (Non-PAR)
| + | Chromosome X has a pseudo-autosomal region (PAR), which can be imputed for males and females together. Imputing the PAR on chromosome X is the same as usual imputation, since both males and females are diploid at these sites. However, the non-pseudo-autosomal region (non-PAR) needs to be imputed for males and females separately, as males are haploid there while females are diploid. The PAR and non-PAR therefore also need to be imputed separately from each other. The steps involved in imputing chromosome X are as follows. |
− | ../bin/Minimac3 --refHaps refPanelChrX.Non.Auto.vcf --haps targetStudyChrX.females.vcf --prefix testRun
| |
| | | |
− | NOTE: For imputing non-PAR of chromosome X, user must analyze male and female samples separately, otherwise program would crash. User should also ensure that the reference panel consists of only PAR or non-PAR region of chromosome X, otherwise program would crash. | + | * '''Convert files to VCF Format:''' Start by converting the unphased, quality controlled data set into VCF format. See our wiki page on [[Minimac3 Cookbook : Converting Files to VCF| Converting to VCF]] for more details on how to convert. |
| + | |
| + | * '''Split the data into PAR and non-PAR:''' Separate the pseudo-autosomal part and non-pseudo-autosomal part into separate files. The non-PAR is located on <font face=Courier>'''chrX:2699520-154931043'''</font> on build hg19. The split can be done for VCF files as follows. |
| + | |
| + | vcftools --gzvcf gwas.data.vcf.gz \ |
| + | --chr X \ |
| + | --from-bp 2699520 \ |
| + | --to-bp 154931043 \ |
| + | --recode \ |
| + | --out Non.PAR.gwas.data |
| + | |
| + | vcftools --gzvcf gwas.data.vcf.gz \ |
| + | --exclude-positions Non.PAR.gwas.data.recode.vcf \ |
| + | --recode \ |
| + | --out PAR.gwas.data |
| + | |
| + | '''NOTE''': After this step, please verify that the male samples have only one haplotype in <font face=Courier>Non.PAR.gwas.data.recode.vcf</font> and two haplotypes in <font face=Courier>PAR.gwas.data.recode.vcf</font>. |
| + | |
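The check described in the note above could be sketched as follows. This is only an illustration, not part of Minimac3 or vcftools: the helper name <code>count_diploid_male_calls</code> is hypothetical, and it assumes an uncompressed VCF, a non-empty list of male sample IDs (one per line, matching the VCF header), and that GT is the first FORMAT subfield.

```shell
# Hypothetical helper (not part of Minimac3 or vcftools).
# usage: count_diploid_male_calls <vcf> <male-id-list>
# Prints the number of genotype calls, among the listed male samples,
# that contain an allele separator ("/" or "|") and so look diploid.
# Assumes the male ID list is non-empty.
count_diploid_male_calls() {
    awk -F'\t' '
        NR == FNR { male[$1] = 1; next }   # first file: one male sample ID per line
        /^##/     { next }                 # skip VCF meta-information lines
        /^#CHROM/ {                        # header line: remember the male columns
            for (i = 10; i <= NF; i++) if ($i in male) col[i] = 1
            next
        }
        {
            for (i in col) {
                split($i, f, ":")          # GT is assumed to be the first subfield
                if (f[1] ~ "[/|]") n++
            }
        }
        END { print n + 0 }
    ' "$2" "$1"
}

# Example call (file names taken from the steps on this page):
#   count_diploid_male_calls Non.PAR.gwas.data.recode.vcf male.sample.list
```

A result of 0 on the non-PAR file is what the note asks for. On the PAR file, the same count should equal the number of male genotype calls, since males are diploid there.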
| + | * '''Split the non-PAR data by Sex:''' Separate the non-PAR data by sex, which can also be done with vcftools as follows. Note that <font face=Courier>PAR.gwas.data.recode.vcf</font> need NOT be separated, since both males and females are diploid there. |
| + | |
| + | # For females, use female.sample.list and Female.Non.PAR.gwas.data instead |
| + | vcftools --vcf Non.PAR.gwas.data.recode.vcf \ |
| + | --keep male.sample.list \ |
| + | --recode \ |
| + | --out Male.Non.PAR.gwas.data |
| + | |
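The sample lists passed to <font face=Courier>--keep</font> contain one sample ID per line, as named in the VCF header. If sex information is stored in a PLINK <font face=Courier>.fam</font> file, the lists might be produced as sketched below. This is an assumption for illustration only: the helper name <code>make_sex_lists</code> and the file name <code>study.fam</code> are hypothetical, and the sketch assumes that the VCF sample IDs match the individual IDs in column 2 of the <font face=Courier>.fam</font> file.

```shell
# Hypothetical helper (not part of Minimac3 or vcftools).
# usage: make_sex_lists <fam-file>
# Writes male.sample.list and female.sample.list to the current directory.
# PLINK .fam columns: FID IID PID MID SEX PHENO, with SEX 1 = male, 2 = female.
# Samples with unknown sex (any other code) are left out of both lists.
make_sex_lists() {
    awk '$5 == 1 { print $2 }' "$1" > male.sample.list
    awk '$5 == 2 { print $2 }' "$1" > female.sample.list
}

# Example call (study.fam is a placeholder name):
#   make_sex_lists study.fam
```

If the VCF was created with different sample naming (for example, family ID and individual ID joined together), the printed field would need to be adjusted to match the VCF header.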
| + | * '''Pre-phase PAR data and female non-PAR data:''' Of the three resulting data sets, only the PAR data and the female non-PAR data have two haplotypes per sample and thus need to be phased, while the male non-PAR data is haploid and need not be phased. See our wiki pages on [[Minimac3 Cookbook : Pre-Phasing| Pre-Phasing]] and [[Minimac3 Cookbook : Converting Files to VCF| Converting to VCF]] for further details on pre-phasing and converting files back to VCF format. |
| + | |
| + | * '''Impute Data:''' The following example illustrates how to impute into the phased PAR data (males and females together), the phased female non-PAR data and the haploid male non-PAR data (the file obtained after splitting the non-PAR by sex): |
| + | |
| + | # Phased All Samples (PAR) |
| + | ../bin/Minimac3 --refHaps refPanelChrX.Auto.vcf \ |
| + | --haps Phased.PAR.gwas.data.vcf \ |
| + | --prefix testRun.All.PAR |
| + | |
| + | # Phased Female Samples (Non-PAR) |
| + | ../bin/Minimac3 --refHaps refPanelChrX.Non.Auto.vcf \ |
| + | --haps Phased.Female.Non.PAR.gwas.data.vcf \ |
| + | --prefix testRun.females.Non.PAR |
| + | |
| + | # Haploid Male Samples (Non-PAR) |
| + | ../bin/Minimac3 --refHaps refPanelChrX.Non.Auto.vcf \ |
| + | --haps Male.Non.PAR.gwas.data.recode.vcf \ |
| + | --prefix testRun.males.Non.PAR |
| + | |
| + | * '''NOTE:''' When imputing the non-PAR of chromosome X, the user must analyze male and female samples separately, otherwise the program will crash. The user should also ensure that the reference panel contains only the PAR or only the non-PAR of chromosome X, otherwise the program will crash. |
| + | |
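Before running the imputation, it may help to confirm that a file contains only PAR or only non-PAR sites. The sketch below classifies a VCF by position using the hg19 non-PAR interval given above (2699520-154931043). It is only an illustration: the helper name <code>chrx_region</code> is hypothetical, and it assumes an uncompressed VCF that contains chromosome X records only.

```shell
# Hypothetical helper (not part of Minimac3 or vcftools).
# usage: chrx_region <vcf>
# Prints "PAR", "non-PAR", "mixed" or "empty" depending on where the site
# positions fall relative to the hg19 non-PAR interval stated on this page.
# Only the POS column is inspected, so the file is assumed to hold chrX only.
chrx_region() {
    awk -F'\t' -v lo=2699520 -v hi=154931043 '
        /^#/ { next }                              # skip header lines
        ($2 >= lo && $2 <= hi) { nonpar++; next }  # inside the non-PAR interval
        { par++ }                                  # everything else counts as PAR
        END {
            if (nonpar && par) print "mixed"
            else if (nonpar)   print "non-PAR"
            else if (par)      print "PAR"
            else               print "empty"
        }
    ' "$1"
}

# Example call (file name taken from the imputation commands on this page):
#   chrx_region refPanelChrX.Non.Auto.vcf
```

A result of "mixed" suggests the file still needs to be split into PAR and non-PAR as described in the steps above.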
| + | = Download = |
| + | |
| + | '''Minimac3 ''' is currently available as a pre-release. The source files (and binary executable) are available for download in [[Minimac3#Download | Source Files]] and commonly used reference panels in VCF and <font face=Courier>M3VCF</font> formats are available for download in [[Minimac3#Reference Panels for Download | Reference Panels]]. |
| + | |
| + | = Useful Wiki Pages = |
| + | |
| + | There are a few pages in this Wiki that may be useful for '''Minimac3''' users. Here are links to a few: |
| + | |
| + | * [[Minimac3| Minimac3 Overview Page]] |
| + | |
| + | * [[Minimac3 Usage | Minimac3 Usage and Documentation]] |
| + | |
| + | * [[Minimac3 Imputation Cookbook]] ('''Recommended for New Users!!''') |
| + | |
| + | * [[Minimac3 Cookbook : Chromosome X Imputation | Chromosome X Imputation ]] |
| + | |
| + | * [[Minimac3 Cookbook : Pre-Phasing | Pre-Phasing ]] |
| + | |
| + | * [[Minimac3 Examples| Minimac3 Examples]] |
| + | |
| + | * [[M3VCF Files| M3VCF Files]] |
| + | |
| + | = Contact = |
| + | |
| + | For any queries or bug reports, please contact [mailto:sayantan@umich.edu Sayantan Das]. |