Line 127: |
Line 127: |
| == Imputation into Phased Haplotypes == | | == Imputation into Phased Haplotypes == |
| | | |
− | Imputing genotypes using '''minimac''' is an easy straightforward process: after selecting a set of reference haplotypes, plugging in the target haplotypes from the previous step and setting the number of rounds to use for the model parameter estimation, imputation should proceed rapidly. | + | Imputing genotypes using '''minimac''' is an easy and straightforward process: after selecting a set of reference haplotypes, plugging in the target haplotypes from the previous step and setting the number of rounds to use for the model parameter estimation, imputation should proceed rapidly. |
| | | |
| === Running Minimac === | | === Running Minimac === |
Line 166: |
Line 166: |
| === Reference Haplotypes === | | === Reference Haplotypes === |
| | | |
− | Reference haplotypes generated by the 1000 Genomes project and formatted so that they are ready for analysis are available from the [http://www.sph.umich.edu/csg/abecasis/MACH/download/1000G-2010-08.html MaCH download page]. The most recent set of haplotypes is based on genotype calls from August 2010. | + | Reference haplotypes generated by the 1000 Genomes project and formatted so that they are ready for analysis are available from the [http://www.sph.umich.edu/csg/abecasis/MACH/download/1000G-2010-08.html MaCH download page]. The most recent set of haplotypes is based on genotype calls from the August 2010 data freeze. |
| | | |
| === Imputation quality evaluation === | | === Imputation quality evaluation === |
− | Minimac drops each of the genotyped SNPs in turn and then calculates 3 statistics: | + | To evaluate imputation quality, Minimac hides data for each genotyped SNP in turn and calculates 3 statistics: |
| * looRSQ - this is the estimated rsq for that SNP (as if the SNP weren't typed). | | * looRSQ - this is the estimated rsq for that SNP (as if the SNP weren't typed). |
| * empR - this is the empirical correlation between true and imputed genotypes for the SNP. If this is negative, the SNP is probably flipped. | | * empR - this is the empirical correlation between true and imputed genotypes for the SNP. If this is negative, the SNP is probably flipped. |
| * empRSQ - this is the actual R2 value, comparing imputed and true genotypes. | | * empRSQ - this is the actual R2 value, comparing imputed and true genotypes. |
| | | |
− | These statistics can be found in the *.info file. | + | These statistics can be found in the .info file. |
| | | |
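The two empirical statistics above can be illustrated with a short sketch. This is not Minimac's own code: it assumes empR is a plain Pearson correlation between the true genotypes (coded 0, 1 or 2 copies of an allele) and the imputed dosages for one masked SNP, and that empRSQ is its square. The function names <code>pearson_r</code> and <code>masked_snp_stats</code> are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # A monomorphic SNP has no variance, so the correlation is undefined.
        return float("nan")
    return sxy / sqrt(sxx * syy)


def masked_snp_stats(true_genotypes, imputed_dosages):
    """Assumed definitions: empR is the correlation, empRSQ its square."""
    r = pearson_r(true_genotypes, imputed_dosages)
    return {"empR": r, "empRSQ": r * r}


# Illustrative (made-up) data for a single masked SNP in six individuals.
true_genotypes = [0, 1, 2, 1, 0, 2]
imputed_dosages = [0.1, 0.9, 1.8, 1.2, 0.2, 1.9]
stats = masked_snp_stats(true_genotypes, imputed_dosages)

# Counting the other allele (dosage -> 2 - dosage) mimics a flipped SNP:
# empR changes sign while empRSQ stays the same.
flipped = masked_snp_stats(true_genotypes, [2 - d for d in imputed_dosages])
```

Under these assumptions, a negative empR with an otherwise high empRSQ is the pattern that suggests a flipped SNP, as noted in the list above.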
| === X Chromosome Imputation === | | === X Chromosome Imputation === |
Line 187: |
Line 187: |
| | | |
| | | |
− | :::: '''<Example of a male only pedigree file >''' | + | :::: '''<Example of a male only pedigree file>''' |
− | :::: FAM1003 ID1234 0 0 M A/0 A/0 C/0 | + | :::: FAM1003 ID1234 0 0 M A/A A/A C/C |
− | :::: FAM1004 ID5678 0 0 M 0/0 C/0 G/0 | + | :::: FAM1004 ID5678 0 0 M 0/0 C/0 G/G |
| :::: ... | | :::: ... |
| :::: '''<End of pedigree file>''' | | :::: '''<End of pedigree file>''' |
| | | |
| + | ''Note that, consistent with the Merlin convention, hemizygous males are listed as if they were homozygous.'' |
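As a minimal sketch of that convention, the hypothetical helper below writes one line of a male-only pedigree file from a list of single X-chromosome alleles, repeating each allele so that the male appears homozygous. The function name, the argument layout and the use of <code>0</code> for a missing allele are assumptions made for illustration; they are not part of minimac or Merlin.

```python
def format_male_pedigree_line(family_id, individual_id, alleles):
    """Build one pedigree line for a male with unknown parents (0 0).

    Each entry of `alleles` is the single observed X-chromosome allele
    at a marker, or "0" if it is missing (assumed missing code).
    """
    # Hemizygous calls are written as if homozygous: "A" becomes "A/A".
    genotypes = [f"{allele}/{allele}" for allele in alleles]
    return " ".join([family_id, individual_id, "0", "0", "M"] + genotypes)


line = format_male_pedigree_line("FAM1003", "ID1234", ["A", "A", "C"])
print(line)
```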
| | | |
| :::: '''<Example of the corresponding haplotype file>''' | | :::: '''<Example of the corresponding haplotype file>''' |
Line 201: |
Line 202: |
| :::: ... | | :::: ... |
| :::: '''<End of the corresponding haplotype file>''' | | :::: '''<End of the corresponding haplotype file>''' |
− |
| |
| | | |
| = post-imputation association analysis = | | = post-imputation association analysis = |