Changes

From Genome Analysis Wiki
155 bytes added, 17:54, 10 March 2011
Line 127: Line 127:  
== Imputation into Phased Haplotypes ==
 
== Imputation into Phased Haplotypes ==
   −
Imputing genotypes using '''minimac''' is an easy straightforward process: after selecting a set of reference haplotypes, plugging-in the target haplotypes from the previous step and setting the number of rounds to use for the model parameter estimation, imputation should proceed rapidly.
+
Imputing genotypes using '''minimac''' is an easy and straightforward process: after selecting a set of reference haplotypes, plugging-in the target haplotypes from the previous step and setting the number of rounds to use for the model parameter estimation, imputation should proceed rapidly.
    
=== Running Minimac ===
 
=== Running Minimac ===
Line 166: Line 166:  
=== Reference Haplotypes ===
 
=== Reference Haplotypes ===
   −
Reference haplotypes generated by the 1000 Genomes project and formatted so that they are ready for analysis are available from the [http://www.sph.umich.edu/csg/abecasis/MACH/download/1000G-2010-08.html MaCH download page]. The most recent set of haplotypes are based on genotype calls from August 2010.
+
Reference haplotypes generated by the 1000 Genomes project and formatted so that they are ready for analysis are available from the [http://www.sph.umich.edu/csg/abecasis/MACH/download/1000G-2010-08.html MaCH download page]. The most recent set of haplotypes are based on genotype calls from the August 2010 data freeze.
    
=== Imputation quality evaluation ===
 
=== Imputation quality evaluation ===
   −
Minimac drops each of the genotyped SNPs in turn and then calculates 3 statistics:
+
To evaluate imputation quality, Minimac hides data for each genotyped SNP in turn and calculates 3 statistics:
 
* looRSQ - this is the estimated rsq for that SNP (as if SNP weren't typed).  
 
* looRSQ - this is the estimated rsq for that SNP (as if SNP weren't typed).  
 
* empR - this is the empirical correlation between true and imputed genotypes for the SNP. If this is negative, the SNP is probably flipped.  
 
* empR - this is the empirical correlation between true and imputed genotypes for the SNP. If this is negative, the SNP is probably flipped.  
 
* empRSQ - this is the actual R2 value, comparing imputed and true genotypes.  
 
* empRSQ - this is the actual R2 value, comparing imputed and true genotypes.  
   −
These statistics can be found in the *.info file
+
These statistics can be found in the .info file
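
As a rough illustration of the two empirical statistics listed above, the following sketch computes a Pearson correlation between true genotypes (coded as allele counts 0/1/2) and imputed dosages, then squares it. This is an assumption about how empR and empRSQ relate, based only on the descriptions above; Minimac's exact calculation may differ. The function name and the example data are hypothetical.

```python
from math import sqrt


def pearson_r(true_counts, imputed_dosages):
    """Pearson correlation between true allele counts and imputed dosages.

    Hypothetical helper: an illustration of an 'empirical correlation',
    not Minimac's own implementation.
    """
    n = len(true_counts)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_t = sum(true_counts) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_counts, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_counts)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        return float("nan")  # correlation is undefined for a monomorphic SNP
    return cov / sqrt(var_t * var_d)


# Made-up example data for a single SNP (not real results).
true_counts = [0, 1, 2, 1, 0, 2]
imputed = [0.1, 0.9, 1.8, 1.2, 0.2, 1.9]

emp_r = pearson_r(true_counts, imputed)
emp_rsq = emp_r ** 2  # assumption: squared correlation

# Counting the other allele (2 - dosage) reverses the sign of the correlation,
# which is why a negative value suggests a flipped SNP.
flipped_r = pearson_r(true_counts, [2 - d for d in imputed])
```

In this sketch, a SNP whose alleles are swapped between the reference and target data yields the same squared correlation but a negative correlation, so the sign is the useful diagnostic.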
    
=== X Chromosome Imputation ===
 
=== X Chromosome Imputation ===
Line 187: Line 187:  
   −
::::  '''<Example of a male only pedigree file >'''
+
::::  '''<Example of a male only pedigree file>'''
   −
:::: FAM1003  ID1234  0  0  M  A/0   A/0   C/0
+
:::: FAM1003  ID1234  0  0  M  A/A   A/A   C/C
   −
:::: FAM1004  ID5678  0  0  M  0/0  C/0  G/0
+
:::: FAM1004  ID5678  0  0  M  0/0  C/0  G/G
 
::::  ...
 
::::  ...
 
::::  '''<End of pedigree file>'''
 
::::  '''<End of pedigree file>'''
    +
''Note that, consistent with the Merlin convention, hemizygous males are listed as if they were homozygous.''
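
The recoding described in the note can be sketched as below. This is a minimal illustration, assuming genotypes are written as <code>allele/allele</code> with <code>0</code> marking a missing allele, as in the example rows; the helper name is hypothetical and is not part of Merlin or minimac.

```python
MISSING = "0"  # assumption: "0" denotes a missing allele, as in the example rows


def hemizygous_to_homozygous(genotype):
    """Rewrite a one-allele male genotype such as 'A/0' as 'A/A'.

    Fully missing ('0/0') and already two-allele genotypes are returned unchanged.
    """
    first, second = genotype.split("/")
    observed = [allele for allele in (first, second) if allele != MISSING]
    if len(observed) == 1:
        return f"{observed[0]}/{observed[0]}"
    return genotype


# Apply the recoding to the genotype columns of one pedigree row.
# The first five fields are family ID, individual ID, father, mother, and sex.
row = "FAM1003  ID1234  0  0  M  A/0   A/0   C/0".split()
recoded = row[:5] + [hemizygous_to_homozygous(g) for g in row[5:]]
recoded_line = "  ".join(recoded)
```

Here <code>recoded_line</code> holds the row with every single-allele genotype doubled, while missing genotypes stay as <code>0/0</code>.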
    
::::  '''<Example of the corresponding haplotype file>'''
 
::::  '''<Example of the corresponding haplotype file>'''
Line 201: Line 202:  
::::  ...
 
::::  ...
 
::::  '''<End of the corresponding haplotype file>'''
 
::::  '''<End of the corresponding haplotype file>'''
      
= post-imputation association analysis =
 
= post-imputation association analysis =
