From Genome Analysis Wiki


55 bytes added, 07:29, 24 June 2010
== How do I get imputation quality estimates? ==
A simple approach is to use the --mask option (in the second step only, if using two-step imputation). For example, --mask 0.02 masks 2% of the genotypes at random, imputes them, and compares the imputed genotypes with the masked originals to estimate genotypic and allelic error rates. Messages like the following will be written to stdout:
Comparing 948352 masked genotypes with MLE estimates ...
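The idea behind this comparison can be illustrated with a minimal Python sketch. It is not MaCH's implementation: the function names, the "A/G" genotype encoding, and the "0/0" missing code are assumptions made for illustration, and the error definitions used here (a genotype is wrong if the unordered allele pair differs; the allelic error counts mismatched alleles out of two per genotype) may differ in detail from what the program reports.

```python
import random
from collections import Counter


def mask_genotypes(genotypes, fraction, seed=0):
    """Hide a random fraction of genotypes; return (masked copy, masked indices).

    "0/0" as the missing-genotype code is an assumption for this sketch.
    """
    rng = random.Random(seed)
    n_mask = round(len(genotypes) * fraction)
    masked_idx = sorted(rng.sample(range(len(genotypes)), n_mask))
    masked = list(genotypes)
    for i in masked_idx:
        masked[i] = "0/0"
    return masked, masked_idx


def error_rates(truth, imputed):
    """Return (genotypic error rate, allelic error rate) for paired genotypes."""
    geno_err = 0
    allele_err = 0
    for t, m in zip(truth, imputed):
        # Multiset intersection counts alleles shared regardless of order.
        shared = sum((Counter(t.split("/")) & Counter(m.split("/"))).values())
        allele_err += 2 - shared
        geno_err += shared < 2
    n = len(truth)
    return geno_err / n, allele_err / (2 * n)


if __name__ == "__main__":
    truth = ["A/A", "A/G", "G/G", "A/G"]
    imputed = ["A/A", "G/A", "A/G", "A/A"]
    print(error_rates(truth, imputed))
```

In this toy example two of the four genotypes are wrong, each by a single allele, so the genotypic error rate is higher than the allelic one.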
A better approach is to mask a small proportion of SNPs (rather than individual genotypes, as in the simple approach above). One can generate a mask.dat from the original .dat file by changing the flag of a subset of markers from M to S2; the .ped file does not need to be duplicated. After imputation, one can use [ CalcMatch ] and [ ] to estimate the genotypic/allelic error rate and the correlation, respectively. Both programs can be downloaded from [].
'''Warning''': Imputation on masked datasets should be run separately, solely to estimate imputation quality. For production imputation, one should use all available information.
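The mask.dat step described above can be sketched in Python as follows. This is a hedged illustration only: it assumes each .dat line holds a flag and a marker name separated by whitespace, and the function name mask_markers and the choice of a random subset are hypothetical, not part of any distributed tool.

```python
import random


def mask_markers(dat_lines, fraction, seed=0):
    """Change the flag of a random subset of markers from M to S2.

    Returns (new .dat lines, names of the masked markers).
    Assumes each line looks like "<flag> <name>"; other lines pass through.
    """
    rng = random.Random(seed)
    # Only rows flagged M (with a marker name) are candidates for masking.
    marker_rows = [
        i for i, line in enumerate(dat_lines)
        if len(line.split()) >= 2 and line.split()[0] == "M"
    ]
    n_mask = round(len(marker_rows) * fraction)
    chosen = set(rng.sample(marker_rows, n_mask))
    out = []
    for i, line in enumerate(dat_lines):
        if i in chosen:
            name = line.split()[1]
            out.append(f"S2 {name}")
        else:
            out.append(line.rstrip("\n"))
    masked_names = [dat_lines[i].split()[1] for i in sorted(chosen)]
    return out, masked_names


if __name__ == "__main__":
    dat = ["M rs1", "M rs2", "M rs3", "M rs4", "A trait"]
    new_dat, masked = mask_markers(dat, 0.5, seed=42)
    print("\n".join(new_dat))
    print("masked markers:", masked)
```

The returned list of masked marker names is the set of SNPs at which imputed genotypes would later be compared with the original data.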
== Shall I apply QC before or after imputation? If so, how? ==
