From Genome Analysis Wiki
667 bytes added, 15:41, 20 May 2010
Line 115: |
Line 115: |
| A: A simple approach is to use the --mask option. For example, --mask 0.02 masks 2% of the genotypes at random, imputes them, and compares the imputed values with the masked originals to estimate genotypic and allelic error rates. Messages like the following are written to stdout: | | A: A simple approach is to use the --mask option. For example, --mask 0.02 masks 2% of the genotypes at random, imputes them, and compares the imputed values with the masked originals to estimate genotypic and allelic error rates. Messages like the following are written to stdout: |
| | | |
− | Comparing 948352 masked genotypes with MLE estimates ...<br> Estimated per genotype error rate is 0.0568<br> Estimated per allele error rate is 0.0293<br><br> | + | Comparing 948352 masked genotypes with MLE estimates ... |
| + | Estimated per genotype error rate is 0.0568 |
| + | Estimated per allele error rate is 0.0293 |
| | | |
− | <br> | + | A better approach is to mask a small proportion of SNPs (rather than individual genotypes, as in the simple approach above). One can generate a mask.dat from the original .dat file by changing the flag of a subset of markers from M to S2, without duplicating the .ped file. After imputation, one can use CalcMatch and doseR2.pl to estimate the genotypic/allelic error rate and the correlation, respectively. Both programs can be downloaded from http://www.sph.umich.edu/csg/ylwtx/software.html<br> |
| + | |
| + | Warning: Imputation on masked datasets should be run separately, solely to estimate imputation quality. For production, one should use all available information.<br> |
| | | |
| '''Q: How do I get reference files for a region of interest? '''<br> A: (1) For HapMapII format, download http://www.sph.umich.edu/csg/ylwtx/HapMapForMach.tgz <br> |
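
The SNP-masking step described in the added text (changing the flag of a subset of markers from M to S2 in the .dat file) can be sketched as follows. This is a minimal illustration, not part of the MaCH distribution: it assumes a Merlin-style .dat file with one whitespace-separated "flag name" pair per line, and the function name `mask_markers`, the `fraction` and `seed` parameters, and the example marker names are all hypothetical.

```python
import random


def mask_markers(dat_lines, fraction, seed=0):
    """Flip a random subset of 'M' (marker) rows to 'S2' (skip).

    dat_lines: lines of the original .dat file (assumed 'flag name' format).
    fraction:  proportion of markers to mask, e.g. 0.02 (hypothetical parameter).
    Returns (new_lines, masked_names); non-marker rows are left unchanged.
    """
    rng = random.Random(seed)  # fixed seed so the masked set is reproducible

    # Indices of well-formed marker rows: first field is 'M', second is the name.
    marker_idx = []
    for i, line in enumerate(dat_lines):
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "M":
            marker_idx.append(i)

    n_mask = round(len(marker_idx) * fraction)
    chosen = set(rng.sample(marker_idx, n_mask))

    new_lines, masked_names = [], []
    for i, line in enumerate(dat_lines):
        if i in chosen:
            fields = line.split()
            masked_names.append(fields[1])
            # Only the flag changes; the .ped file stays as it is.
            new_lines.append(" ".join(["S2"] + fields[1:]))
        else:
            new_lines.append(line)
    return new_lines, masked_names


if __name__ == "__main__":
    # Made-up marker names, for illustration only.
    example = ["M rs%d" % i for i in range(1, 11)]
    masked_dat, masked = mask_markers(example, fraction=0.2, seed=1)
    print("\n".join(masked_dat))
    print("masked markers:", masked)
```

Writing `new_lines` to mask.dat and keeping `masked_names` gives the list of SNPs whose imputed genotypes can later be compared against the original data, for example with CalcMatch and doseR2.pl.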