Changes

From Genome Analysis Wiki
540 bytes added, 15:35, 20 May 2010
no edit summary
Line 103: Line 103:  
  mach1 -d sample.dat -p sample.ped -s chr20.snps -h chr20.hap --compact --greedy --autoFlip --errorMap par_infer.erate --crossoverMap par_infer.rec --mle --mldetails > mach.imp.log  
 
  mach1 -d sample.dat -p sample.ped -s chr20.snps -h chr20.hap --compact --greedy --autoFlip --errorMap par_infer.erate --crossoverMap par_infer.rec --mle --mldetails > mach.imp.log  
    +
<br> '''Q: Where can I find combined HapMap reference files? '''<br> A: http://www.sph.umich.edu/csg/yli/mach/download/HapMap-r21.html <br><br>
   −
'''Q: Where can I find combined HapMap reference files? '''<br> A: http://www.sph.umich.edu/csg/yli/mach/download/HapMap-r21.html <br><br>
+
'''Q: Where can I find HapMap III reference files? '''  
   −
'''Q: Where can I find HapMap III reference files? '''<br> A: http://www.sph.umich.edu/csg/yli/mach/download/ <br><br>  
+
'''Q: Where can I find 1000 Genomes reference files?'''<br> A: http://www.sph.umich.edu/csg/yli/mach/download/ <br>  
    
'''Q: Does --mle overwrite fed-in genotypes?'''<br> A: Yes, but rarely. --mle outputs the most likely genotype guesses by integrating over the probabilities of all possible configurations based on the reference haplotypes. Overwriting happens when the most likely guess differs from its experimental counterpart.<br><br>  
 
'''Q: Does --mle overwrite fed-in genotypes?'''<br> A: Yes, but rarely. --mle outputs the most likely genotype guesses by integrating over the probabilities of all possible configurations based on the reference haplotypes. Overwriting happens when the most likely guess differs from its experimental counterpart.<br><br>  
 +
 +
'''Q: How do I get imputation quality estimates?'''
 +
 +
A: A simple approach is to use the --mask option. For example, --mask 0.02 masks 2% of the genotypes at random, imputes them, and compares the imputed values with the masked originals to estimate genotypic and allelic error rates. Messages like the following are written to stdout:
 +
 +
Comparing 948352 masked genotypes with MLE estimates ...<br> Estimated per genotype error rate is 0.0568<br> Estimated per allele error rate is 0.0293<br><br>
 +
 +
<br>
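The two error rates can be pulled out of a saved log with standard tools. The sketch below is only an illustration: the file name mask_example.log is hypothetical, and its contents are copied from the example messages above rather than produced by a real run.

```shell
# Hypothetical log excerpt, copied from the example messages above.
# In a real run these lines would be in the file stdout was redirected to.
cat > mask_example.log <<'EOF'
Comparing 948352 masked genotypes with MLE estimates ...
Estimated per genotype error rate is 0.0568
Estimated per allele error rate is 0.0293
EOF

# The rate is the last whitespace-separated field on each matching line.
geno_err=$(awk '/per genotype error rate/ {print $NF}' mask_example.log)
allele_err=$(awk '/per allele error rate/ {print $NF}' mask_example.log)

echo "genotype=${geno_err} allele=${allele_err}"
```

Matching on the message text rather than on line position keeps the extraction working when other log lines surround the three messages.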
    
'''Q: How do I get reference files for a region of interest? '''<br> A: (1) For HapMapII format, download http://www.sph.umich.edu/csg/ylwtx/HapMapForMach.tgz <br>  
 
'''Q: How do I get reference files for a region of interest? '''<br> A: (1) For HapMapII format, download http://www.sph.umich.edu/csg/ylwtx/HapMapForMach.tgz <br>  
Line 125: Line 134:  
   awk '{print $3}' orig.hap | cut -c${first}-${last} > region.hap
 
   awk '{print $3}' orig.hap | cut -c${first}-${last} > region.hap
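As a worked illustration of the awk/cut command above, the sketch below runs it on a toy file. The file contents, the ID labels, and the values of first and last are all hypothetical; the sketch assumes the haplotype is the third whitespace-separated column and is stored as one character per SNP with no spaces between alleles.

```shell
# Hypothetical toy haplotype file: two label columns, then the allele string.
cat > orig.hap <<'EOF'
FAM1->ID1 HAPLO1 ACGTAC
FAM1->ID1 HAPLO2 AGGTCC
EOF

first=2   # 1-based index of the first SNP to keep (assumed value)
last=4    # 1-based index of the last SNP to keep (assumed value)

# Keep only the allele string, then cut out characters first..last.
awk '{print $3}' orig.hap | cut -c${first}-${last} > region.hap

cat region.hap
```

Note that the output keeps only the allele substrings, so the label columns from orig.hap are not carried over into region.hap.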
   −
<br>'''Q: Do I have to sort the pedigree file by physical positions? '''<br> A: If you use an external reference, you do not have to, as long as the external reference is in the correct order. **HOWEVER**, we strongly recommend sorting the pedigree files. <br><br>  
+
<br>'''Q: Do I have to sort the pedigree file by physical positions? '''<br> A: If you use an external reference, you do not have to, as long as the external reference is in the correct order. **HOWEVER**, we strongly recommend sorting the pedigree files. <br><br>
    
<br>'''Q: What if I specified --states R where R exceeds the maximum possible (2*number diploid individuals - 2 + number_haplotypes)? '''<br> A: mach automatically resets it to the maximum possible value.  
 
<br>'''Q: What if I specified --states R where R exceeds the maximum possible (2*number diploid individuals - 2 + number_haplotypes)? '''<br> A: mach automatically resets it to the maximum possible value.  
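The arithmetic behind the maximum can be checked by hand. The sketch below only illustrates the formula quoted in the question and the reset described in the answer; it is not mach's own code, and the counts and variable names are hypothetical.

```shell
# Hypothetical counts, chosen only for illustration.
n_diploid=100      # diploid individuals in the sample
n_haplotypes=120   # reference haplotypes
requested=500      # value passed as --states R

# Maximum from the formula quoted in the question above.
max_states=$(( 2 * n_diploid - 2 + n_haplotypes ))

# Clamp the request, mirroring the reset described in the answer.
if [ "$requested" -gt "$max_states" ]; then
  effective=$max_states
else
  effective=$requested
fi

echo "max=${max_states} effective=${effective}"
```

With these numbers the maximum is 2*100 - 2 + 120 = 318, so a request of 500 would be reduced to 318.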
   −
<br>  
+
<br>
    
'''Q: Can I use an unphased reference?'''<br>  
 
'''Q: Can I use an unphased reference?'''<br>  