Changes

531 bytes added ,  16:01, 24 May 2010
Line 165: Line 165:  
=== '''How is AL1 defined? Which allele dosage is .dose/.mldose counting?'''  ===
 
   −
A: AL1 is an arbitrary allele. To be specific, it is the first allele read in the reference haplotypes (file fed to -h or --haps). The earliest versions of mach1 counted the number of AL2 and the latest versions count the number of AL1. One can find out which allele is counted following the steps below. <br>
+
A: AL1 is an arbitrary allele. Typically, it is the first allele read in the reference haplotypes (the file passed to -h or --haps). The earliest versions of mach (prior to April 2007) counted the expected number of copies of AL2, whereas more recent versions count the expected number of copies of AL1. You can find out which allele is counted by following the steps below.
   −
Take your dosage, geno, and info output (.dose, .geno and one each from .info/.mlinfo and .dose/.mldose) and check if dosage is the number of AL1 copies or AL2 copies. Example is given below:
+
#. First, find the two alleles for one of the markers in your data
    
<source lang="text">
 
<source lang="text">
  prompt> head -1 mldose/chr21.mldose | cut -f3 -d ' '  
+
  prompt> head -2 mlinfo/chr21.mlinfo
 +
SNP      Al1 Al2 Freq1  MAF    Quality  Rsq
 +
rs885550 2  4  0.9840  0.0160  0.9682  0.992
 +
</source>
 +
 
 +
#. Second, check the dosage for a few individuals at this SNP.
 +
 
 +
<source lang="text">
 +
prompt> head -3 mldose/chr21.mldose | cut -f3 -d ' '  
 
  1.962  
 
 +
1.000
 +
0.078
 +
</source>
   −
prompt> head -2 mlinfo/chr21.mlinfo
+
#. Finally, compare these dosages to genotypes.
SNP Al1 Al2 Freq1 MAF Quality Rsq
  −
rs885550 2 4 0.9840 0.0160 0.9682 0.0021
      +
<source lang="text">
 
  prompt> head -3 mlgeno/chr21.mlgeno | cut -f3 -d ' '  
 
  2/2  
 
 +
2/4
 +
4/4
 
</source>
 
</source>
 +
 +
In this example, you can see that the first individual has a high dosage (1.962) and most likely genotype 2/2, while the last individual has a low dosage (0.078) and most likely genotype 4/4. Since allele 2 is listed as Al1 in the info file, the dosage tracks the number of AL1 copies. Thus, the output corresponds to a version of Mach released after April 2007, which tallies copies of allele 1.
 +
 +
Note that, in the example above, .mldose could be replaced with .dose and .mlgeno could be replaced with .geno.
    
Based on the three files above, we've confirmed that dosage is the number of AL1 copies. You only need to check one informative case (i.e., a dosage value close to 0 or 2), since the convention is consistent across all individuals and all SNPs.
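
The comparison described above can also be scripted. The following is only a minimal sketch, not part of mach: the function name <code>counted_allele</code> and its arguments are hypothetical, and the values are typed in by hand from the example output rather than parsed from the files. It skips heterozygous individuals, because a dosage near 1 is expected under either convention, and then checks whether the remaining dosages sit closer to the AL1 copy count or to the AL2 copy count.

```python
def counted_allele(al1, al2, dosages, genotypes):
    """Guess whether dosages count copies of al1 or al2.

    Hypothetical helper for illustration only; not part of mach.
    al1, al2  : allele labels from the Al1/Al2 columns of the info file
    dosages   : dosage values for one SNP, one per individual
    genotypes : most likely genotypes for the same SNP, e.g. "2/4"
    Returns "AL1", "AL2", or None if no individual is informative.
    """
    err1 = 0.0  # total distance between dosage and AL1 copy count
    err2 = 0.0  # total distance between dosage and AL2 copy count
    informative = 0
    for dose, geno in zip(dosages, genotypes):
        a, b = geno.split("/")
        if a != b:
            continue  # heterozygote: dosage near 1 under either convention
        n1 = (a == al1) + (b == al1)
        n2 = (a == al2) + (b == al2)
        if n1 + n2 != 2:
            continue  # genotype contains an allele not listed in the info file
        err1 += abs(dose - n1)
        err2 += abs(dose - n2)
        informative += 1
    if informative == 0 or err1 == err2:
        return None
    return "AL1" if err1 < err2 else "AL2"


# Values copied by hand from the rs885550 example (Al1 = 2, Al2 = 4).
print(counted_allele("2", "4", [1.962, 1.000, 0.078], ["2/2", "2/4", "4/4"]))
```

With the example values, the two homozygous individuals are informative, and their dosages are much closer to the AL1 copy counts than to the AL2 copy counts.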
