Changes

From Genome Analysis Wiki
729 bytes added, 15:27, 9 July 2013
no edit summary
Line 70:
You must specify two columns, the marker and the frequency (in that order). So if you use PLINK to calculate allele frequencies, you would specify "--freqcols=2,5". Also note that only the minor allele frequency is used, so if the frequency provided is >0.50, it is automatically reduced by 0.50 during analysis.
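To illustrate what "--freqcols=2,5" selects, here is a minimal sketch, assuming a whitespace-delimited PLINK .frq file with the standard header (CHR SNP A1 A2 MAF NCHROBS) and a single header row. The function name read_freq_columns and the sample rows are hypothetical; this is not bafRegress's own parser.

```python
def read_freq_columns(lines, cols=(2, 5)):
    """Yield (marker, frequency) pairs from whitespace-delimited rows.

    cols holds 1-based column numbers, mirroring --freqcols=2,5.
    Assumption: the first row is a header and is skipped.
    """
    marker_idx, freq_idx = cols[0] - 1, cols[1] - 1
    rows = iter(lines)
    next(rows, None)  # skip the header row
    for line in rows:
        fields = line.split()
        if len(fields) <= max(marker_idx, freq_idx):
            continue  # skip short or blank lines
        try:
            freq = float(fields[freq_idx])
        except ValueError:
            continue  # skip non-numeric frequencies such as "NA"
        yield fields[marker_idx], freq


# Hypothetical rows in PLINK .frq layout, for illustration only.
example = [
    " CHR   SNP      A1  A2  MAF   NCHROBS",
    "   1   rs0001   A   G   0.12  200",
    "   1   rs0002   C   T   0.45  198",
    "   1   rs0003   C   T   NA    0",
]
pairs = list(read_freq_columns(example))
```

With the PLINK layout, column 2 is the SNP identifier and column 5 is the MAF, which is why the marker column is listed first.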
 
    +
 +
== Plotting ==
 +
 +
We've provided basic support for plotting samples if you would also like to visually inspect them for anomalies. You would use
 +
    $ python bafRegress.py plot --freqfile popmaf.txt finalreport.txt
 +
or
 +
    $ python bafRegress.py plotbin --sample SAMPLENAME mydata
 +
depending on whether you are using the Final Report files or the binary versions.
 +
 +
Here is an example of the plot for an uncontaminated vs a contaminated sample. Note how the homozygotes in the uncontaminated sample have values closer to the expected BAF of 0 and 1. Also note that in the contaminated sample, the deviation from the expected values seems to increase as a function of the minor allele frequency (MAF).
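One way to build intuition for this trend is a simple mixture sketch. It assumes that a fraction alpha of the DNA comes from a second individual whose expected B allele fraction at a marker equals the population B allele frequency. This is an illustrative assumption, not necessarily the model bafRegress fits, and the function name expected_baf_homozygote is hypothetical.

```python
def expected_baf_homozygote(genotype, b_freq, alpha):
    """Expected BAF at a homozygous site under a simple two-source mixture.

    genotype: "AA" (uncontaminated BAF 0) or "BB" (uncontaminated BAF 1)
    b_freq:   population frequency of the B allele at this marker
    alpha:    assumed fraction of DNA contributed by the contaminating source
    """
    if genotype == "AA":
        own_baf = 0.0
    elif genotype == "BB":
        own_baf = 1.0
    else:
        raise ValueError("genotype must be 'AA' or 'BB'")
    # Weighted average of the sample's own BAF and the contaminant's expected BAF.
    return (1.0 - alpha) * own_baf + alpha * b_freq


# Deviation from the uncontaminated value (0) at AA sites, for a hypothetical alpha.
alpha = 0.10
deviations = [expected_baf_homozygote("AA", p, alpha) for p in (0.05, 0.25, 0.50)]
```

Under these assumptions the deviation at AA sites is alpha times the B allele frequency, so it grows with frequency, and it is zero for every marker when alpha is 0.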
 +
 +
[[File:Bafregressplotexample.gif]]
    
== Interpreting Results ==
 
