From Genome Analysis Wiki
729 bytes added, 15:27, 9 July 2013
Line 70: |
Line 70: |
| You must specify two columns, the marker and the frequency (in that order). So if you use PLINK to calculate allele frequencies, you would specify "--freqcols=2,5". Also note that only the minor allele frequency is used, so if the frequency provided is >0.50, it is automatically reduced by 0.50 during analysis. | | You must specify two columns, the marker and the frequency (in that order). So if you use PLINK to calculate allele frequencies, you would specify "--freqcols=2,5". Also note that only the minor allele frequency is used, so if the frequency provided is >0.50, it is automatically reduced by 0.50 during analysis. |
| | | |
| + | |
| + | == Plotting == |
| + | |
| + | We've provided basic support for plotting samples, in case you would also like to inspect them visually for anomalies. You would use |
| + | $ python bafRegress.py plot --freqfile popmaf.txt finalreport.txt |
| + | or |
| + | $ python bafRegress.py plotbin --sample SAMPLENAME mydata |
| + | depending on whether you are using the Final Report files or the binary versions. |
| + | |
| + | Here is an example of the plot for an uncontaminated sample versus a contaminated one. Note how the uncontaminated sample has values closer to the expected BAF of 0 and 1 for the homozygotes. Also note that in the contaminated sample, the deviation from the expected values appears to increase with the minor allele frequency (MAF). |
| + | |
| + | [[File:Bafregressplotexample.gif]] |
| | | |
| == Interpreting Results == | | == Interpreting Results == |
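If you need to plot many samples, the two commands shown under Plotting could be assembled programmatically. The following is only a minimal sketch: the helper names `build_plot_command` and `build_plotbin_command` are hypothetical and not part of bafRegress, and the argument order simply mirrors the two example commands above. It prints the command lines rather than running them.

```python
import shlex


def build_plot_command(freqfile, report, script="bafRegress.py"):
    """Build the argument list for plotting from a Final Report file.

    Mirrors: python bafRegress.py plot --freqfile popmaf.txt finalreport.txt
    (hypothetical helper, not part of bafRegress itself).
    """
    return ["python", script, "plot", "--freqfile", freqfile, report]


def build_plotbin_command(sample, dataset, script="bafRegress.py"):
    """Build the argument list for plotting from the binary files.

    Mirrors: python bafRegress.py plotbin --sample SAMPLENAME mydata
    (hypothetical helper, not part of bafRegress itself).
    """
    return ["python", script, "plotbin", "--sample", sample, dataset]


if __name__ == "__main__":
    # Print one command per Final Report file; the file names are placeholders.
    for report in ["finalreport.txt"]:
        print(shlex.join(build_plot_command("popmaf.txt", report)))
    # Print the binary-mode command for a single placeholder sample name.
    print(shlex.join(build_plotbin_command("SAMPLENAME", "mydata")))
```

Each returned list could be passed to `subprocess.run` once you have confirmed that the printed command matches what you would type by hand.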