Changes

From Genome Analysis Wiki

Talk:QPLOT

194 bytes added, 16:20, 21 November 2011
(1) Empirical vs reported Phred score:
 
Conditioning on the reported base quality, we count the number of bases that match or do not match the reference genome, and calculate the empirical quality as: -10 * log10 ( total # of mismatched bases / total bases ). In the following cases, bases are not used for calculating empirical qualities:
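Leaving the exclusion cases aside, the calculation can be sketched in Python. This is only an illustration, not QPLOT's own code: it assumes the standard Phred definition, Q = -10 * log10(error rate), with the error rate estimated as mismatched bases / total bases. The function names and the (reported_quality, is_mismatch) input format are hypothetical.

```python
import math
from collections import defaultdict

def empirical_quality(mismatches, total):
    """Phred-scaled empirical quality from mismatch and total base counts."""
    if total == 0:
        return None  # no observations for this bin
    if mismatches == 0:
        return float("inf")  # no errors observed, so the score is unbounded
    return -10 * math.log10(mismatches / total)

def empirical_by_reported(observations):
    """observations: iterable of (reported_quality, is_mismatch) pairs (hypothetical format)."""
    counts = defaultdict(lambda: [0, 0])  # reported quality -> [mismatches, total]
    for reported, is_mismatch in observations:
        counts[reported][0] += int(is_mismatch)
        counts[reported][1] += 1
    return {q: empirical_quality(m, t) for q, (m, t) in sorted(counts.items())}
```

For example, 1 mismatch among 100 bases reported at some quality gives an empirical quality of about 20 for that bin.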
(2) Empirical Phred score by cycle:
 
Conditioning on read cycle (e.g. first base, second base, ...; be cautious with quality-trimmed or bar-coded reads, as the real cycle may differ), we calculate the empirical quality as above.
If --region is specified, only bases falling in the target regions are counted.
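A minimal Python sketch of the per-cycle version, including the region filter. The input triples and the half-open (start, end) region format are assumptions made for illustration; they are not QPLOT's actual data structures. The error rate is again taken as mismatched bases / total bases.

```python
import math
from collections import defaultdict

def empirical_quality_by_cycle(observations, regions=None):
    """observations: (cycle, position, is_mismatch) triples (hypothetical format).
    regions: optional list of half-open (start, end) intervals (hypothetical format)."""
    counts = defaultdict(lambda: [0, 0])  # cycle -> [mismatches, total]
    for cycle, pos, is_mismatch in observations:
        # Skip bases outside the target regions when regions are given.
        if regions is not None and not any(s <= pos < e for s, e in regions):
            continue
        counts[cycle][0] += int(is_mismatch)
        counts[cycle][1] += 1
    result = {}
    for cycle, (mism, total) in sorted(counts.items()):
        # No observed mismatch means the Phred score is unbounded.
        result[cycle] = float("inf") if mism == 0 else -10 * math.log10(mism / total)
    return result
```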
(3) Mean depth vs. GC
 
We count depth over the whole genome or over the specified region (--region).
Default GC window size is 100.
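One way to relate depth to GC content is sketched below. The text does not say how windows are placed, so this sketch assumes non-overlapping windows of 100 positions and bins windows by their rounded GC percentage. The function name and inputs are hypothetical.

```python
from collections import defaultdict

def mean_depth_by_gc(reference, depth, window=100):
    """reference: sequence string; depth: per-position depths of the same length.
    Assumes non-overlapping windows; a trailing partial window is ignored."""
    sums = defaultdict(lambda: [0.0, 0])  # GC percent -> [sum of window means, window count]
    for start in range(0, len(reference) - window + 1, window):
        seq = reference[start:start + window].upper()
        gc = round(100 * (seq.count("G") + seq.count("C")) / window)
        mean = sum(depth[start:start + window]) / window
        sums[gc][0] += mean
        sums[gc][1] += 1
    return {gc: s / n for gc, (s, n) in sorted(sums.items())}
```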
(4) Insert size
 
For mapped paired-end reads, the insert size distribution will be plotted. Otherwise, this graph will be empty.
Specifying --region will not affect this graph.
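As a rough illustration, an insert size histogram can be built from the FLAG and TLEN columns of each record. The sketch assumes that only pairs with both mates mapped are counted and that each pair is counted once, through the mate with a positive TLEN; QPLOT's exact rules may differ. The function name and the (flag, tlen) input format are hypothetical.

```python
from collections import Counter

def insert_size_histogram(records):
    """records: (flag, tlen) pairs taken from the FLAG and TLEN columns."""
    hist = Counter()
    for flag, tlen in records:
        paired = flag & 0x1         # read is part of a pair
        unmapped = flag & 0x4       # read itself is unmapped
        mate_unmapped = flag & 0x8  # mate is unmapped
        # A positive TLEN marks the leftmost mate, so each pair is counted once.
        if paired and not unmapped and not mate_unmapped and tlen > 0:
            hist[tlen] += 1
    return dict(sorted(hist.items()))
```

With single-end data no record passes the filter, so the histogram is empty.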
(5) Empirical Q20 bases count by cycle
 
We count the number of Q20 bases (bases whose quality is larger than 20), conditioning on cycle number.
If --regions is specified, only bases in the target regions are counted. In that case, some reads will have their heads and tails outside of the region, so you will likely see a parabolic shape.
(6) Flag stats
 
We count the number of reads in these categories: total, mapped, paired, properly paired, duplicated, QC failed.
These categories are determined by the FLAG field of each read in the BAM file.
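The categories map onto bits of the SAM/BAM FLAG field, which the sketch below decodes. The bit values come from the SAM specification. The function name and the output keys are hypothetical and not taken from QPLOT.

```python
# FLAG bits as defined in the SAM specification.
FLAG_PAIRED = 0x1
FLAG_PROPER_PAIR = 0x2
FLAG_UNMAPPED = 0x4
FLAG_QC_FAIL = 0x200
FLAG_DUPLICATE = 0x400

def flag_stats(flags):
    """Count reads per category from an iterable of integer FLAG values."""
    stats = {"total": 0, "mapped": 0, "paired": 0,
             "proper_paired": 0, "duplicated": 0, "qc_failed": 0}
    for flag in flags:
        stats["total"] += 1
        stats["mapped"] += int(not flag & FLAG_UNMAPPED)
        stats["paired"] += int(bool(flag & FLAG_PAIRED))
        stats["proper_paired"] += int(bool(flag & FLAG_PROPER_PAIR))
        stats["duplicated"] += int(bool(flag & FLAG_DUPLICATE))
        stats["qc_failed"] += int(bool(flag & FLAG_QC_FAIL))
    return stats
```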
(7) Mean depth of sequencing
 Total mapped bases divided by the total number of positions that are covered by at least one base. The percentage on the y-axis is calculated as the number of sites divided by the total number of sites (e.g. for whole genome, it's the total length; for targeted sequencing, it's the total length of all targeted regions).
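Both quantities can be sketched from a list of per-position depths. It is assumed here that the list covers every site of interest, so its length is the total number of sites unless another total is passed in. The function name and return format are hypothetical.

```python
from collections import Counter

def depth_summary(depth, total_sites=None):
    """depth: per-position depths over the genome or the targeted regions.
    Returns (mean depth over covered positions, {depth: percent of sites})."""
    total_sites = len(depth) if total_sites is None else total_sites
    covered = sum(1 for d in depth if d > 0)  # positions with at least one base
    mean_depth = sum(depth) / covered if covered else 0.0
    percent = {d: 100 * n / total_sites for d, n in sorted(Counter(depth).items())}
    return mean_depth, percent
```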
(8) Empirical Q20 count
 
We examine each base by its reported base quality: if that reported base quality corresponds to an empirical base quality better than Phred score 20, then we count it once as a Q20 base.
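A small Python sketch of this count, assuming per-quality mismatch and total counts are already available and that the empirical quality is -10 * log10(mismatched bases / total bases). "Better than" is read here as strictly greater than the threshold, which is an assumption about the boundary case. The function name and input format are hypothetical.

```python
import math

def empirical_q20_count(counts, threshold=20):
    """counts: {reported_quality: (mismatches, total)} (hypothetical format).
    Sums the bases in every bin whose empirical quality exceeds the threshold."""
    q20_bases = 0
    for reported, (mismatches, total) in counts.items():
        if total == 0:
            continue  # nothing observed at this reported quality
        if mismatches == 0:
            empirical = float("inf")  # no observed errors
        else:
            empirical = -10 * math.log10(mismatches / total)
        if empirical > threshold:  # strict comparison is an assumption
            q20_bases += total
    return q20_bases
```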