Changes

From Genome Analysis Wiki
55 bytes added ,  16:15, 14 February 2012
no edit summary
Line 138: Line 138:  
= Example =
 
= Example =
   −
Qplot can generate diagnostic graphs, related R code and summary statistics for each sam/bam files.
+
Qplot can generate diagnostic graphs, related R code, and summary statistics for each SAM/BAM file.
   −
== Build-in example ==
+
== Built-in example ==
   −
In pre-compiled binary file, you will find a subdirectory named examples. We provide a sample file from 1000 Genome project, it contained aligned read on chromosome 20 from position 8 Mbp to 9Mbp. You can use qplot using the following commandline:
+
In the pre-compiled binary download, you will find a subdirectory named examples. We provide a sample file from the 1000 Genomes Project; it contains aligned reads on chromosome 20 from position 8 Mbp to 9 Mbp. You can run qplot with the following command line:
    
  ../bin/qplot --reference ../data/human.g1k.v37.umfa --dbsnp ../data/dbSNP130.UCSC.coordinates.tbl --gccontent ../data/human.g1k.w100.gc --plot qplot.pdf --stats qplot.stats --Rcode qplot.R --label "chr20:9M-10M" chrom20.9M.10M.bam
 
  ../bin/qplot --reference ../data/human.g1k.v37.umfa --dbsnp ../data/dbSNP130.UCSC.coordinates.tbl --gccontent ../data/human.g1k.w100.gc --plot qplot.pdf --stats qplot.stats --Rcode qplot.R --label "chr20:9M-10M" chrom20.9M.10M.bam
Line 184: Line 184:       −
* Whole genome sequencing with more than one lanes
+
* Whole genome sequencing with more than one lane
    
Figures
 
Figures
Line 220: Line 220:  
* Whole genome sequencing with 24-multiplexing
 
* Whole genome sequencing with 24-multiplexing
   −
With customized script, we aggregated 24 bar-coded samples in the same graph.
+
With a customized script, we aggregated 24 bar-coded samples in the same graph.
 
The graph will help compare sequencing quality between samples.  
 
The graph will help compare sequencing quality between samples.  
   Line 229: Line 229:     
<span id="anchorOfInteractiveQplot"></span>
 
<span id="anchorOfInteractiveQplot"></span>
Qplot can be interactive. In the following example, you can use scroll mouse to zoom in, zoom out each graph; pan to certain part of graph.
+
Qplot can be interactive. In the following example, you can use the mouse scroll wheel to zoom in and out of each graph and pan to a certain part of the graph.
By presenting qplot data in web page, users can identify problematic sequencing samples easily. Users of qplot customized its outputs into webpage.
+
By presenting qplot data on a web page, users can easily identify problematic sequencing samples. Users of qplot can customize its outputs into web page format, greatly easing data exploration.
That greatly eases data exploring process.
      
[http://www-personal.umich.edu/~zhanxw/qplot.Pool.9847.html  QPlot of 24 samples(HTML) ]
 
[http://www-personal.umich.edu/~zhanxw/qplot.Pool.9847.html  QPlot of 24 samples(HTML) ]
Line 237: Line 236:  
== Diagnose sequencing quality ==
 
== Diagnose sequencing quality ==
   −
Qplot is designed and implemented by the need of checking sequencing quality.  
+
Qplot was designed and implemented to meet the need for checking sequencing quality.  
Besides the exampled of analyzing RNA-seq data as shown in our manuscript,  
+
Besides the example of analyzing RNA-seq data as shown in our manuscript,  
 
here we demonstrate two additional scenarios in which qplot can help identify problems after obtaining sequencing data.  
 
here we demonstrate two additional scenarios in which qplot can help identify problems after obtaining sequencing data.  
   Line 244: Line 243:  
* Base quality distributed abnormally
 
* Base quality distributed abnormally
   −
[[Media: WrongBaseQual.pdf | Example of qplot help identify wrong phred base quality]]
+
[[Media: WrongBaseQual.pdf | Example of qplot helping to identify wrong phred base quality]]
   −
By checking the first graph "Empirical vs reported Phred score", we found reported base qualities are shifted to right.
+
By checking the first graph "Empirical vs reported Phred score", we found reported base qualities are shifted to the right.
 
Further, we noticed that this effect is caused by different software from Illumina sequencers.  
 
Further, we noticed that this effect is caused by different software from Illumina sequencers.  
In this particular example, all base qualities are wrongly added '33'. Such data used in variant calling may increase false positive SNP calling.
+
In this particular example, '33' was incorrectly added to all base qualities. Using such data in variant calling may increase false positive SNP calls.
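The offset problem described above can be illustrated with a minimal Python sketch. It is not part of qplot; the function names, the plausibility ceiling of 45, and the sample values are all assumptions made for illustration. It shows how integer Phred scores that carry an extra '33' might be flagged and corrected before variant calling.

```python
PHRED33_OFFSET = 33  # ASCII offset used by the Phred+33 quality encoding


def decode_phred33(quality_string):
    """Convert an ASCII quality string (Phred+33) to integer Phred scores."""
    return [ord(ch) - PHRED33_OFFSET for ch in quality_string]


def looks_shifted(scores, max_plausible=45):
    """Heuristic check: every score is at least 33 and the largest exceeds
    a plausible ceiling. The ceiling of 45 is an assumed threshold."""
    return min(scores) >= PHRED33_OFFSET and max(scores) > max_plausible


def remove_extra_offset(scores):
    """Subtract the wrongly added 33 from every score."""
    return [q - PHRED33_OFFSET for q in scores]


# Made-up scores that carry an extra 33 (true values: 30, 35, 40, 37).
reported = [63, 68, 73, 70]
if looks_shifted(reported):
    reported = remove_extra_offset(reported)
print(reported)
```

The heuristic only flags a suspicious distribution; the "Empirical vs reported Phred score" graph remains the more reliable check, since it compares reported qualities against observed mismatch rates.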
      Line 255: Line 254:  
[[Media: WrongBarCoding.pdf | Example of qplot identifying the effect of ignoring bar-coding]]
 
[[Media: WrongBarCoding.pdf | Example of qplot identifying the effect of ignoring bar-coding]]
   −
By checking "Empirical phred score by cycle" (top right graph on the first page), we noticed the empirical qualities in the first several cycle are abnormally low. This phenomenon leads us to hypothesize the first several bases have different properties. Further investigation confirmed that this sequencing was done using bar-coded DNA samples, but the analysis did not properly de-multiplexing to each sample.
+
By checking "Empirical phred score by cycle" (top right graph on the first page), we noticed the empirical qualities in the first several cycles were abnormally low. This phenomenon led us to hypothesize that the first several bases have different properties. Further investigation confirmed that this sequencing was done using bar-coded DNA samples, but the analysis did not properly de-multiplex each sample.
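To make the missing de-multiplexing step concrete, here is a minimal Python sketch, assuming in-line bar codes of fixed length at the start of each read and exact matching. The function name, the bar codes, and the sample names are hypothetical; real protocols may store bar codes in separate index reads and tolerate mismatches, and this is not functionality provided by qplot.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_length):
    """Group reads by their leading bar code and strip the bar code.

    Reads whose prefix matches no known bar code go to "unassigned".
    """
    bins = defaultdict(list)
    for seq in reads:
        barcode = seq[:barcode_length]
        insert = seq[barcode_length:]  # bases that belong to the sample
        sample = barcode_to_sample.get(barcode, "unassigned")
        bins[sample].append(insert)
    return dict(bins)


# Hypothetical 4-base bar codes and made-up reads.
barcodes = {"ACGT": "sample1", "TGCA": "sample2"}
reads = ["ACGTTTGA", "TGCAGGCC", "NNNNAAAA"]
print(demultiplex(reads, barcodes, barcode_length=4))
```

Stripping the bar-code bases before alignment is what removes the abnormally low empirical qualities in the first cycles, because those bases no longer get compared against the reference.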
    
= Contact =
 
= Contact =
    
Questions and requests should be sent to Bingshan Li ([mailto:bingshan@umich.edu bingshan@umich.edu])
 
Questions and requests should be sent to Bingshan Li ([mailto:bingshan@umich.edu bingshan@umich.edu])
