Changes

From Genome Analysis Wiki
15 bytes removed ,  15:16, 14 February 2012
no edit summary
Line 1: Line 1:  
= Introduction =
The qplot program is to calculate various summary statistics some of which will be plotted in a PDF file which can be used to assess the sequencing quality for Illumina sequencing after mapping reads to the reference genome. The main statistics are empirical Phred scores which was calculated based on the background mismatch rate. By background mismatch rate, it means the rate that sequenced bases are different from the reference genome, EXCLUDING dbSNP positions. Other statistics include GC biases, insert size distribution, depth distribution, genome coverage, empirical Q20 count and so on.  
+
The qplot program calculates various summary statistics, some of which are plotted in a PDF file that can be used to assess Illumina sequencing quality after reads have been mapped to the reference genome. The main statistics are empirical Phred scores, which are calculated from the background mismatch rate. The background mismatch rate is the rate at which sequenced bases differ from the reference genome, EXCLUDING dbSNP positions. Other statistics include GC bias, insert size distribution, depth distribution, genome coverage, and empirical Q20 count.
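To make the link between a mismatch rate and a Phred score concrete, here is a minimal sketch that assumes the standard Phred definition, Q = -10 * log10(error rate). The function name and its arguments are hypothetical and are not taken from qplot; how qplot itself bins bases or excludes dbSNP positions is not shown here.

```python
import math


def empirical_phred(mismatches: int, bases: int) -> float:
    """Phred-scaled quality for an observed mismatch rate.

    Assumes the standard Phred definition Q = -10 * log10(rate).
    `bases` is meant to count only bases outside dbSNP positions;
    that filtering is assumed to have happened before this call.
    """
    if bases <= 0:
        raise ValueError("bases must be positive")
    if mismatches < 0 or mismatches > bases:
        raise ValueError("mismatches must be between 0 and bases")
    if mismatches == 0:
        # No observed mismatches: the rate is 0, so Q is unbounded.
        return float("inf")
    return -10.0 * math.log10(mismatches / bases)
```

Under this definition, one mismatch per 100 bases corresponds to an empirical quality of about Q20, and one per 1,000 bases to about Q30.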
   −
In the following sections, we will guide through: [[#Where to Find It |how to obtain qplot]], [[#Usage |how to use qplot]], [[#Build-in example |example outputs]], [[#anchorOfInteractiveQplot |interactive diagnostic plots]] and [[#Diagnose sequencing quality |real applications]] in which qplot has helped identify sequencing problems.
+
In the following sections, we will guide you through: [[#Where to Find It |how to obtain qplot]], [[#Usage |how to use qplot]], [[#Build-in example |example outputs]], [[#anchorOfInteractiveQplot |interactive diagnostic plots]], and [[#Diagnose sequencing quality |real applications]] in which qplot has helped identify sequencing problems.
    
= Where to Find It =
Line 12: Line 12:  
(2) Download the qplot source code and compile it on your own machine. Follow the instructions in [[#Source Code Distribution|Source Code Distribution]] to fetch and build the source code.
   −
We suggest use the first method as we try to make pre-compiled binary working out of the box.
+
We recommend the first method since the pre-compiled binary should work out of the box.
    
== Binary Download ==
   −
We have prepared pre-compiled qplot. You can download from: [http://www.sph.umich.edu/csg/zhanxw/software/qplot/qplot.20120213.tar.gz qplot.20120213.tar.gz (File Size: 1.7G)]  
+
We have prepared a pre-compiled qplot. You can download it from: [http://www.sph.umich.edu/csg/zhanxw/software/qplot/qplot.20120213.tar.gz qplot.20120213.tar.gz (File Size: 1.7G)]  
    
The executable file is under qplot/bin/qplot.
   −
In addition, we provided necessary inputs files (NCBI human genome build v37, dbSNP 130 and pre-computed GC file with windows size 100, they are all under qplot/data/).
+
In addition, we provide the necessary input files under qplot/data/ (NCBI human genome build v37, dbSNP 130, and a pre-computed GC file with window size 100).
   −
You can also find example BAM input file under qplot/example/chrom20.9M.10M.bam. it is taken from 1000 Genome Project with sequencing reads aligned to chromosome 20 positioned 8M to 9M.
+
You can also find an example BAM input file under qplot/example/chrom20.9M.10M.bam. It is taken from the 1000 Genomes Project and contains sequencing reads aligned to chromosome 20, positions 8M to 9M.
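After extracting the archive, a quick check that the files described above are in place can save some confusion later. The following is only a sketch: the helper name <code>missing_paths</code> is hypothetical, and the three relative paths are the ones listed in this section, relative to the extracted qplot directory.

```python
from pathlib import Path

# Relative paths taken from the description in this section.
EXPECTED = ("bin/qplot", "data", "example/chrom20.9M.10M.bam")


def missing_paths(root):
    """Return the expected entries that do not exist under `root`.

    `root` is assumed to be the extracted qplot directory.
    """
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Calling <code>missing_paths("qplot")</code> from the directory where the archive was unpacked returns an empty list when all three entries are present.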
    
== Source Code Distribution ==
