Changes

From Genome Analysis Wiki
Line 253: Line 253:
Let's look at the output directory:
−  ls ${OUTPUT}
+  ls ${OUT}
[[File:gcalignOutM.png|600px]]
+ === BAM Files ===
Let's look at the BAMs (aligned reads that are ready for variant calling):
−  ls ${OUTPUT}/bams
+  ls ${OUT}/bams
[[File:GcalignOutBAMm.png|600px]]
− BAM Files:
* Binary Sequence Alignment/Map (SAM) Format
* Maps reads to Chromosome/Position
Line 271: Line 271:
*** Records - one for each sequence read
Let's examine a BAM file:
−  samtools view -h ${OUTPUT}/bams/
+  samtools view -h ${OUT}/bams/
[[File:BAM.png|750px]]
+ === Quality Control Files ===
Let's take a look at our quality control output directory:
−  ls ${OUTPUT}/QCFiles
+  ls ${OUT}/QCFiles
[[File:GcalignOutQCm.png|600px]]
+ ==== Sample Contamination/Swap ====
Check for sample contamination:
* *.selfSM : Main output file containing the contamination estimate.
Line 288: Line 290:
* *.depthSM : depth distribution of reads covering the marker position of the input VCF, across all readGroups.
* *.depthRG : depth distribution of reads covering the marker position of the input VCF, per readGroups.
−  less -S ${OUTPUT}/QCFiles/HG00551.genoCheck.selfSM
+  less -S ${OUT}/QCFiles/HG00551.genoCheck.selfSM
[[File:Contam1.png|700px]]
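The contamination estimate can also be pulled out of a `*.selfSM` file on the command line instead of being read by eye in `less`. The sketch below assumes the file is tab-delimited with a header row that names a `FREEMIX` column (the layout written by verifyBamID) and that the sample ID is in the first column. The function name `print_freemix` is hypothetical and not part of the pipeline.

```shell
# Print the sample ID and FREEMIX value from a selfSM file.
# Assumes a tab-delimited file whose header row names a FREEMIX column
# and whose first column holds the sample ID.
print_freemix() {
    awk -F'\t' '
        NR == 1 {
            # Locate the FREEMIX column by name instead of hard-coding its position.
            for (i = 1; i <= NF; i++) if ($i == "FREEMIX") col = i
            if (!col) { print "FREEMIX column not found" > "/dev/stderr"; exit 1 }
            next
        }
        { print $1 "\t" $col }
    ' "$1"
}

# Usage (path taken from the tutorial):
# print_freemix "${OUT}/QCFiles/HG00551.genoCheck.selfSM"
```

Looking the column up by name keeps the sketch working if the column order differs from what is assumed here.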
+ ==== QC Metrics ====
Next, let's look at some quality control metrics:
 cat ${OUTPUT}/QCFiles/HG00551.qplot.stats
Line 300: Line 303:
− Generate the pdf's of our quality metrics:
+ Generate a pdf of quality metrics:
−  Rscript ${OUTPUT}/QCFiles/HG00551.qplot.R
+  Rscript ${OUT}/QCFiles/HG00551.qplot.R
−  Rscript ${OUTPUT}/QCFiles/HG00553.qplot.R
−  Rscript ${OUTPUT}/QCFiles/HG00640.qplot.R
−  Rscript ${OUTPUT}/QCFiles/HG00641.qplot.R
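The earlier revision ran `Rscript` once per sample. A loop over the generated `*.qplot.R` scripts would do the same without listing sample names. This is a minimal sketch, assuming the scripts sit directly in the QC directory. The function name `run_qplot_scripts` and the `RSCRIPT` override variable are hypothetical helpers, not part of the pipeline.

```shell
# Run every *.qplot.R script found in a QC directory.
# RSCRIPT is an illustrative override for the interpreter (defaults to Rscript).
run_qplot_scripts() {
    qc_dir=$1
    for script in "$qc_dir"/*.qplot.R; do
        # Skip the unexpanded pattern when no scripts match.
        [ -e "$script" ] || continue
        # Stop at the first script that fails.
        "${RSCRIPT:-Rscript}" "$script" || return 1
    done
}

# Usage (directory taken from the tutorial):
# run_qplot_scripts "${OUT}/QCFiles"
```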
      
Examine the PDF:
−  evince  ${OUTPUT}/QCFiles/HG00551.qplot.pdf&
+  evince  ${OUT}/QCFiles/HG00551.qplot.pdf&
The first plot: Empirical vs reported Phred score does not look as good as we would like.
* This is due to the small region used for recalibration
Look at the PDF I produced when I ran the whole genome:
−  evince ${GC}/example/HG00551.wg.qplot.pdf&
+  evince ${IN}/example/HG00551.wg.qplot.pdf&
See: [[QPLOT#Diagnose_sequencing_quality|QPLOT: Diagnose sequencing quality]] for more info on how to use QPLOT results.
