Tutorial: EMMAX GotCloud STOM: Lecture 5

From Genome Analysis Wiki
Revision as of 03:30, 6 January 2014 by Hmkang

STOM 2014 Workshop - Practical Sessions 5

Lecture 5

The slides describing the notes below are available here (PDF)

Basic Setup

  • To see the files for session 5 (also used in sessions 6 and 8), type
ls /data/stom2014/session5/

If you see any errors, please let me know now!

  • For convenience, let’s set some variables
export S5=/data/stom2014/session5
  • Also, let's move the output directory from yesterday's class out of the way
mv ~/out ~/out_session2
  • And create a new output directory
mkdir ~/out
  • And check the input files
ls $S5/examples/
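
The setup steps above can also be wrapped in a small guard so that rerunning them does not nest or clobber an earlier backup. This is only a sketch: `rotate_outdir` is a hypothetical helper name introduced here, not part of GotCloud or the workshop materials.

```shell
# Sketch: move an existing output directory aside, then create a fresh one.
# rotate_outdir is a hypothetical helper, not a GotCloud command.
rotate_outdir() {
  out="$1"      # directory to (re)create, e.g. ~/out
  backup="$2"   # where the old directory goes, e.g. ~/out_session2
  if [ -d "$out" ]; then
    # Refuse to continue if the backup name is already taken;
    # a plain mv would otherwise move $out inside the existing directory.
    if [ -e "$backup" ]; then
      echo "refusing to overwrite existing $backup" >&2
      return 1
    fi
    mv "$out" "$backup" || return 1
  fi
  mkdir -p "$out"
}

# Usage in this session (paths taken from the steps above):
#   export S5=/data/stom2014/session5
#   rotate_outdir ~/out ~/out_session2
```

The function returns a non-zero status instead of moving anything when the backup directory already exists, which is the case the plain `mv` command handles silently.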

Preparing Input Files

  • Index file - See the example index file already prepared for this project

cat $S5/examples/index/chr7.CFTR.fastq.index

  • Configuration File - See the example configuration file below.
% cat $S5/examples/index/chr7.CFTR.align.conf
INDEX_FILE = index/chr7.CFTR.fastq.index
###################
# References
REF_DIR = chr7Ref
AS = NCBI37
REF = $(REF_DIR)/hs37d5.chr7.fa
DBSNP_VCF =  $(REF_DIR)/dbsnp_135.b37.chr7.CFTR.vcf.gz
HM3_VCF = $(REF_DIR)/hapmap_3.3.b37.sites.chr7.CFTR.vcf.gz

The default options should work well in most other cases. Because this example does not perform genome-wide calling, the reference files have been restricted to chr7.
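
To see which paths the configuration actually points to, the `$(REF_DIR)` references can be expanded by hand. The following is a minimal sketch, assuming that `$(NAME)` refers to a key defined earlier in the same file, as the example suggests; GotCloud's own resolution (for instance against its built-in defaults or the base prefix) may differ. `resolve_conf` is a hypothetical helper name.

```shell
# Sketch: print KEY=VALUE pairs from a GotCloud-style config with $(NAME)
# references expanded from keys defined earlier in the same file.
# resolve_conf is a hypothetical helper, not part of GotCloud.
resolve_conf() {
  awk '
    /^[ \t]*#/ || !/=/ { next }          # skip comments and lines without "="
    {
      eq  = index($0, "=")
      key = substr($0, 1, eq - 1)
      val = substr($0, eq + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim surrounding whitespace
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      # Replace each $(NAME) with the value stored for NAME (empty if unknown).
      while (match(val, /\$\([A-Za-z0-9_]+\)/)) {
        name = substr(val, RSTART + 2, RLENGTH - 3)
        val  = substr(val, 1, RSTART - 1) conf[name] substr(val, RSTART + RLENGTH)
      }
      conf[key] = val
      print key "=" val
    }
  ' "$1"
}

# Usage in this session:
#   resolve_conf $S5/examples/index/chr7.CFTR.align.conf
```

For the configuration shown above, the `REF` entry would be printed as `REF=chr7Ref/hs37d5.chr7.fa`. These are relative paths; in this tutorial they sit under the directory given to `--baseprefix`.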


Run GotCloud Alignment Pipeline

  • Using the prepared input files, align the FASTQ files using the gotcloud alignment pipeline
$S5/gotcloud/gotcloud align --conf $S5/examples/index/chr7.CFTR.align.conf --outDir ~/out/align --baseprefix $S5/examples
  • Check if the output BAMs and QC metrics are produced
ls ~/out/align/bams
ls ~/out/align/QCFiles/
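
Rather than scanning the `ls` output by eye, the check can be scripted. The sketch below assumes the per-sample file names used elsewhere in this tutorial (`<sample>.recal.bam`, `<sample>.qplot.stats`, `<sample>.genoCheck.selfSM`); `check_align_outputs` is a hypothetical helper name.

```shell
# Sketch: verify that the expected alignment outputs exist and are non-empty.
# check_align_outputs is a hypothetical helper, not part of GotCloud.
check_align_outputs() {
  outdir="$1"; shift          # e.g. ~/out/align
  missing=0
  for sample in "$@"; do      # remaining arguments are sample names
    for f in "bams/${sample}.recal.bam" \
             "QCFiles/${sample}.qplot.stats" \
             "QCFiles/${sample}.genoCheck.selfSM"; do
      if [ ! -s "$outdir/$f" ]; then
        echo "missing or empty: $outdir/$f" >&2
        missing=$((missing + 1))
      fi
    done
  done
  [ "$missing" -eq 0 ]        # exit status reflects whether everything was found
}

# Usage in this session:
#   check_align_outputs ~/out/align NA06984 NA12878 && echo "all outputs present"
```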

Understanding the Output Files

  • To view the contents of a BAM file in SAM format (including the header), try
samtools view -h ~/out/align/bams/NA06984.recal.bam | less
  • Check the summary QC metrics
cat ~/out/align/QCFiles/NA06984.qplot.stats
  • Check the verifyBamID output to see whether each sample shows signs of contamination
cat ~/out/align/QCFiles/NA06984.genoCheck.selfSM
cat ~/out/align/QCFiles/NA12878.genoCheck.selfSM
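
The `.selfSM` files are wide, so it can help to pull out only the estimated contamination fraction. The sketch below assumes the usual verifyBamID layout: a header line naming the columns, the sample ID in the first column, and a column called `FREEMIX`. The cutoff passed as the first argument is an illustrative value, not a recommendation from this tutorial, and `report_freemix` is a hypothetical helper name.

```shell
# Sketch: print sample ID, FREEMIX, and a flag for each .selfSM file given.
# report_freemix is a hypothetical helper; the cutoff is chosen by the caller.
report_freemix() {
  cutoff="$1"; shift
  for f in "$@"; do
    awk -v cutoff="$cutoff" '
      NR == 1 {                          # header: locate the FREEMIX column by name
        for (i = 1; i <= NF; i++) if ($i == "FREEMIX") fm = i
        next
      }
      fm && NF {                         # data rows: sample ID is assumed to be column 1
        status = ($fm + 0 >= cutoff + 0) ? "CHECK" : "ok"
        print $1 "\t" $fm "\t" status
      }
    ' "$f"
  done
}

# Usage in this session (0.03 is only an example cutoff):
#   report_freemix 0.03 ~/out/align/QCFiles/*.genoCheck.selfSM
```

Looking the column up by name rather than by position keeps the sketch working if the column order differs between verifyBamID versions.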