Difference between revisions of "Sequence Analysis Practice 2011/03/09"
From Genome Analysis Wiki
Line 43:
5. Mark Duplicate Reads
- ${BIN}/superDeDuper -i ${OUT}/NA12878.exon.sample.merged.bam
+ ${BIN}/superDeDuper -i ${OUT}/NA12878.exon.sample.merged.bam -o ${OUT}/NA12878.exon.sample.deduped.bam -v
6. Visualize alignment to reference genome
- ${BIN}/samtools-hybrid tview ${OUT}/NA12878.exon.sample.deduped.bam
+ ${BIN}/samtools-hybrid tview ${OUT}/NA12878.exon.sample.deduped.bam ${REF}/human_g1k_v37_chr20.fa
Revision as of 17:17, 9 March 2011
Overview
The steps below form a practice session: mapping FASTQ files to BAM files, then performing variant calling and various quality checks.
Steps
0. Setting up environment variables
setenv BIN /home/hyun/wed/bin
setenv IN /home/hyun/wed/input
setenv REF /home/hyun/wed/ref
setenv OUT ~/seq/wednesday/output
mkdir -p ${OUT}
1. Understanding FASTQ format
zcat ${IN}/NA12878.exon.sample.read1.fastq.gz | head
zcat ${IN}/NA12878.exon.sample.read2.fastq.gz | head
zcat ${IN}/NA12878.exon.sample.unpaired.fastq.gz | head
press q to quit
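Each FASTQ record spans four lines: an @-prefixed read name, the bases, a + separator line, and one quality character per base. As a rough illustration, the Python sketch below reads such records and decodes the quality string. It assumes Phred+33 (Sanger) encoding, which the page does not state for these files, and the function names are invented for this example.

```python
import gzip
import io


def read_fastq(handle):
    """Yield (name, bases, quals) from a text handle of 4-line FASTQ records."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input (or a blank line)
        bases = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored here
        qual_line = handle.readline().rstrip("\n")
        # Assumption: Phred+33 encoding, so quality = ASCII code - 33.
        quals = [ord(ch) - 33 for ch in qual_line]
        yield header[1:], bases, quals


def read_fastq_gz(path):
    """Same as read_fastq, but for a gzip-compressed file such as *.fastq.gz."""
    with gzip.open(path, "rt") as handle:
        yield from read_fastq(handle)


if __name__ == "__main__":
    demo = io.StringIO("@read1\nACGT\n+\nII5#\n")
    for name, bases, quals in read_fastq(demo):
        print(name, bases, quals)
```

For the files above, something like `read_fastq_gz("NA12878.exon.sample.read1.fastq.gz")` would iterate over the same records that `zcat ... | head` prints.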
2. Align using BWA
${BIN}/bwa aln -q 15 ${REF}/human_g1k_v37_chr20.fa ${IN}/NA12878.exon.sample.read1.fastq.gz > ${OUT}/NA12878.exon.sample.read1.fastq.gz.sai
${BIN}/bwa aln -q 15 ${REF}/human_g1k_v37_chr20.fa ${IN}/NA12878.exon.sample.read2.fastq.gz > ${OUT}/NA12878.exon.sample.read2.fastq.gz.sai
${BIN}/bwa aln -q 15 ${REF}/human_g1k_v37_chr20.fa ${IN}/NA12878.exon.sample.unpaired.fastq.gz > ${OUT}/NA12878.exon.sample.unpaired.fastq.gz.sai
${BIN}/bwa samse ${REF}/human_g1k_v37_chr20.fa ${OUT}/NA12878.exon.sample.unpaired.fastq.gz.sai ${IN}/NA12878.exon.sample.unpaired.fastq.gz | ${BIN}/samtools-hybrid view -uhS - | ${BIN}/samtools-hybrid sort -m 10000000 - ${OUT}/NA12878.exon.sample.unpaired.bwa.sorted
${BIN}/bwa sampe ${REF}/human_g1k_v37_chr20.fa ${OUT}/NA12878.exon.sample.read1.fastq.gz.sai ${OUT}/NA12878.exon.sample.read2.fastq.gz.sai ${IN}/NA12878.exon.sample.read1.fastq.gz ${IN}/NA12878.exon.sample.read2.fastq.gz | ${BIN}/samtools-hybrid view -uhS - | ${BIN}/samtools-hybrid sort -m 10000000 - ${OUT}/NA12878.exon.sample.paired.bwa.sorted
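The -q 15 option asks bwa aln to trim low-quality bases from the 3' end of each read before alignment. The bwa manual describes this as trimming a read of original length l down to argmax_x sum_{i=x+1}^{l} (INT - q_i) when q_l < INT. The Python sketch below implements only that documented formula. The function name is invented for illustration, and bwa's own implementation may differ in details (for example, a minimum retained read length).

```python
def trimmed_length(quals, threshold):
    """Return how many leading bases to keep under the manual's argmax formula.

    quals     -- list of Phred base qualities, 5' to 3'
    threshold -- the INT passed to -q (15 in the commands above)
    """
    length = len(quals)
    # No trimming unless the last base is below the threshold.
    if length == 0 or quals[-1] >= threshold:
        return length

    best_x, best_sum, running = length, 0, 0
    # x is the candidate number of bases kept; bases at 0-based
    # indices x .. length-1 would be trimmed.
    for x in range(length - 1, -1, -1):
        running += threshold - quals[x]
        if running > best_sum:
            best_sum, best_x = running, x
    return best_x


if __name__ == "__main__":
    print(trimmed_length([30, 30, 30, 10, 10], 15))
```

In the demo call, the two trailing bases with quality 10 are cut and the three bases with quality 30 are kept.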
3. Merge multiple BAMs into one
${BIN}/samtools-hybrid merge ${OUT}/NA12878.exon.sample.merged.bam ${OUT}/NA12878.exon.sample.paired.bwa.sorted.bam ${OUT}/NA12878.exon.sample.unpaired.bwa.sorted.bam
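Merging works on inputs that are already coordinate-sorted, which is why both BAMs were sorted in step 2. Conceptually this is a k-way merge of sorted streams. The Python sketch below shows the idea on toy records of the form (reference index, position, read name); the record layout and function name are assumptions for illustration, not how samtools stores alignments.

```python
import heapq


def merge_sorted(*streams):
    """Merge coordinate-sorted streams of (ref_index, pos, name) records.

    Each input must already be sorted by (ref_index, pos); the output is
    then sorted the same way without re-sorting everything.
    """
    return heapq.merge(*streams, key=lambda rec: (rec[0], rec[1]))


if __name__ == "__main__":
    paired = [(0, 100, "p1"), (0, 300, "p2")]
    unpaired = [(0, 200, "u1")]
    for record in merge_sorted(paired, unpaired):
        print(record)
```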
4. View SAM/BAM format
${BIN}/samtools-hybrid view -h ${OUT}/NA12878.exon.sample.merged.bam | head -5
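With -h, the output starts with header lines (prefixed by @) followed by alignment lines. Each alignment line has 11 mandatory tab-separated fields defined by the SAM specification, and the FLAG field is a bit mask. The Python sketch below splits one alignment line into named fields and decodes the flag; the function names and the example read are invented for illustration.

```python
SAM_FIELDS = ("QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL")

# FLAG bits as defined in the SAM specification.
FLAG_BITS = {
    0x1: "paired", 0x2: "proper_pair", 0x4: "unmapped",
    0x8: "mate_unmapped", 0x10: "reverse", 0x20: "mate_reverse",
    0x40: "first_in_pair", 0x80: "second_in_pair", 0x100: "secondary",
    0x200: "qc_fail", 0x400: "duplicate", 0x800: "supplementary",
}
INT_FIELDS = ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN")


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def parse_sam_line(line):
    """Parse one alignment line (not a header line) into a dict."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(SAM_FIELDS, fields[:11]))
    for key in INT_FIELDS:
        record[key] = int(record[key])
    record["TAGS"] = fields[11:]  # optional TAG:TYPE:VALUE fields
    return record


if __name__ == "__main__":
    # Hypothetical alignment line, for illustration only.
    line = "read1\t99\t20\t1000\t60\t4M\t=\t1200\t204\tACGT\tIIII\tNM:i:0"
    rec = parse_sam_line(line)
    print(rec["RNAME"], rec["POS"], decode_flag(rec["FLAG"]))
```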
5. Mark Duplicate Reads
${BIN}/superDeDuper -i ${OUT}/NA12878.exon.sample.merged.bam -o ${OUT}/NA12878.exon.sample.deduped.bam -v
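Duplicate marking flags reads that appear to be copies of the same original fragment, so that downstream steps can ignore them. The page does not describe how superDeDuper decides this, so the Python sketch below shows only a generic, simplified approach: group reads by chromosome, start position and strand, keep the read with the highest total base quality, and set the SAM duplicate bit (0x400) on the rest. The record layout and function name are assumptions, and production tools generally use more careful keys, for example accounting for clipping and mate position.

```python
from collections import defaultdict

REVERSE = 0x10     # SAM flag: read maps to the reverse strand
DUPLICATE = 0x400  # SAM flag: PCR or optical duplicate


def mark_duplicates(reads):
    """Set the duplicate bit on all but the best read in each group.

    reads -- list of dicts with keys 'chrom', 'pos', 'flag', 'quals'
             (a toy layout for this sketch). Returns the number marked.
    """
    groups = defaultdict(list)
    for read in reads:
        key = (read["chrom"], read["pos"], bool(read["flag"] & REVERSE))
        groups[key].append(read)

    marked = 0
    for group in groups.values():
        # Keep the read with the highest summed base quality.
        best = max(group, key=lambda r: sum(r["quals"]))
        for read in group:
            if read is not best:
                read["flag"] |= DUPLICATE
                marked += 1
    return marked


if __name__ == "__main__":
    reads = [
        {"chrom": "20", "pos": 1000, "flag": 0, "quals": [30, 30]},
        {"chrom": "20", "pos": 1000, "flag": 0, "quals": [40, 40]},
        {"chrom": "20", "pos": 1000, "flag": 16, "quals": [10, 10]},
    ]
    print(mark_duplicates(reads), [r["flag"] for r in reads])
```

Reads are only flagged, not removed, so the output keeps every alignment from the input.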
6. Visualize alignment to reference genome
${BIN}/samtools-hybrid tview ${OUT}/NA12878.exon.sample.deduped.bam ${REF}/human_g1k_v37_chr20.fa