Line 37: |
Line 37: |
| | | |
| #Rebuild the reference | | #Rebuild the reference |
− | bwa index -a is ref/human_g1k_v37_chr20.fa | + | bin/bwa index -a is ref/human_g1k_v37_chr20.fa |
| </source> | | </source> |
| | | |
Line 44: |
Line 44: |
| Next, we will use BWA to find the most likely sequence location for each read using the <code>bwa aln</code> command. This command requires two parameters, one corresponding to the reference genome, the other corresponding to a fastq file containing reads to be mapped. | | Next, we will use BWA to find the most likely sequence location for each read using the <code>bwa aln</code> command. This command requires two parameters, one corresponding to the reference genome, the other corresponding to a fastq file containing reads to be mapped. |
| | | |
− | bwa aln -q 15 ref/human_g1k_v37_chr20.fa fastq/SAMPLE1.fastq > bwa.sai/SAMPLE1.sai | + | bin/bwa aln -q 15 ref/human_g1k_v37_chr20.fa fastq/SAMPLE1.fastq > bwa.sai/SAMPLE1.sai |
| | | |
| The file SAMPLE1.fastq contains DNA sequence reads for sample SAMPLE1. | | The file SAMPLE1.fastq contains DNA sequence reads for sample SAMPLE1. |
Line 54: |
Line 54: |
| The .sai alignment format is specific to BWA, so the first thing to do is to convert the alignment to a more standard format that will be compatible with downstream analysis tools. We can do this with a combination of the <code>bwa samse</code> command and <code>samtools view</code> and <code>samtoosl sort</code> commands. | | The .sai alignment format is specific to BWA, so the first thing to do is to convert the alignment to a more standard format that will be compatible with downstream analysis tools. We can do this with a combination of the <code>bwa samse</code> command and <code>samtools view</code> and <code>samtoosl sort</code> commands. |
| | | |
− | bwa samse -r "@RG\tID:ILLUMINA\tSM:SAMPLE1" ref/human_g1k_v37_chr20.fa bwa.sai/SAMPLE1.sai fastq/SAMPLE1.fastq | \ | + | bin/bwa samse -r "@RG\tID:ILLUMINA\tSM:SAMPLE1" ref/human_g1k_v37_chr20.fa bwa.sai/SAMPLE1.sai fastq/SAMPLE1.fastq | \ |
− | samtools view -uhS - | samtools sort -m 2000000000 - bams/SAMPLE1 | + | bin/samtools view -uhS - | samtools sort -m 2000000000 - bams/SAMPLE1 |
| | | |
| You can check the use of parameters in the bwa manual. The result BAM file uses a compact binary format to represent the | | You can check the use of parameters in the bwa manual. The result BAM file uses a compact binary format to represent the |
Line 61: |
Line 61: |
| of the file using the <code>samtools view</code> command, like so: | | of the file using the <code>samtools view</code> command, like so: |
| | | |
− | samtools view bams/SAMPLE1.bam | more | + | bin/samtools view bams/SAMPLE1.bam | more |
| | | |
| The text representation of the alignment produced by <code>samtools view</code> describes | | The text representation of the alignment produced by <code>samtools view</code> describes |
Line 85: |
Line 85: |
| genome location. We do this with the <code>samtools index</code> command, like so: | | genome location. We do this with the <code>samtools index</code> command, like so: |
| | | |
− | samtools index bams/SAMPLE1.bam | + | bin/samtools index bams/SAMPLE1.bam |
| | | |
| === Browsing Alignment Results === | | === Browsing Alignment Results === |
Line 94: |
Line 94: |
| starting at position 2,100,000 on chromosome 20, we could run: | | starting at position 2,100,000 on chromosome 20, we could run: |
| | | |
− | samtools tview bams/SAMPLE1.bam ref/human_g1k_v37_chr20.fa | + | bin/samtools tview bams/SAMPLE1.bam ref/human_g1k_v37_chr20.fa |
| | | |
| Then, type "g 20:2100000" | | Then, type "g 20:2100000" |
Line 108: |
Line 108: |
| foreach file (`ls fastq/SAMPLE*.fastq | cut -f 2 -d '/' | cut -f 1 -d '.'`) | | foreach file (`ls fastq/SAMPLE*.fastq | cut -f 2 -d '/' | cut -f 1 -d '.'`) |
| echo $file | | echo $file |
− | bwa aln -q 15 ref/human_g1k_v37_chr20.fa fastq/$file.fastq > bwa.sai/$file.sai | + | bin/bwa aln -q 15 ref/human_g1k_v37_chr20.fa fastq/$file.fastq > bwa.sai/$file.sai |
− | bwa samse -r "@RG\tID:ILLUMINA\tSM:$file" ref/human_g1k_v37_chr20.fa bwa.sai/$file.sai fastq/$file.fastq | \ | + | bin/bwa samse -r "@RG\tID:ILLUMINA\tSM:$file" ref/human_g1k_v37_chr20.fa bwa.sai/$file.sai fastq/$file.fastq | \ |
− | samtools view -uhS - | bin/samtools sort -m 2000000000 - bams/$file | + | bin/samtools view -uhS - | bin/samtools sort -m 2000000000 - bams/$file |
− | samtools index bams/$file.bam | + | bin/samtools index bams/$file.bam |
| end | | end |
| | | |
Line 118: |
Line 118: |
| for file in `ls fastq/SAMPLE*.fastq | cut -f 2 -d '/' | cut -f 1 -d '.'`; | | for file in `ls fastq/SAMPLE*.fastq | cut -f 2 -d '/' | cut -f 1 -d '.'`; |
| do echo $file; | | do echo $file; |
− | bwa aln -q 15 ref/human_g1k_v37_chr20.fa fastq/$file.fastq > bwa.sai/$file.sai; | + | bin/bwa aln -q 15 ref/human_g1k_v37_chr20.fa fastq/$file.fastq > bwa.sai/$file.sai; |
− | bwa samse -r "@RG\tID:ILLUMINA\tSM:$file" ref/human_g1k_v37_chr20.fa bwa.sai/$file.sai fastq/$file.fastq | \ | + | bin/bwa samse -r "@RG\tID:ILLUMINA\tSM:$file" ref/human_g1k_v37_chr20.fa bwa.sai/$file.sai fastq/$file.fastq | \ |
− | samtools view -uhS - | samtools sort -m 2000000000 - bams/$file; | + | bin/samtools view -uhS - | samtools sort -m 2000000000 - bams/$file; |
− | samtools index bams/$file.bam; | + | bin/samtools index bams/$file.bam; |
| done | | done |
| | | |
Line 137: |
Line 137: |
| | | |
| | | |
− | samtools mpileup -Iuf ref/human_g1k_v37_chr20.fa bams/SAMPLE*bam | bcftools view -bvcg - > result/chr20.mpileup.bcf | + | bin/samtools mpileup -Iuf ref/human_g1k_v37_chr20.fa bams/SAMPLE*bam | bcftools view -bvcg - > result/chr20.mpileup.bcf |
| | | |
− | bcftools view result/chr20.mpileup.bcf > result/chr20.mpileup.vcf | + | bin/bcftools view result/chr20.mpileup.bcf > result/chr20.mpileup.vcf |
| | | |
| | | |
Line 215: |
Line 215: |
| To complete our example analysis, we could run: | | To complete our example analysis, we could run: |
| | | |
− | TrioCaller --vcf result/chr20.mpileup.vcf --pedfile ped/triocaller.ped --states 50 --rounds 10 --prefix result/chr20.triocaller | + | bin/TrioCaller --vcf result/chr20.mpileup.vcf --pedfile ped/triocaller.ped --states 50 --rounds 10 --prefix result/chr20.triocaller |
| | | |
| The format of output file is same as the input file. Again, you can review the contents of the updated VCF file using the more command: | | The format of output file is same as the input file. Again, you can review the contents of the updated VCF file using the more command: |