Changes

From Genome Analysis Wiki
Jump to navigationJump to search
no edit summary
Line 55: Line 55:     
We looked at them yesterday, but you can take another look at the chromosome 22 reference files included for this tutorial:
 
We looked at them yesterday, but you can take another look at the chromosome 22 reference files included for this tutorial:
  ls ${REF}
+
  ls $REF
    
<ul>
 
<ul>
Line 71: Line 71:     
Look at the BAM index file the alignment pipeline generated
 
Look at the BAM index file the alignment pipeline generated
  cat ${OUT}/bam.index
+
  cat $OUT/bam.index
    
;What is the path to the BAM file for sample HG00640?
 
;What is the path to the BAM file for sample HG00640?
Line 88: Line 88:  
The alignment pipeline only processed 4 samples, but for snpcall, we want to run on 62 samples.
 
The alignment pipeline only processed 4 samples, but for snpcall, we want to run on 62 samples.
 
* The other 58 samples were already aligned:
 
* The other 58 samples were already aligned:
  ls ${IN}/bams
+
  ls $IN/bams
    
Look at the BAM index for those BAMs:
 
Look at the BAM index for those BAMs:
  less ${IN}/bams/bam.index
+
  less $IN/bams/bam.index
    
Remember, use <code>'q'</code> to exit out of <code>less</code>
 
Remember, use <code>'q'</code> to exit out of <code>less</code>
Line 114: Line 114:  
We need to add these BAMs to our index
 
We need to add these BAMs to our index
 
* Append the bam.index from the pre-aligned BAMs to the one you generated from the alignment pipeline
 
* Append the bam.index from the pre-aligned BAMs to the one you generated from the alignment pipeline
  cat ${IN}/bams/bam.index >> ${OUT}/bam.index
+
  cat $IN/bams/bam.index >> $OUT/bam.index
 
* ">>" will append to the file that follows it
 
* ">>" will append to the file that follows it
    
Verify your BAM index contains the additional BAMs
 
Verify your BAM index contains the additional BAMs
   less ${OUT}/bam.index
+
   less $OUT/bam.index
    
Remember, use <code>'q'</code> to exit out of <code>less</code>
 
Remember, use <code>'q'</code> to exit out of <code>less</code>
Line 149: Line 149:     
Now that we have all of our input files, we need just a simple command to run:
 
Now that we have all of our input files, we need just a simple command to run:
  ${GC}/gotcloud snpcall --conf ${IN}/gotcloud.conf --numjobs 4 --region 22:36000000-37000000
+
  $GC/gotcloud snpcall --conf $IN/gotcloud.conf --numjobs 4 --region 22:36000000-37000000
 
* --numjobs tells GotCloud how many jobs to run in parallel
 
* --numjobs tells GotCloud how many jobs to run in parallel
 
** Depends on your system
 
** Depends on your system
Line 169: Line 169:  
== Examining GotCloud SnpCall Output ==
 
== Examining GotCloud SnpCall Output ==
 
Let's look at the output directory:
 
Let's look at the output directory:
  ls ${OUT}
+
  ls $OUT
    
;Do you see any new files or directories?
 
;Do you see any new files or directories?
Line 180: Line 180:     
Let's look at the vcfs directory:
 
Let's look at the vcfs directory:
  ls ${OUT}/vcfs
+
  ls $OUT/vcfs
 
Just a <code>chr22</code> directory, so look inside of there:
 
Just a <code>chr22</code> directory, so look inside of there:
  ls ${OUT}/vcfs/chr22
+
  ls $OUT/vcfs/chr22
 
;Can you identify the final filtered VCF and the associated summary file?
 
;Can you identify the final filtered VCF and the associated summary file?
 
<div class="mw-collapsible mw-collapsed" style="width:350px">
 
<div class="mw-collapsible mw-collapsed" style="width:350px">
Line 196: Line 196:     
Now, let's look in the split directory for the VCF with just the passing variants:
 
Now, let's look in the split directory for the VCF with just the passing variants:
  ls ${OUT}/split/chr22
+
  ls $OUT/split/chr22
 
;Which file do you think is the one you want?
 
;Which file do you think is the one you want?
 
<div class="mw-collapsible mw-collapsed" style="width:350px">
 
<div class="mw-collapsible mw-collapsed" style="width:350px">
Line 210: Line 210:  
=== Filtering Summary Statistics ===
 
=== Filtering Summary Statistics ===
   −
  cat ${OUT}/vcfs/chr22/chr22.filtered.sites.vcf.summary
+
  cat $OUT/vcfs/chr22/chr22.filtered.sites.vcf.summary
    
<div class="mw-collapsible mw-collapsed" style="width:250px">
 
<div class="mw-collapsible mw-collapsed" style="width:250px">
Line 224: Line 224:     
Let's look at the filtered sites file.
 
Let's look at the filtered sites file.
  less -S ${OUT}/vcfs/chr22/chr22.filtered.sites.vcf
+
  less -S $OUT/vcfs/chr22/chr22.filtered.sites.vcf
    
* Scroll down until you find some variants.
 
* Scroll down until you find some variants.
Line 247: Line 247:     
Now, let's look at the filtered file with genotypes.
 
Now, let's look at the filtered file with genotypes.
  zless -S ${OUT}/vcfs/chr22/chr22.filtered.vcf.gz
+
  zless -S $OUT/vcfs/chr22/chr22.filtered.vcf.gz
    
* Scroll down until you find some variants.
 
* Scroll down until you find some variants.
Line 264: Line 264:     
Let's look at the file of just the pass sites:
 
Let's look at the file of just the pass sites:
  zless -S ${OUT}/split/chr22/chr22.filtered.PASS.vcf.gz
+
  zless -S $OUT/split/chr22/chr22.filtered.PASS.vcf.gz
    
* Scroll down: they all look like they <code>PASS</code>
 
* Scroll down: they all look like they <code>PASS</code>
    
Let's check if they are all PASS.
 
Let's check if they are all PASS.
  zcat ${OUT}/split/chr22/chr22.filtered.PASS.vcf.gz |grep -v "^#"| cut -f 7| grep -v "PASS"
+
  zcat $OUT/split/chr22/chr22.filtered.PASS.vcf.gz |grep -v "^#"| cut -f 7| grep -v "PASS"
 
It will return nothing since there are no non-passing variants in this file.
 
It will return nothing since there are no non-passing variants in this file.
 
<div class="mw-collapsible mw-collapsed" style="width:450px">
 
<div class="mw-collapsible mw-collapsed" style="width:450px">
Line 283: Line 283:     
Compare that to the filtered file we looked at before:
 
Compare that to the filtered file we looked at before:
  zcat ${OUT}/vcfs/chr22/chr22.filtered.vcf.gz |grep -v "^#"| cut -f 7| grep -v "PASS"
+
  zcat $OUT/vcfs/chr22/chr22.filtered.vcf.gz |grep -v "^#"| cut -f 7| grep -v "PASS"
 
;Do you see any filters?
 
;Do you see any filters?
 
<div class="mw-collapsible mw-collapsed" style="width:450px">
 
<div class="mw-collapsible mw-collapsed" style="width:450px">
Line 302: Line 302:     
=== Genotype Refinement Input ===
 
=== Genotype Refinement Input ===
The GotCloud genotype refinement pipeline takes as input ${OUT}/split/chr22/chr22.filtered.PASS.vcf.gz (the VCF file of PASS'ing SNPs from snpcall.
+
The GotCloud genotype refinement pipeline takes as input $OUT/split/chr22/chr22.filtered.PASS.vcf.gz (the VCF file of PASS'ing SNPs from snpcall.
    
The bam index and the configuration file we used for GotCloud snpcall will tell GotCloud genotype refinement everything it needs to know, so no new input files need to be prepared.
 
The bam index and the configuration file we used for GotCloud snpcall will tell GotCloud genotype refinement everything it needs to know, so no new input files need to be prepared.
Line 311: Line 311:  
=== Running GotCloud Genotype Refinement ===
 
=== Running GotCloud Genotype Refinement ===
 
Since everything is setup, just run the following command (very similar to snpcall).
 
Since everything is setup, just run the following command (very similar to snpcall).
  ${GC}/gotcloud ldrefine --conf ${IN}/gotcloud.conf --numjobs 2 --region 22:36000000-37000000
+
  $GC/gotcloud ldrefine --conf $IN/gotcloud.conf --numjobs 2 --region 22:36000000-37000000
    
* Beagle will take about 2-3 minutes to complete
 
* Beagle will take about 2-3 minutes to complete
Line 338: Line 338:     
Use tabix to extract that from the VCFs:
 
Use tabix to extract that from the VCFs:
  ${GC}/bin/tabix ${OUT}/thunder/chr22/ALL/thunder/chr22.filtered.PASS.beagled.ALL.thunder.vcf.gz 22:36907000-36907100 |less -S
+
  $GC/bin/tabix $OUT/thunder/chr22/ALL/thunder/chr22.filtered.PASS.beagled.ALL.thunder.vcf.gz 22:36907000-36907100 |less -S
    
Remember, type 'q' to quit less.
 
Remember, type 'q' to quit less.
Line 362: Line 362:  
;What is HG00551's genotype at these positions?
 
;What is HG00551's genotype at these positions?
 
#First check which sample number HG00551 is:
 
#First check which sample number HG00551 is:
  $GC/bin/tabix -H ${OUT}/thunder/chr22/ALL/thunder/chr22.filtered.PASS.beagled.ALL.thunder.vcf.gz
+
  $GC/bin/tabix -H $OUT/thunder/chr22/ALL/thunder/chr22.filtered.PASS.beagled.ALL.thunder.vcf.gz
 
* That will help you figure out it's genotype.
 
* That will help you figure out it's genotype.
 
* Rerun the tabix command and scroll to find HG00551's genotype:
 
* Rerun the tabix command and scroll to find HG00551's genotype:
  ${GC}/bin/tabix ${OUT}/thunder/chr22/ALL/thunder/chr22.filtered.PASS.beagled.ALL.thunder.vcf.gz 22:36907000-36907100 |less -S
+
  $GC/bin/tabix $OUT/thunder/chr22/ALL/thunder/chr22.filtered.PASS.beagled.ALL.thunder.vcf.gz 22:36907000-36907100 |less -S
    
<ul>
 
<ul>

Navigation menu