Changes

From Genome Analysis Wiki
Line 94: Line 94:  
exit PuTTY
 
exit PuTTY
   −
=== Tuesday FEEDBACK! ===
  −
Please provide feedback on the lectures/tutorials from today:
  −
  −
https://docs.google.com/forms/d/1n8xYxvsOq-HsabpDfGcHvwD84BYIRDx8_b-H5N3d-D8/viewform
   
</div>
 
</div>
 
</div>
 
</div>
    
<div class="mw-collapsible mw-collapsed" style="width:500px">
 
<div class="mw-collapsible mw-collapsed" style="width:500px">
 +
 
== Wednesday ==
 
== Wednesday ==
 
<div class="mw-collapsible-content">
 
<div class="mw-collapsible-content">
Line 233: Line 230:  
exit PuTTY
 
exit PuTTY
   −
==== Wednesday FEEDBACK! ====
  −
Please provide feedback on the lectures/tutorials from today:
  −
  −
https://docs.google.com/a/umich.edu/forms/d/1CCHL9ODPsw4jX4hj0kGo6AMHwT4Gam0IpKNRnIR9yMk/viewform
      
</div>
 
</div>
Line 244: Line 237:     
<div class="mw-collapsible mw-collapsed" style="width:500px">
 
<div class="mw-collapsible mw-collapsed" style="width:500px">
 +
 
== Thursday ==
 
== Thursday ==
 
<div class="mw-collapsible-content">
 
<div class="mw-collapsible-content">
Line 302: Line 296:  
</div>
 
</div>
   −
<div class="mw-collapsible" style="width:500px">
+
<div class="mw-collapsible" style="width:1000px">
    
== Friday ==
 
== Friday ==
Line 400: Line 394:  
  ls ~/$SAMPLE/output/vcfs
 
  ls ~/$SAMPLE/output/vcfs
   −
=== More SNP Analysis ===
+
=== Friday : More SNP Analysis ===
    
==== Environmental Variables ====
 
==== Environmental Variables ====
Line 418: Line 412:  
If you want to add rsIDs to your variant files, you can do this by running the following command
 
If you want to add rsIDs to your variant files, you can do this by running the following command
   −
  $HK/vcf-add-rsid -vcf $OUT/vcfs/chr1/chr1.filtered.vcf.gz --db $HK../data/dbSNP.b138/dbsnp_138.b37.vcf.gz --out $OUT/vcfs/chr1/chr1.filtered.rsid.vcf.gz
+
  $HK/vcf-add-rsid -vcf $OUT/vcfs/chr1/chr1.filtered.vcf.gz --db $HK/../data/dbSNP.b138/dbsnp_138.b37.vcf.gz --out $OUT/vcfs/chr1/chr1.filtered.rsid.vcf.gz
 
   
 
   
 
If you want to run this command across all chromosomes in parallel, you can use the special script run-command-wgs
 
If you want to run this command across all chromosomes in parallel, you can use the special script run-command-wgs
Line 426: Line 420:  
Looking up SNPs by rsID is possible by (for example)
 
Looking up SNPs by rsID is possible by (for example)
 
  $HK/vcf-lookup-rsid --vcf $OUT/vcfs/chr1/chr1.filtered.vcf.gz --sepchr --rs rs17766217
 
  $HK/vcf-lookup-rsid --vcf $OUT/vcfs/chr1/chr1.filtered.vcf.gz --sepchr --rs rs17766217
 +
* Be sure to look at QUAL and your sample's PL, not just the GL field. If QUAL is 0 or PL is 0,0,0, then NS and DP are probably 0 as well. That means there were probably no reads (copies) covering the site, so your GT may be incorrect or unknown.
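The QUAL and PL check described here can be sketched as a small shell helper. This is a minimal sketch, not part of the workshop tools: the function name flag_uninformative is hypothetical, and it assumes a single-sample VCF whose FORMAT column lists a PL key.

```shell
# Hypothetical helper (not part of the workshop scripts): reads uncompressed
# VCF text on stdin and prints CHROM, POS, ID, QUAL and PL for records where
# QUAL is 0 or the first sample's PL is 0,0,0.
flag_uninformative() {
  awk -F'\t' '
    /^#/ { next }                        # skip header lines
    {
      pl = "NA"                          # reported when FORMAT has no PL key
      n = split($9, fmt, ":")            # FORMAT keys, e.g. GT:DP:PL
      split($10, val, ":")               # values for the first sample
      for (i = 1; i <= n; i++)
        if (fmt[i] == "PL") pl = val[i]
      # A missing QUAL (".") is not treated as zero.
      if (($6 != "." && $6 + 0 == 0) || pl == "0,0,0")
        print $1 "\t" $2 "\t" $3 "\t" $6 "\t" pl
    }'
}
```

It could be used as, for example, zcat $OUT/vcfs/chr1/chr1.filtered.vcf.gz | flag_uninformative | less. Any site it lists is one where the genotype call should be treated with caution.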
    
If you want to browse the rsIDs of known GWAS SNPs, you can do this by
 
If you want to browse the rsIDs of known GWAS SNPs, you can do this by
Line 444: Line 439:     
And they can be combined as follows
 
And they can be combined as follows
  (zcat $OUT/vcfs/chr1/chr1.filtered.rsid.anno.exon.vcf.gz; zcat $OUT/vcfs/chr[2-9]/chr*.filtered.rsid.anno.exon.vcf.gz $OUT/vcfs/chr??/chr*.filtered.rsid.anno.exon.vcf.gz $OUT/vcfs/chrX/chrX.filtered.rsid.anno.exon.vcf.gz) | $HK/bgzip -c > $OUT/wgs.filtered.rsid.anno.exon.vcf.gz
+
  (zcat $OUT/vcfs/chr1/chr1.filtered.rsid.anno.exon.vcf.gz; zcat $OUT/vcfs/chr[2-9]/chr*.filtered.rsid.anno.exon.vcf.gz $OUT/vcfs/chr??/chr*.filtered.rsid.anno.exon.vcf.gz $OUT/vcfs/chrX/chrX.filtered.rsid.anno.exon.vcf.gz | grep -v ^#) | $HK/bgzip -c > $OUT/wgs.filtered.rsid.anno.exon.vcf.gz
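The idea behind this command (keep the header from the first file, strip header lines from the rest) can be sketched as a reusable function. This is only an illustration under stated assumptions: the name concat_vcfs is hypothetical, gzip -dc stands in for zcat, and awk is used in place of grep -v so that a file with no data lines does not yield a non-zero exit status.

```shell
# Hypothetical helper (not part of the workshop scripts): concatenate the
# gzip-compressed VCFs given as arguments, in the order given, keeping the
# header lines of the first file only.
concat_vcfs() {
  first=$1
  shift
  gzip -dc "$first"                  # first file: header and records
  for f in "$@"; do
    gzip -dc "$f" | awk '!/^#/'      # remaining files: records only
  done
}
```

The output is plain text, so it would still need to be piped through $HK/bgzip -c as in the command above.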
+
 
 
==== Exonic Variants NOT found by 1000G ====
 
==== Exonic Variants NOT found by 1000G ====
    
If you are interested in rare variants that are not identified by 1000G, you can extract them by running
 
If you are interested in rare variants that are not identified by 1000G, you can extract them by running
  zcat $OUT/wgs.filtered.rsid.anno.exon.vcf.gz | grep "EXTFILTER=NA,NA" | grep -v -w "0/0" | grep -v "less
+
  zcat $OUT/wgs.filtered.rsid.anno.exon.vcf.gz | grep "EXTFILTER=NA,NA" | grep -v -w "0/0" | less
 
   
 
   
 
For example,  
 
For example,  
Line 459: Line 454:  
* Q2. Looking at each functional category, which category has the largest fraction of SNPs that failed the filter? Why do you think that is?
 
* Q2. Looking at each functional category, which category has the largest fraction of SNPs that failed the filter? Why do you think that is?
 
* Q3. Can you exclude the sites that are also in dbSNP, and count how many nonsense variants are left?
 
* Q3. Can you exclude the sites that are also in dbSNP, and count how many nonsense variants are left?
 +
 +
 +
To also exclude sites that are in dbSNP:
 +
zcat $OUT/wgs.filtered.rsid.anno.exon.vcf.gz | grep "EXTFILTER=NA,NA" | grep -v -w "0/0" | grep -v rs| perl -lane 'print "$1\t$F[6]" if ( /ANNO=([^;:]+)/)' | sort | uniq -c
 +
 +
To exclude dbSNP sites and look only at Stop_Gain variants:
 +
zcat $OUT/wgs.filtered.rsid.anno.exon.vcf.gz | grep "EXTFILTER=NA,NA" | grep -v -w "0/0" |grep -v rs | perl -lane 'print "$_" if ( /ANNO=Stop_Gain/)' |grep -w PASS
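Note that grep -v rs drops any line containing the letters "rs" anywhere, so it could in principle also remove records whose annotation text happens to contain them. A stricter variant tests only the ID column. The sketch below is an assumption-laden illustration, not one of the workshop scripts: the function name drop_dbsnp_ids is hypothetical, and it assumes rsIDs appear in column 3, possibly separated by semicolons.

```shell
# Hypothetical helper (not part of the workshop scripts): reads VCF text on
# stdin and drops header lines plus any record whose ID column (column 3)
# contains an rsID.
drop_dbsnp_ids() {
  awk -F'\t' '!/^#/ && $3 !~ /(^|;)rs[0-9]/'
}
```

It could take the place of grep -v rs in the pipelines above, for example: zcat $OUT/wgs.filtered.rsid.anno.exon.vcf.gz | grep "EXTFILTER=NA,NA" | grep -v -w "0/0" | drop_dbsnp_ids | perl -lane 'print "$_" if ( /ANNO=Stop_Gain/)' | grep -w PASS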
 +
 +
Want to see this from the BAM file?  Use samtools tview:
 +
$GC/bin/samtools tview $SAMPLE/output/bams/$SAMPLE.recal.bam $GC/gotcloud.ref/human.g1k.v37.fa
 +
Press 'g' and enter the position as Chr:Pos
 +
* Some patterns may indicate that a variant is not real.
    
If you want to know predicted functional significance of a particular variant, you can search by
 
If you want to know predicted functional significance of a particular variant, you can search by
Line 467: Line 474:       −
===== WORKSHOP FEEDBACK! =====
  −
Please provide feedback on the workshop in general:
     −
https://docs.google.com/forms/d/1f8HjTKvxgYuApl9dLNqbW9Su3N3v9jtMpEDcbPJ1PYk/viewform
   
</div>
 
</div>
 
</div>
 
</div>
