Line 7: |
Line 7: |
| </div> | | </div> |
| | | |
− | <div class="mw-collapsible" style="width:500px"> | + | <div class="mw-collapsible mw-collapsed" style="width:500px"> |
| == Tuesday - Start SNP Calling == | | == Tuesday - Start SNP Calling == |
| <div class="mw-collapsible-content"> | | <div class="mw-collapsible-content"> |
Line 43: |
Line 43: |
| * Relative path, so assumes running from your home directory (I prefer absolute paths, but for simplicity of the workshop, we just use relative path). | | * Relative path, so assumes running from your home directory (I prefer absolute paths, but for simplicity of the workshop, we just use relative path). |
| | | |
− | === Configuring Indel === | + | === Configuring SNPCALL === |
− | No special Configuration settings for SNP calling
| + | |
| cat ~/$SAMPLE/gotcloud.conf | | cat ~/$SAMPLE/gotcloud.conf |
| | | |
Line 104: |
Line 104: |
| <div class="mw-collapsible mw-collapsed" style="width:500px"> | | <div class="mw-collapsible mw-collapsed" style="width:500px"> |
| | | |
− | == Wednesday == | + | == Thursday == |
| <div class="mw-collapsible-content"> | | <div class="mw-collapsible-content"> |
| | | |
− | <div class="mw-collapsible" style="width:500px">
| |
| === Checking if snpcall Completed === | | === Checking if snpcall Completed === |
− | <div class="mw-collapsible-content">
| + | ==== Resume screen to Check Jobs ==== |
− | ==== Logging Back in to Check Jobs ==== | |
| | | |
| ;How do you log back into screen? | | ;How do you log back into screen? |
| screen -r | | screen -r |
| This will resume an already running screen. | | This will resume an already running screen. |
| + | |
| + | Your screen session still has your environment variables set, so you do not need to reset them. |
| + | |
| | | |
| Verify you got a "completed successfully" message. | | Verify you got a "completed successfully" message. |
Line 120: |
Line 121: |
| How long did snpcall take? Look at the log message, which reports the time in seconds. | | How long did snpcall take? Look at the log message, which reports the time in seconds. |
| | | |
− | === Detach From screen===
| |
− | Detach from screen. We will resume it again later when we start SNPCall.
| |
− | Ctrl-a d
| |
− |
| |
− | </div>
| |
− | </div>
| |
− |
| |
− | === INDEL Tutorial ===
| |
− | Now we are going to run the INDEL Practical
| |
− |
| |
− | Please go to: [[SeqShop: Variant Calling and Filtering for INDELs Practical, May 2015]]
| |
− |
| |
− | We will Start INDEL Calling after the practical.
| |
− |
| |
− |
| |
− | <div class="mw-collapsible" style="width:500px">
| |
− |
| |
− | === Start INDEL Calling ===
| |
− | <div class="mw-collapsible-content">
| |
− | ==== Resume screen ====
| |
− |
| |
− | ;How do you log back into screen?
| |
− | screen -r
| |
− | This will resume an already running screen.
| |
− |
| |
− | Your screen session still has your environment variables set, so you do not need to reset them.
| |
| | | |
| ==== List of BAMs ==== | | ==== List of BAMs ==== |
Line 155: |
Line 130: |
| :<code>SampleXX SampleXX/output/bams/SampleXX.recal.bam</code> | | :<code>SampleXX SampleXX/output/bams/SampleXX.recal.bam</code> |
| * Relative path, so assumes running from your home directory (I prefer absolute paths, but for simplicity of the workshop, we just use relative path). | | * Relative path, so assumes running from your home directory (I prefer absolute paths, but for simplicity of the workshop, we just use relative path). |
| + | |
| | | |
| ==== GotCloud INDEL Configuration ==== | | ==== GotCloud INDEL Configuration ==== |
Line 160: |
Line 136: |
| cat ~/$SAMPLE/gotcloud.conf | | cat ~/$SAMPLE/gotcloud.conf |
| | | |
− | Same as it looked yesterday with no special Configuration settings for INDEL calling. | + | Same as it looked the other day with no special Configuration settings for INDEL calling. |
| | | |
| ==== Running INDEL ==== | | ==== Running INDEL ==== |
− | Run GotCloud indel with 8 jobs running in parallel | + | Run GotCloud indel with 6 jobs running in parallel |
− | * Why 8?
| + | ${GC}/gotcloud indel --conf $SAMPLE/gotcloud.conf --numjobs 6 --outdir $OUT |
− | ** You want to run as many as you can.
| |
− | ** 3 of you on the machine - 3*8 = 24 jobs will be running in parallel on that machine
| |
− | ${GC}/gotcloud indel --conf $SAMPLE/gotcloud.conf --numjobs 8 | |
| * Only need the configuration, number of threads, and the output directory, rest is specified within the configuration. | | * Only need the configuration, number of threads, and the output directory, rest is specified within the configuration. |
| | | |
Line 181: |
Line 154: |
| exit PuTTY | | exit PuTTY |
| | | |
| + | === FEEDBACK! === |
| + | Please provide feedback for today. |
| + | https://docs.google.com/a/umich.edu/forms/d/1iES6usHxLB7Ec9hRxtqYgH7v05lU3Ume4VJcksx8Ogg/viewform |
| | | |
| </div> | | </div> |
| </div> | | </div> |
− | </div>
| |
− | </div>
| |
− |
| |
− | <div class="mw-collapsible mw-collapsed" style="width:500px">
| |
− |
| |
− | == Thursday ==
| |
− | <div class="mw-collapsible-content">
| |
− | === Checking if SnpCall Completed ===
| |
− | ==== Logging Back in to Check Jobs ====
| |
− |
| |
− | ;How do you log back into screen?
| |
− | screen -r
| |
− | This will resume an already running screen.
| |
− |
| |
− | ==== Checking Completion ====
| |
− |
| |
− | Did you get a "completed successfully" message?
| |
− |
| |
− | If yes, how long did SNP calling take? Look at the log message - time in seconds.
| |
− |
| |
− | If no, are you running on seqshop-server? If so, KILL it. ssh -X to one of the seqshop machines and run on there.
| |
− | Ctrl-c
| |
− |
| |
− | Detach from screen. We will resume it again later when we restart SNPCall (if necessary).
| |
− | Ctrl-a d
| |
− |
| |
− | === Ancestry Tutorial ===
| |
− | Now we are going to run the Structural Variation Practical
| |
− |
| |
− | Please go to: [[SeqShop: Estimates of Genetic Ancestry Practical, May 2015]]
| |
− |
| |
− | We will Resume SNP Calling (if necessary) after the practical.
| |
− |
| |
− | === Restart SnpCall ===
| |
− | ==== Resume screen ====
| |
− |
| |
− | ;How do you log back into screen?
| |
− | screen -r
| |
− | This will resume an already running screen.
| |
− |
| |
− | Your screen session still has your environment variables set, so you do not need to reset them.
| |
− |
| |
− | ==== Running SnpCall ====
| |
− | Run GotCloud snpcall with 6 jobs running in parallel
| |
− | ${GC}/gotcloud snpcall --conf $SAMPLE/gotcloud.conf --numjobs 6
| |
− | * GotCloud will pick up after the last completed step from before.
| |
− |
| |
− | This will run overnight. We will check if it completed at the practical in the morning.
| |
− |
| |
− | ==== Log Out ====
| |
− | ;Want to log out and leave your job running?
| |
− | In the screen window, type:
| |
− | Ctrl-a d
| |
− | (Hold down Ctrl and type 'a', let go of both and type 'd')
| |
− | * This will "detach" from your screen session while your alignment continues to run.
| |
− |
| |
− | exit PuTTY
| |
− |
| |
− |
| |
| </div> | | </div> |
| </div> | | </div> |
Line 249: |
Line 166: |
| | | |
| == Friday == | | == Friday == |
− | <div class="mw-collapsible-content mw-collapsed"> | + | <div class="mw-collapsible-content"> |
| | | |
| | | |
Line 255: |
Line 172: |
| [[SeqShop: Ancestry On Your Own Genome, May 2015]] | | [[SeqShop: Ancestry On Your Own Genome, May 2015]] |
| | | |
− | === Association Analysis Tutorial === | + | === Setup Variables === |
− | Now we are going to run the Association Analysis Practical
| + | Set these values. Also, be sure to specify your sample name instead of SampleXX |
| + | export SAMPLE=SampleXX |
| + | or |
| + | export SAMPLE=NA12878 |
| | | |
− | Please go to: [[SeqShop: Association Analysis, May 2015]]
| + | Point to your GotCloud & your output directory: |
− | | + | export GC=~/seqshop/gotcloud |
− | We will look at our own genomes again after the practical.
| + | export OUT=~/$SAMPLE/output |
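Before running any of the later commands, it can help to confirm that the variables above are actually set in the current shell. The following is a minimal sketch, not part of the workshop scripts: the function name check_vars is hypothetical, and it only checks that each named variable is non-empty, not that the paths exist.

```shell
# Report which of the named shell variables are unset or empty.
# check_vars is a hypothetical helper, not part of GotCloud.
check_vars() {
    missing=""
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Not set:$missing"
        return 1
    fi
    echo "All set"
}
```

Typical use after the exports above would be: check_vars SAMPLE GC OUT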
− | | |
− | === Return to SeqShop: Ancestry On Your Own Genome, May 2015 ===
| |
− | Return to [[SeqShop:_Ancestry_On_Your_Own_Genome,_May_2015#Checking_if_Pileup_finished]]
| |
| | | |
| === Reviewing Indel Results === | | === Reviewing Indel Results === |
− | Set these values. Also, be sure to specify your sample name (or NA12878) instead of SampleXX
| |
− | export SAMPLE=SampleXX
| |
− | source /net/seqshop-server/home/mktrost/seqshop/setupSS.txt
| |
− |
| |
| Look in the output directory | | Look in the output directory |
| ls ~/$SAMPLE/output | | ls ~/$SAMPLE/output |
Line 334: |
Line 247: |
| The insertion deletion ratio increases from 0.91 to 0.92. | | The insertion deletion ratio increases from 0.91 to 0.92. |
| | | |
| + | === Return to SeqShop: Ancestry On Your Own Genome, May 2015 === |
| + | Return to [[SeqShop:_Ancestry_On_Your_Own_Genome,_May_2015#Checking_if_Pileup_finished]] |
| | | |
| === Friday: Reviewing SNPCALL Results === | | === Friday: Reviewing SNPCALL Results === |
Line 343: |
Line 258: |
| | | |
| === Friday : More SNP Analysis === | | === Friday : More SNP Analysis === |
| + | In addition, set these environment variables to locate the binaries and reference files used for custom analysis |
| | | |
− | ==== Environmental Variables ==== | + | export HK=/net/seqshop-server/home/hmkang/apigenome/bin |
| + | export EPACTS=/net/seqshop-server/home/mktrost/seqshop/epacts/ |
| + | export REF=/net/seqshop-server/home/mktrost/seqshop/singleSample/ref/gotcloud.ref |
| + | export GC=~/seqshop/gotcloud |
| | | |
− | If you didn't set the environmental variable, you can set it again
| |
| | | |
− | source /net/seqshop-server/home/mktrost/seqshop/setup.txt
| + | export SAMPLE=SampleXX |
− | export SAMPLE=SampleXX (MAKE SURE TO CHANGE XX to your number or use NA12878 instead) | + | export OUT=~/$SAMPLE/output |
− | source /net/seqshop-server/home/mktrost/seqshop/setupSS.txt
| |
− | | |
− | In addition, set another environmental variable for locating the binaries for custom analysis
| |
− | | |
− | export HK=/net/seqshop-server/home/hmkang/seqshop/bin | |
| | | |
| ==== Annotation / Lookup against dbSNP ==== | | ==== Annotation / Lookup against dbSNP ==== |
Line 360: |
Line 273: |
| If you want to add rsIDs to your variant files, you can do this by running the following command | | If you want to add rsIDs to your variant files, you can do this by running the following command |
| | | |
− | $HK/vcf-add-rsid -vcf $OUT/vcfs/chr1/chr1.filtered.vcf.gz --db $HK/../data/dbSNP.b138/dbsnp_138.b37.vcf.gz --out $OUT/vcfs/chr1/chr1.filtered.rsid.vcf.gz | + | $HK/vcf-add-rsid -vcf $OUT/vcfs/chr1/chr1.filtered.vcf.gz --db $HK/../data/dbsnp_142.b37.vcf.gz --out $OUT/vcfs/chr1/chr1.filtered.rsid.vcf.gz |
| | | |
| If you want to run this command across all chromosomes in parallel, you can use the special script run-command-wgs | | If you want to run this command across all chromosomes in parallel, you can use the special script run-command-wgs |
| | | |
− | $HK/run-command-wgs --cmd "$HK/vcf-add-rsid -vcf $OUT/vcfs/chr1/chr1.filtered.vcf.gz --db $HK/../data/dbSNP.b138/dbsnp_138.b37.vcf.gz --out $OUT/vcfs/chr1/chr1.filtered.rsid.vcf.gz" --numjobs 6 | + | $HK/run-make --repeat-chr --cmd "$HK/vcf-add-rsid -vcf $OUT/vcfs/chr1/chr1.filtered.vcf.gz --db $HK/../data/dbsnp_142.b37.vcf.gz --out $OUT/vcfs/chr1/chr1.filtered.rsid.vcf.gz" --numjobs 6 --out runmake.rsid |
| | | |
− | Looking up SNPs by rsID is possible by (for example) | + | You can look up a SNP by its rsID (for example, rs17766217). How can we find its position? |
− | $HK/vcf-lookup-rsid --vcf $OUT/vcfs/chr1/chr1.filtered.vcf.gz --sepchr --rs rs17766217 | + | $HK/tabix $OUT/vcfs/chr8/chr8.filtered.rsid.vcf.gz 8:128504497 | less |
| * Be sure to look at the QUAL & your sample's PL, and not just the GL field. Check if QUAL is 0 or PL is 0,0,0 - NS is also probably 0; DP is probably 0. That means you probably didn't have any copies, so your GT may not be correct/is unknown. | | * Be sure to look at the QUAL & your sample's PL, and not just the GL field. Check if QUAL is 0 or PL is 0,0,0 - NS is also probably 0; DP is probably 0. That means you probably didn't have any copies, so your GT may not be correct/is unknown. |
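One way to answer the position question is to search the ID column of a VCF for the rsID. The sketch below assumes an uncompressed VCF on standard input with the rsID in column 3; the function name rsid_to_pos is hypothetical, and records that list several IDs separated by ';' are not handled.

```shell
# Print CHROM:POS for every record whose ID column (column 3) equals the rsID.
# Reads an uncompressed VCF on stdin; header lines starting with '#' are skipped.
# rsid_to_pos is a hypothetical helper, not one of the $HK tools.
rsid_to_pos() {
    awk -F'\t' -v id="$1" '!/^#/ && $3 == id { print $1 ":" $2 }'
}
```

For example, assuming the dbSNP file used earlier is readable with zcat: zcat $HK/../data/dbsnp_142.b37.vcf.gz | rsid_to_pos rs17766217. The printed position can then be passed to tabix as in the command above.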
| | | |
| If you want to browse the rsIDs of known GWAS SNPs, you can do this by | | If you want to browse the rsIDs of known GWAS SNPs, you can do this by |
− | cut -f 1,8,22 $HK/../data/gwascatalog/gwascatalog.txt | less | + | cut -f 1,8,12,13,22 $HK/../data/gwascatalog/gwascatalog.txt | grep -w rs17766217 |
| | | |
| ==== Annotating your genome ==== | | ==== Annotating your genome ==== |
Line 379: |
Line 292: |
| | | |
| Or you can run multiple chromosomes in parallel in one command | | Or you can run multiple chromosomes in parallel in one command |
− | $HK/run-command-wgs --cmd "$EPACTS/bin/epacts anno --in $OUT/vcfs/chr1/chr1.filtered.rsid.vcf.gz --out $OUT/vcfs/chr1/chr1.filtered.rsid.anno.vcf.gz" --numjobs 6 | + | $HK/run-make --repeat-chr --cmd "$EPACTS/bin/epacts anno --in $OUT/vcfs/chr1/chr1.filtered.rsid.vcf.gz --out $OUT/vcfs/chr1/chr1.filtered.rsid.anno.vcf.gz" --numjobs 6 --out runmake.anno |
− |
| + | |
| ==== Extracting only exonic SNPs ==== | | ==== Extracting only exonic SNPs ==== |
| | | |
| If you want to look at the exonic SNPs, you can extract using the following command | | If you want to look at the exonic SNPs, you can extract using the following command |
− | $HK/run-command-wgs --cmd "($HK/tabix -H $OUT/vcfs/chr1/chr1.filtered.rsid.anno.vcf.gz; zcat $OUT/vcfs/chr1/chr1.filtered.rsid.anno.vcf.gz | grep Exon;)| $HK/bgzip -c > $OUT/vcfs/chr1/chr1.filtered.rsid.anno.exon.vcf.gz" --numjobs 6 | + | $HK/run-make --repeat-chr --cmd "($HK/tabix -H $OUT/vcfs/chr1/chr1.filtered.rsid.anno.vcf.gz; zcat $OUT/vcfs/chr1/chr1.filtered.rsid.anno.vcf.gz | grep Exon;)| $HK/bgzip -c > $OUT/vcfs/chr1/chr1.filtered.rsid.anno.exon.vcf.gz" --numjobs 6 --out runmake.exome |
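The filter inside that command keeps the header and then every line containing the string Exon. A minimal sketch of the same idea in a single pass is shown here, assuming (as the grep implies) that exonic records contain the text Exon somewhere on the line. The function name exonic_only is hypothetical. Unlike a plain grep, it prints each header line exactly once even if a header description happens to mention Exon.

```shell
# Keep all VCF header lines, plus only those records that mention "Exon".
# Reads an uncompressed VCF on stdin and writes the filtered VCF to stdout.
# exonic_only is a hypothetical helper for illustration.
exonic_only() {
    awk '/^#/ { print; next }   # header line: print once, skip the record test
         /Exon/ { print }       # record line: keep only if it mentions Exon
        '
}
```

With compressed input this would sit between zcat and bgzip, for example: zcat in.vcf.gz | exonic_only | $HK/bgzip -c > out.vcf.gz (in.vcf.gz and out.vcf.gz are placeholder names).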
| | | |
| And they can be combined as follows | | And they can be combined as follows |
| (zcat $OUT/vcfs/chr1/chr1.filtered.rsid.anno.exon.vcf.gz; zcat $OUT/vcfs/chr[2-9]/chr*.filtered.rsid.anno.exon.vcf.gz $OUT/vcfs/chr??/chr*.filtered.rsid.anno.exon.vcf.gz $OUT/vcfs/chrX/chrX.filtered.rsid.anno.exon.vcf.gz | grep -v ^#) | $HK/bgzip -c > $OUT/wgs.filtered.rsid.anno.exon.vcf.gz | | (zcat $OUT/vcfs/chr1/chr1.filtered.rsid.anno.exon.vcf.gz; zcat $OUT/vcfs/chr[2-9]/chr*.filtered.rsid.anno.exon.vcf.gz $OUT/vcfs/chr??/chr*.filtered.rsid.anno.exon.vcf.gz $OUT/vcfs/chrX/chrX.filtered.rsid.anno.exon.vcf.gz | grep -v ^#) | $HK/bgzip -c > $OUT/wgs.filtered.rsid.anno.exon.vcf.gz |
| + | $HK/tabix -pvcf $OUT/wgs.filtered.rsid.anno.exon.vcf.gz |
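The combining step above works by printing the first file in full and then only the non-header lines of the remaining files, so the merged VCF has a single header. The sketch below shows that pattern on uncompressed files; the function name concat_vcfs is hypothetical, and it assumes all inputs share the same header and are passed in the desired chromosome order.

```shell
# Concatenate VCFs given as arguments: the first file in full, then only
# the record lines (no '#' header lines) of every remaining file.
# concat_vcfs is a hypothetical helper for illustration.
concat_vcfs() {
    first="$1"
    shift
    cat "$first"
    for f in "$@"; do
        # grep exits with status 1 when a file has no records; ignore that.
        grep -v '^#' "$f" || true
    done
}
```

For the compressed per-chromosome files, the inputs would first be decompressed with zcat and the output piped through $HK/bgzip -c, as in the command above.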
| | | |
| ==== Exonic Variants NOT found by 1000G ==== | | ==== Exonic Variants NOT found by 1000G ==== |
Line 411: |
Line 325: |
| | | |
| Want to see this from the BAM file? Use samtools tview: | | Want to see this from the BAM file? Use samtools tview: |
− | $GC/bin/samtools tview $SAMPLE/output/bams/$SAMPLE.recal.bam $GC/gotcloud.ref/human.g1k.v37.fa | + | $GC/bin/samtools tview $SAMPLE/output/bams/$SAMPLE.recal.bam $REF/hs37d5.fa |
| Use 'g' & enter the Chr:Pos | | Use 'g' & enter the Chr:Pos |
| * Some patterns may indicate that a variant is not real. | | * Some patterns may indicate that a variant is not real. |
Line 420: |
Line 334: |
| | | |
| The phred score at the last column quantifies the degree of functional significance | | The phred score at the last column quantifies the degree of functional significance |
− |
| |
| | | |
| | | |
| </div> | | </div> |
| </div> | | </div> |
| + | |
| + | == OVERALL COURSE FEEDBACK! == |
| + | Please provide feedback: |
| + | https://docs.google.com/forms/d/1pxfPXKwWfA71ZJM99Sevs3MwAUz2UbHAR8dnRI-kRNM/viewform |