Changes

From Genome Analysis Wiki
5,438 bytes added ,  12:55, 31 October 2014
no edit summary
Line 1: Line 1:  
__TOC__
 
__TOC__
   −
== Initial Setup ==
+
Back to [[GotCloud]]
After creating an amazon account (https://aws.amazon.com/), you can run the GotCloud demo on Amazon.
     −
== Starting GotCloud AMI ==
+
Back to [[GotCloud: Amazon]]
The GotCloud AMI has the Demo data built into it.
+
 
 +
== Introduction ==
 +
This Amazon demo runs through the GotCloud SNP and INDEL calling pipelines.
 +
 
 +
The data used for this demo comes from our sequencing workshop demos. Alignment and structural variation demos are also available.
 +
 
 +
Links to the general GotCloud Demos (originally from our sequencing workshop):
 +
* [[SeqShop: Sequence Mapping and Assembly Practical]]
 +
* [[SeqShop: Variant Calling and Filtering for SNPs Practical]]
 +
* [[SeqShop: Variant Calling and Filtering for INDELs Practical]]
 +
* [[SeqShop: Analysis of Structural Variation Practical]]
 +
 
 +
 
 +
To run this Demo on an Amazon Cluster rather than on a single node, see: [[StarCluster#Run_GotCloud_Demo_Using_StarCluster|StarCluster -> Run GotCloud Demo Using StarCluster]]
 +
 
 +
 
 +
== Starting up a Node ==
 +
See [[Amazon Single Node]] for instructions on starting a node and getting a terminal running.
 +
* For the demo, we recommend using a <code>c3.2xlarge</code> instance.
 +
 
 +
== Running the Demo on Already Running Node ==
 +
{{GotCloud: Amazon Demo Setup}}
 +
 
 +
=== Run GotCloud SnpCall ===
 +
Now that we have examined the instance files, run GotCloud snpcall:
 +
# <pre>gotcloud snpcall --conf example/test.conf --outdir output --numjobs 8</pre>
 +
#* The ubuntu user is set up with the gotcloud program and tools in its path, so you can type just the program name and it will be found.
 +
#: [[File:RunSnpCall.png|700px]]
 +
#:* This will take a few minutes to run.
 +
#:* GotCloud first generates a makefile, and then runs the makefile
 +
#:* After a while GotCloud snpcall will print some messages to the screen. This is expected and ok.
 +
# When complete, GotCloud snpcall will indicate success/failure
 +
#:[[File:SnpcallSuccess.png|700px]]
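GotCloud reports success or failure on the screen when it finishes. When scripting the run, the same information can be taken from the exit status. The following is a minimal sketch, assuming <code>gotcloud</code> exits with a non-zero status on failure (an assumption, not confirmed on this page); <code>run_step</code> is a hypothetical helper name, not part of GotCloud:

```shell
#!/bin/sh
# run_step: hypothetical helper that runs a command and reports its exit status.
# Usage: run_step LABEL COMMAND [ARGS...]
run_step() {
    label=$1
    shift
    if "$@"; then
        echo "$label: success"
    else
        status=$?
        echo "$label: FAILED (exit status $status)" >&2
        return "$status"
    fi
}

# Only attempt the real run where gotcloud is on the PATH (e.g. on the AMI).
if command -v gotcloud >/dev/null 2>&1; then
    run_step snpcall gotcloud snpcall --conf example/test.conf --outdir output --numjobs 8
fi
```

The wrapper passes its arguments through unchanged, so the command line is the same one shown in the step above.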
 +
 
 +
==== Examining SnpCall Output ====
 +
# Run <code>ls</code> in the ubuntu home directory to see the new output directory:
 +
#:<pre>ls</pre>
 +
#:[[File:SnpCallOutput1.png|500px]]
 +
# Look inside the output directory:
 +
#:<pre>ls output</pre>
 +
#:[[File:SnpCallOutput2.png|700px]]
 +
#:#<code>glfs</code> - intermediate per sample genotype likelihood files
 +
#:#<code>jobfiles</code> - empty; was used to store commands as GotCloud was running
 +
#:#<code>pvcfs</code> - intermediate vcfs with per sample information
 +
#:#<code>split</code> - contains per chromosome directories with vcfs containing PASS only snps split up as required for beagle (part of ldrefine)
 +
#:#<code>target</code> - contains the bed with the region to be processed
 +
#:#<code>umake.snpcall.conf</code> - file containing all of the configuration settings used for this run of GotCloud
 +
#:#<code>umake.snpcall.Makefile</code> - Makefile containing commands for this run of GotCloud
 +
#:#<code>umake.snpcall.Makefile.cluster</code> - Makefile log of start/stop times of various steps
 +
#:#<code>umake.snpcall.Makefile.log</code> - log of the GotCloud run
 +
#:#<code>vcfs</code> - contains per chromosome directories with vcfs
 +
#:#* important output is
 +
#:#*# filtered.vcf.gz file: <code>vcfs/chr22/chr22.filtered.vcf.gz</code>
 +
#:#*# summary information: <code>vcfs/chr22/chr22.filtered.sites.vcf.summary</code>
 +
# Look at the filtered file.
 +
#: <pre>zless -S output/vcfs/chr22/chr22.filtered.vcf.gz</pre>
 +
#: Scroll down some: [[File:SnpVcfOutput.png|700px]]
 +
## Check that GotCloud found the expected SNP at 22:36661906
 +
##:<pre>tabix output/vcfs/chr22/chr22.filtered.vcf.gz 22:36661906 |head -1</pre>
 +
#:[[File:SnpTabix.png|700px]]
 +
# Look at the summary information
 +
#:<pre>cat output/vcfs/chr22/chr22.filtered.sites.vcf.summary</pre>
 +
#:[[File:SnpSummary.png|700px]]
 +
#:*'''To understand how to interpret the filtering summary statistics, please refer to [[Understanding vcf-summary output]]'''
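As a rough cross-check of the summary file, the FILTER values can be tallied directly from the VCF. The following is a minimal sketch, assuming only the standard VCF layout in which column 7 is FILTER; <code>count_filters</code> is a hypothetical helper and is not part of GotCloud:

```shell
#!/bin/sh
# count_filters: hypothetical helper; reads an uncompressed VCF on stdin
# and prints each FILTER value with its record count, tab-separated.
count_filters() {
    # Skip header lines (starting with '#') and tally column 7 (FILTER).
    awk -F'\t' '!/^#/ { n[$7]++ } END { for (f in n) print f "\t" n[f] }' | sort
}

# On the demo instance, assuming the snpcall output from the steps above exists:
if [ -f output/vcfs/chr22/chr22.filtered.vcf.gz ]; then
    gzip -dc output/vcfs/chr22/chr22.filtered.vcf.gz | count_filters
fi
```

This only counts records per FILTER value; it does not reproduce the other statistics in the summary file.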
 +
 
 +
 
 +
=== Run GotCloud Indel ===
 +
Now that we have examined the instance files, run GotCloud indel:
 +
# <pre>gotcloud indel --conf example/test.conf --outdir output --numjobs 8</pre>
 +
#* The ubuntu user is set up with the gotcloud program and tools in its path, so you can type just the program name and it will be found.
 +
#: [[File:RunIndel.png|700px]]
 +
#:* This will take a few minutes to run.
 +
#:* GotCloud first generates a makefile, and then runs the makefile
 +
# When complete, GotCloud indel will indicate success/failure
 +
#:[[File:IndelSuccess.png|700px]]
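The demo passes <code>--numjobs 8</code>, which matches the 8 vCPUs of the recommended <code>c3.2xlarge</code> instance. On a different instance type, the job count could be derived from the machine instead. The following is a minimal sketch, assuming one job per CPU is a reasonable default (an assumption, not a GotCloud recommendation); <code>detect_numjobs</code> is a hypothetical helper:

```shell
#!/bin/sh
# detect_numjobs: hypothetical helper that prints the number of online CPUs.
detect_numjobs() {
    # Prefer nproc (GNU coreutils); fall back to getconf, then to 1.
    n=$(nproc 2>/dev/null) || n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    echo "$n"
}

numjobs=$(detect_numjobs)
# Print the command rather than running it, so it can be reviewed first.
echo "gotcloud indel --conf example/test.conf --outdir output --numjobs $numjobs"
```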
 +
 
 +
==== Examining Indel Output ====
 +
# Look inside the output directory to see the new indel directories:
 +
#:<pre>ls output</pre>
 +
#:[[File:IndelSnpOutput.png|900px]]
 +
#:#<code>aux</code> - intermediate indel files
 +
#:#<code>final</code> - final output of Indel pipeline
 +
#:#<code>indelvcf</code> - intermediate indel files
 +
#:#<code>gotcloud.indel.conf</code> - file containing all of the configuration settings used for this run of GotCloud
 +
#:#<code>gotcloud.indel.Makefile</code> - Makefile containing commands for this run of GotCloud
 +
#:#<code>gotcloud.indel.Makefile.log</code> - log of the GotCloud run
 +
# Important Indel output is in <code>output/final</code>
 +
#:<pre> ls output/final/</pre>
 +
#: [[File:IndelFinal.png|900px]]
 +
# Look at the final Indel VCF file.
 +
#: <pre>zless output/final/all.genotypes.vcf.gz</pre>
 +
#: Scroll down some: [[File:IndelLess.png|900px]]
 +
## Check that GotCloud found the expected Indel at 22:36662041
 +
##:<pre>tabix output/final/all.genotypes.vcf.gz 22:36662041-36662041|less -S</pre>
 +
##:[[File:IndelTabix.png|900px]]
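The <code>tabix</code> lookup above relies on the index stored alongside the VCF. Where no index is available, a similar check can be approximated by scanning the file. The following is a minimal sketch, assuming the standard VCF columns (CHROM in column 1, POS in column 2); <code>vcf_has_site</code> is a hypothetical helper. It matches only records that start exactly at the given position, whereas a <code>tabix</code> region query also returns records overlapping it, and a linear scan is much slower on large files:

```shell
#!/bin/sh
# vcf_has_site: hypothetical helper; reads an uncompressed VCF on stdin.
# Usage: vcf_has_site CHROM POS < file.vcf
# Exits 0 if a record starts exactly at CHROM:POS, 1 otherwise.
vcf_has_site() {
    awk -F'\t' -v chrom="$1" -v pos="$2" \
        '!/^#/ && $1 == chrom && $2 == pos { found = 1; exit } END { exit !found }'
}

# On the demo instance, assuming the indel output from the steps above exists:
if [ -f output/final/all.genotypes.vcf.gz ]; then
    if gzip -dc output/final/all.genotypes.vcf.gz | vcf_has_site 22 36662041; then
        echo "Indel found at 22:36662041"
    else
        echo "No record starts at 22:36662041"
    fi
fi
```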
 +
 
 +
== Exit & Terminate ==
 +
Before terminating, copy any data you want to keep off the root EBS volume attached to the instance; the volume is deleted when the instance terminates.
 +
# Exit from your terminal
 +
# Terminate your Amazon Instance
 +
#* Right click on the instance you want to terminate in the EC2 dashboard
 +
#* Select Terminate
 +
#* Select "Yes, Terminate" to confirm that you want to terminate the instance and delete its storage.
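
Before terminating, results can be bundled into a single archive so they are easier to copy off the instance. The following is a minimal sketch; <code>bundle_output</code> is a hypothetical helper, and the key file and host name in the commented <code>scp</code> line are placeholders, not values from this demo:

```shell
#!/bin/sh
# bundle_output: hypothetical helper that packs a directory into a .tar.gz.
# Usage: bundle_output DIR ARCHIVE
bundle_output() {
    dir=$1
    archive=$2
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # Archive relative to the parent so the tarball holds just the directory name.
    tar -czf "$archive" -C "$(dirname "$dir")" "$(basename "$dir")"
}

# On the instance:
#   bundle_output output gotcloud_output.tar.gz
# Then, from your local machine (placeholders: key file and public DNS name):
#   scp -i mykey.pem ubuntu@<instance-public-dns>:gotcloud_output.tar.gz .
```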
