Line 1: |
Line 1: |
| __TOC__ | | __TOC__ |
| | | |
− | == Initial Setup ==
| + | Back to [[GotCloud]] |
− | After creating an amazon account (https://aws.amazon.com/), you can run the GotCloud demo on Amazon.
| |
| | | |
− | == Starting GotCloud AMI == | + | Back to [[GotCloud: Amazon]] |
− | The GotCloud AMI has the Demo data built into it. | + | |
| + | == Introduction == |
| + | This Amazon demo runs through the GotCloud SNP and INDEL calling pipelines. |
| + | |
| + | The data used for this demo is originally from our sequencing workshop demos. We also have alignment and structural variation demos. |
| + | |
| + | Links to the general GotCloud Demos (originally from our sequencing workshop): |
| + | * [[SeqShop: Sequence Mapping and Assembly Practical]] |
| + | * [[SeqShop: Variant Calling and Filtering for SNPs Practical]] |
| + | * [[SeqShop: Variant Calling and Filtering for INDELs Practical]] |
| + | * [[SeqShop: Analysis of Structural Variation Practical]] |
| + | |
| + | |
| + | To run this Demo on an Amazon Cluster rather than on a single node, see: [[StarCluster#Run_GotCloud_Demo_Using_StarCluster|StarCluster -> Run GotCloud Demo Using StarCluster]] |
| + | |
| + | |
| + | == Starting up a Node == |
| + | See [[Amazon Single Node]] for instructions on starting a node and getting a terminal running. |
| + | * For the demo, we recommend using a <code>c3.2xlarge</code> instance. |
| + | |
| + | == Running the Demo on Already Running Node == |
| + | {{GotCloud: Amazon Demo Setup}} |
| + | |
| + | === Run GotCloud SnpCall === |
| + | Now that we have examined the instance files, run GotCloud snpcall:
| + | # <pre>gotcloud snpcall --conf example/test.conf --outdir output --numjobs 8</pre> |
| + | #* The ubuntu user is set up with the gotcloud program and tools in its path, so you can just type the program name and it will be found.
| + | #: [[File:RunSnpCall.png|700px]] |
| + | #:* This will take a few minutes to run. |
| + | #:* GotCloud first generates a makefile, and then runs the makefile |
| + | #:* After a while GotCloud snpcall will print some messages to the screen. This is expected and ok. |
| + | # When complete, GotCloud snpcall will indicate success/failure |
| + | #:[[File:SnpcallSuccess.png|700px]] |
| + | |
| + | ==== Examining SnpCall Output ==== |
| + | # Run <code>ls</code> in the ubuntu home directory to see the new output directory:
| + | #:<pre>ls</pre> |
| + | #:[[File:SnpCallOutput1.png|500px]] |
| + | # Look inside the output directory: |
| + | #:<pre>ls output</pre> |
| + | #:[[File:SnpCallOutput2.png|700px]] |
| + | #:#<code>glfs</code> - intermediate per sample genotype likelihood files |
| + | #:#<code>jobfiles</code> - empty; was used to store commands as GotCloud was running |
| + | #:#<code>pvcfs</code> - intermediate vcfs with per sample information |
| + | #:#<code>split</code> - contains per chromosome directories with vcfs containing PASS only snps split up as required for beagle (part of ldrefine) |
| + | #:#<code>target</code> - contains the bed with the region to be processed |
| + | #:#<code>umake.snpcall.conf</code> - file containing all of the configuration settings used for this run of GotCloud |
| + | #:#<code>umake.snpcall.Makefile</code> - Makefile containing commands for this run of GotCloud |
| + | #:#<code>umake.snpcall.Makefile.cluster</code> - Makefile log of start/stop times of various steps |
| + | #:#<code>umake.snpcall.Makefile.log</code> - log of the GotCloud run |
| + | #:#<code>vcfs</code> - contains per chromosome directories with vcfs |
| + | #:#* The important output is:
| + | #:#*# filtered.vcf.gz file: <code>vcfs/chr22/chr22.filtered.vcf.gz</code> |
| + | #:#*# summary information: <code>vcfs/chr22/chr22.filtered.sites.vcf.summary</code> |
| + | # Look at the filtered file. |
| + | #: <pre>zless -S output/vcfs/chr22/chr22.filtered.vcf.gz</pre> |
| + | #: Scroll down some: [[File:SnpVcfOutput.png|700px]] |
| + | ## Check that GotCloud found the expected SNP at 22:36661906 |
| + | ##:<pre>tabix output/vcfs/chr22/chr22.filtered.vcf.gz 22:36661906 |head -1</pre> |
| + | #:[[File:SnpTabix.png|700px]] |
| + | # Look at the summary information |
| + | #:<pre>cat output/vcfs/chr22/chr22.filtered.sites.vcf.summary</pre> |
| + | #:[[File:SnpSummary.png|700px]] |
| + | #:*'''To understand how to interpret the filtering summary statistics, please refer to [[Understanding vcf-summary output]]''' |
| + | |
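As a quick cross-check of the filtering summary, the records in the filtered VCF can be tallied by their FILTER value. The following is a minimal sketch, not part of GotCloud: the function name <code>count_filters</code> is hypothetical, and it assumes only the standard VCF layout in which column 7 holds the FILTER field.

```shell
#!/bin/sh
# Hypothetical helper (not a GotCloud tool): tally records per FILTER value.
# Assumes a gzip/bgzip-compressed VCF whose 7th tab-separated column is FILTER.
count_filters() {
    vcf="$1"
    # Decompress to stdout, skip header lines, count each FILTER value,
    # then sort so the output order is stable.
    gzip -cd "$vcf" \
        | awk -F '\t' '!/^#/ { n[$7]++ } END { for (f in n) print f "\t" n[f] }' \
        | sort
}
```

For example, <code>count_filters output/vcfs/chr22/chr22.filtered.vcf.gz</code> prints one line per FILTER value with its record count.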
| + | |
| + | === Run GotCloud Indel === |
| + | Now that we have examined the instance files, run GotCloud indel:
| + | # <pre>gotcloud indel --conf example/test.conf --outdir output --numjobs 8</pre> |
| + | #* The ubuntu user is set up with the gotcloud program and tools in its path, so you can just type the program name and it will be found.
| + | #: [[File:RunIndel.png|700px]] |
| + | #:* This will take a few minutes to run. |
| + | #:* GotCloud first generates a makefile, and then runs the makefile |
| + | # When complete, GotCloud indel will indicate success/failure |
| + | #:[[File:IndelSuccess.png|700px]] |
| + | |
| + | ==== Examining Indel Output ==== |
| + | # Look inside the output directory to see the new indel directories: |
| + | #:<pre>ls output</pre> |
| + | #:[[File:IndelSnpOutput.png|900px]] |
| + | #:#<code>aux</code> - intermediate indel files |
| + | #:#<code>final</code> - final output of Indel pipeline |
| + | #:#<code>indelvcf</code> - intermediate indel files |
| + | #:#<code>gotcloud.indel.conf</code> - file containing all of the configuration settings used for this run of GotCloud |
| + | #:#<code>gotcloud.indel.Makefile</code> - Makefile containing commands for this run of GotCloud |
| + | #:#<code>gotcloud.indel.Makefile.log</code> - log of the GotCloud run |
| + | # Important Indel output is in <code>output/final</code> |
| + | #:<pre> ls output/final/</pre> |
| + | #: [[File:IndelFinal.png|900px]] |
| + | # Look at the final Indel VCF file. |
| + | #: <pre>zless output/final/all.genotypes.vcf.gz</pre> |
| + | #: Scroll down some: [[File:IndelLess.png|900px]]
| + | ## Check that GotCloud found the expected Indel at 22:36662041 |
| + | ##:<pre>tabix output/final/all.genotypes.vcf.gz 22:36662041-36662041|less -S</pre> |
| + | ##:[[File:IndelTabix.png|900px]] |
| + | |
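The "expected variant" checks can also be scripted so that they return a pass/fail exit status. The sketch below is an illustration under stated assumptions rather than a replacement for <code>tabix</code>: the function name <code>has_variant_at</code> is hypothetical, it scans the whole file instead of using the index, and it matches only records whose POS column equals the given position (a <code>tabix</code> region query also returns records that overlap the position).

```shell
#!/bin/sh
# Hypothetical helper (not a GotCloud tool): succeed if the VCF contains a
# record whose CHROM and POS columns equal the given chromosome and position.
# Assumes a gzip/bgzip-compressed, tab-separated VCF.
has_variant_at() {
    vcf="$1"
    chrom="$2"
    pos="$3"
    # awk exits 0 when a matching record was seen and 1 otherwise.
    gzip -cd "$vcf" \
        | awk -F '\t' -v c="$chrom" -v p="$pos" \
            '!/^#/ && $1 == c && $2 == p { found = 1 } END { exit !found }'
}
```

For example, <code>has_variant_at output/final/all.genotypes.vcf.gz 22 36662041 && echo found</code> prints <code>found</code> only if a record starts at that position.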
| + | == Exit & Terminate == |
| + | Before terminating, copy any data you want to keep off the root EBS volume attached to the instance, because that volume is deleted when the instance terminates.
| + | # Exit from your terminal |
| + | # Terminate your Amazon Instance |
| + | #* Right click on the instance you want to terminate in the EC2 dashboard |
| + | #* Select Terminate |
| + | #* Select "Yes, Terminate" to confirm that you want to terminate the instance and delete its storage.
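One way to keep the results before terminating is to bundle the output directory into a single archive on the instance and then download that file. This is a minimal sketch, assuming the results live in the <code>output</code> directory used above; the function name <code>archive_output</code> is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: pack a results directory into a gzip-compressed tarball.
# Paths inside the archive are stored relative to the directory's parent,
# so extracting it recreates e.g. "output/..." in the current directory.
archive_output() {
    outdir="$1"
    archive="$2"
    tar -czf "$archive" -C "$(dirname "$outdir")" "$(basename "$outdir")"
}
```

On the instance, <code>archive_output output output.tar.gz</code> creates the archive in the ubuntu home directory. It can then be fetched from your own machine with <code>scp</code>, using the same key file and instance address you used to open the terminal.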