Back to the beginning: [[GotCloud]]

'''We no longer use a snapshot.''' It is very likely that you will need quite a few packages installed
so that you can compile your software, access the EC2 application data, or access data on S3.
It just seemed foolish not to make this software available in an AMI.

GotCloud is made available in various forms.
It is distributed as conventional packages for Ubuntu and as compressed TAR files for other distributions.
In addition, the source is available from github.
In Amazon Web Services the software is made available as an Amazon Machine Image (AMI).

The GotCloud software itself only requires a few packages to be installed for Ubuntu installations
(java-common default-jre make libssl0.9.8).
However, there are a number of things you may well want to do in getting your data
ready for processing (access data on S3 storage, compile GotCloud or other software, or
access the EC2 application data).
Assuming this is the case, the GotCloud AMI has these packages installed on Ubuntu.
If you need to run on some other distribution, you may need to install the equivalent packages yourself.

<code>
sudo apt-get install java-common default-jre make libssl0.9.8
sudo apt-get install libnet-amazon-ec2-perl
sudo apt-get install make g++ libcurl4-openssl-dev libssl-dev libxml2-dev libfuse-dev
</code>

You will almost certainly need to fetch and install your own reference files - regardless
of the details of the system you are using.
Finally, you'll need access to your FASTQ files - either copied to the Amazon instance
or perhaps accessible from S3 storage.
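If your FASTQ files are in S3 storage, one way to copy them to the instance is with the AWS command line tools. This is only a sketch under the assumption that the AWS CLI is installed and configured; the bucket name and paths below are placeholders, and your vendor may direct you to a different tool entirely.

```shell
# Copy a directory of FASTQ files from S3 to local storage on the instance.
# Bucket name and paths are placeholders - substitute your own.
aws s3 cp s3://my-fastq-bucket/run1/ /data/fastq/run1/ --recursive
```

For large runs this copy can take hours, so plan accordingly.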

If the GotCloud AMI is unacceptable for some reason, you may install the
software and reference files wherever you'd like
(read about this in [[Pipeline_Debian_Package|Installing from a Debian package]]).

Your first task is to get an AWS account and keys so that you can use the AWS EC2 Console Dashboard
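Once you have your keys, command line tools generally pick them up from environment variables. A minimal sketch (the key values shown are placeholders, not real credentials):

```shell
# Make your AWS keys available to command line tools in this session.
# Both values are placeholders - substitute the keys from your AWS account.
export AWS_ACCESS_KEY_ID=AKIAEXAMPLEKEYID
export AWS_SECRET_ACCESS_KEY=wJalrExampleSecretKey
```

You may want to put these in your shell startup file so they are set on every login.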
'''Your First Instance'''

You'll need to know some details when launching an instance:

* '''Launch an Instance''' - use the GotCloud AMI, which runs 64 bit software.

* '''Instance size''' (memory and number of processors). The pipeline software will require at least 4GB of memory (''type m1.medium'') and can use as many processors as are available.

* '''Storage''' for the instance refers to the size of the root (/) partition. This can be quite small; as little as 8GB can work. Of course if you intend to bring other files/programs to the instance, you may need to increase this to something a bit larger (e.g. 30GB).

* '''Data Storage''' for the aligner or SNP caller (see below)
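If you prefer the command line to the Console Dashboard, the launch described above can be sketched with the AWS CLI. This is an assumption about your tooling, and the image id and key pair name are placeholders:

```shell
# Launch one m1.medium instance from the GotCloud AMI.
# The image id and key pair name are placeholders - substitute your own.
aws ec2 run-instances --image-id ami-12345678 --count 1 \
    --instance-type m1.medium --key-name my-keypair
```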


'''Prepare Your Instance'''

You will also want additional storage volumes for:

* '''Local Storage''' for the instance refers to the size of the root (/) partition. This can be quite small; as little as 8GB can work. Of course if you intend to bring other files/programs to the instance, you may need to increase this to something a bit larger (e.g. 30GB).

* '''Data Storage''' for the aligner or SNP caller will likely be far larger than the system you are creating. You'll need to create EBS Volumes for the input and output of the aligner and SNP caller.

'''Prepare Your Storage'''

These volumes can be quite substantial and because of that we recommend you create separate volumes like this:

* Your '''input FASTQ''' files for the aligner. This may have been done for you by some vendor when they put your FASTQ data on an S3 volume. If so, your vendor will need to provide you with the details of how to access your FASTQ files. If your FASTQ files are not in S3 storage, you'll have to create a volume for this and copy your data into it. This can take a very long time.

* The '''output of the aligner''' (BAM files)

* The '''intermediate files of the SNP caller''' (GLF files)

* The '''final output of the SNP caller''' (VCF files)
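Creating one of the volumes above can be sketched as follows, assuming the AWS CLI is available. The volume size, availability zone, ids, and device names are placeholders; note that on the instance the attached device may appear under a different name (e.g. /dev/xvdf rather than /dev/sdf).

```shell
# Create a 200GB EBS volume in the same availability zone as your instance.
# Size, zone, and all ids below are placeholders - substitute your own.
aws ec2 create-volume --size 200 --availability-zone us-east-1a
aws ec2 attach-volume --volume-id vol-12345678 --instance-id i-12345678 --device /dev/sdf

# Then, logged in to the instance: make a filesystem and mount it.
sudo mkfs -t ext4 /dev/xvdf
sudo mkdir -p /data/bam
sudo mount /dev/xvdf /data/bam
```

Repeat for each of the volumes listed above (FASTQ input, BAM output, GLF intermediates, VCF output), sizing each to your data.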