Difference between revisions of "Amazon Snapshot"

From Genome Analysis Wiki
Revision as of 08:38, 29 October 2012

Back to parent [http://genome.sph.umich.edu/wiki/Pipelines]

You may run the pipeline software on a single instance we have created for you in AWS. You may also create your own AWS instance and run it there. You may also, of course, install and run the software on your own hardware.

Your first task is to get an AWS account and keys so that you can use the AWS EC2 Console Dashboard (see https://console.aws.amazon.com/ec2/). From there you can launch instances prepared by others or create your own. We cannot assist with this step; Amazon provides extensive documentation. Once you are at the AWS EC2 Console Dashboard, you are ready to run the pipeline.


Launch Your First Instance

You'll need to know some details when launching an instance:

  • Which instance to launch. You have several choices:
    • ami-be59d78e, an instance we have prepared based on Ubuntu Server 12.04.1 LTS. It has all of our software installed.
    • Some other instance. It must run 64-bit software and be either Ubuntu (any version) or Red Hat/CentOS 6.3. You will also need to install the Pipeline software.
  • Instance size (memory and number of processors). The pipeline software requires at least 8GB of memory (type m1.large) and can use as many processors as are available.
  • Storage for the instance, which refers to the size of the root (/) partition. This can be quite small; as little as 8GB should work. If you intend to bring many other files or programs to the instance, you may want to increase this (e.g. to 30GB).
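
Once an instance is running, a quick look at what it actually provides can be done with standard tools. The following is only a sketch: the helper names is_64bit and kb_to_gb are hypothetical and not part of the Pipeline software, and treating "64-bit" as the x86_64 machine name is an assumption. Note that the memory reported by the kernel is usually a little below the nominal size of the instance type.

```shell
#!/bin/sh
# Hypothetical preflight helpers; not part of the Pipeline software.

# Succeeds if the given machine name (as printed by `uname -m`) is x86_64.
# Assumption: "64-bit" here means 64-bit x86.
is_64bit() {
    [ "$1" = "x86_64" ]
}

# Converts kB (the unit used in /proc/meminfo) to whole GB, rounding down.
kb_to_gb() {
    echo $(( $1 / 1024 / 1024 ))
}

# Print a short report for the current machine (Linux only).
if [ -r /proc/meminfo ]; then
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    echo "architecture: $(uname -m)"
    echo "processors:   $(getconf _NPROCESSORS_ONLN)"
    echo "memory (GB):  $(kb_to_gb "$mem_kb")"
    is_64bit "$(uname -m)" || echo "warning: not a 64-bit x86 system"
fi
```

The report only prints values; deciding whether they are sufficient for your data is left to you.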


Prepare Your Instance

If you launched an instance other than the one prepared for our software, you will need to install the Pipeline software. This is quite simple (see debian package or red hat package) and should take only about 15 minutes.

Setting up your storage is perhaps the most difficult step, as it depends entirely on the size of your data. As a general rule you will need three times the space required for your sequence data. For instance, in the 1000 Genomes data, the data for one individual takes about 45GB. If you have 1000 Genomes data for nine individuals, you'll need about 1500GB of space (9 x 45 x 3 = 1215GB, plus some extra space).
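
The rule of thumb (three times the size of the sequence data) can be sketched as a small shell calculation. The function name required_gb is hypothetical, and the 45GB per individual is the approximate 1000 Genomes figure quoted above; substitute the size of your own data.

```shell
#!/bin/sh
# Rough storage estimate: individuals x GB per individual x 3.
# required_gb is a hypothetical helper, not part of the Pipeline software.
required_gb() {
    individuals=$1
    gb_each=$2
    echo $(( individuals * gb_each * 3 ))
}

# Nine individuals at about 45GB each.
required_gb 9 45    # prints 1215
```

Round the result up to leave some headroom, as in the roughly 1500GB suggested for nine individuals.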

Making your data available to the Pipeline can be accomplished in many ways. Here is a simple, straightforward organization you might want to use.





Testing the Installation

We recommend that, at least the first time, you install the test packages so you can conveniently test the installation and make sure everything runs smoothly. The tests run within a few minutes and are self-checking, so unless you see obvious errors, you can be reasonably sure everything is set up properly. You only need to do this once, unless you have made significant changes to your Unix system.

sudo dpkg -i debs/biopipe-test*_amd64.deb
Unpacking biopipe-testalign (from .../biopipe-testalign_M.n_amd64.deb) ...
Selecting previously deselected package biopipe-testumake.
Unpacking biopipe-testumake (from .../biopipe-testumake_M.n_amd64.deb) ...
Setting up biopipe-testalign (M.n) ...
To test the pipeline, run:

  /usr/local/biopipe/bin/gen_biopipeline.pl --test ~/testalign

This will remove the contents of ~/testalign and then run
the aligner test case. The output is verified so you know if
anything failed or not.

Setting up biopipe-testumake (M.n) ...
To test umake, run:

  /usr/local/biopipe/bin/umake.pl --test ~/testumake

This will remove the contents of ~/testumake and then run
the umake test case. The output is verified so you know if
anything failed or not.

Log in as a normal user (not as root) and run:

#   Test the aligner (fast, about 3 minutes)
/usr/local/biopipe/bin/gen_biopipeline.pl --test ~/testalign
rm -rf ~/testalign              # If no error

#   Test umake  (longer, about 15 minutes)
/usr/local/biopipe/bin/umake.pl --test ~/testumake
rm -rf ~/testumake              # If no error
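
The "remove only if no error" step can also be scripted. The wrapper below is a sketch, not part of the Pipeline software: run_test_and_clean is a hypothetical name, and it assumes the test scripts report failure through a non-zero exit status, which this page does not state. If they only print errors, check the output by hand instead.

```shell
#!/bin/sh
# run_test_and_clean DIR CMD [ARGS...]
# Hypothetical wrapper: runs CMD and removes DIR only if CMD exits with status 0.
# Assumption: a failed test produces a non-zero exit status.
run_test_and_clean() {
    dir=$1
    shift
    if "$@"; then
        rm -rf "$dir"
        echo "passed, removed $dir"
    else
        echo "failed, keeping $dir for inspection" >&2
        return 1
    fi
}

# Intended usage (commented out so the sketch has no side effects):
# run_test_and_clean ~/testalign /usr/local/biopipe/bin/gen_biopipeline.pl --test ~/testalign
# run_test_and_clean ~/testumake /usr/local/biopipe/bin/umake.pl --test ~/testumake
```

Keeping the directory on failure leaves the test output available for diagnosis.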