Changes

From Genome Analysis Wiki
1,403 bytes added, 16:07, 12 November 2012
no edit summary
Line 27: Line 27:     
* '''Instance size''' (memory and number of processors). The pipeline software requires at least 4GB of memory (type m1.medium) and can use as many processors as are available.
 
 +
 +
* '''GotCloud Volume''' (copy of the GotCloud snapshot). We provide an AWS snapshot of a small volume
 +
which contains the aligner and umake software and reference files.
 +
Your task is to create an EBS volume based on our snapshot and then mount that volume
 +
on your instance  (see below for more precise details).
    
* '''Storage''' for the instance refers to the size of the root (/) partition. This can be quite small; as little as 8GB should work. If you intend to bring many other files or programs to the instance, you may want to increase this to something a bit larger (e.g. 30GB).
 
 +
 +
 +
'''Prepare Your Instance'''
 +
 +
You will also want additional storage volumes for:
 +
 +
* GotCloud software and reference files
 +
* Your sequence data
 +
* Output of the aligner
 +
* Output of umake
 +
 +
The first of these is a small volume based on a snapshot containing the GotCloud files you will need.
 +
We provide an AWS snapshot of a small volume which contains the aligner and umake software and reference files.
 +
Create an EBS volume based on our snapshot and then mount that volume on your instance.
 +
 +
 
 +
 +
 +
 +
Expect your sequence data, the aligner output, and the umake output to each require about the same amount of storage. That is, if your sequence data is 300GB, then you'll need an additional 300GB for the aligner output and another 300GB for the umake output.
 +
We suggest you consider putting each of these on its own separate storage volume.
 +
You may also find that your sequence data is too large to be easily handled in one go,
 +
so you might choose to only use the aligner/umake on part of your sequence data, capture the files
 +
of interest from umake, and then go back and rerun the software with the next bit of sequence data.
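
The sizing rule above can be sketched as a small calculation. This is only an illustration, not part of GotCloud: the function name <code>storage_plan</code> and its parameters are hypothetical, and it assumes that the aligner output and umake output each match the size of the sequence data being processed, and that the volumes are emptied between batches once the files of interest have been captured.

```python
import math


def storage_plan(sequence_gb, batch_gb=None):
    """Estimate volume sizes (in GB) for sequence data, aligner output and umake output.

    sequence_gb: total size of the sequence data.
    batch_gb: optional size of the portion processed per run (hypothetical
              parameter); if omitted, all data is processed in one go.
    """
    if sequence_gb <= 0:
        raise ValueError("sequence_gb must be positive")
    if batch_gb is not None and batch_gb <= 0:
        raise ValueError("batch_gb must be positive")

    # Process everything at once unless a smaller batch size was requested.
    per_run = sequence_gb if batch_gb is None else min(batch_gb, sequence_gb)

    return {
        "runs": math.ceil(sequence_gb / per_run),
        # Assumption from the text: each output is as large as its input.
        "sequence_volume_gb": per_run,
        "aligner_volume_gb": per_run,
        "umake_volume_gb": per_run,
        "total_gb": 3 * per_run,
    }


if __name__ == "__main__":
    print(storage_plan(300))        # everything in one run
    print(storage_plan(300, 100))   # three smaller runs
```

With 300GB of sequence data processed in one run, the sketch reports three 300GB volumes. Processing 100GB at a time reduces each volume to 100GB at the cost of three runs.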
 +
     