Amazon Storage

Back to parent [http://genome.sph.umich.edu/wiki/Pipelines]

Setting up your storage is perhaps the most difficult step as it is controlled completely by the size of your data. As a general rule you will need three times the space required for your sequence data. For instance, in the 1000 Genomes data, the data for one individual takes about 45GB. If you have 1000 Genomes data for nine individuals, you'll need about 1500GB of space (9 x 45GB x 3 is about 1215GB, plus a little extra space).

Making your data available to the Pipeline can be accomplished in many ways. Here is a simple, straightforward organization you might want to use.

  • Using the AWS EC2 Console Dashboard, create one EBS volume (ELASTIC BLOCK STORE -> Volumes) for the sequence data (e.g. 500GB).
  • Using the Dashboard, create another EBS volume for the output of the aligner step (e.g. another 500GB).
  • Using the Dashboard, create another EBS volume for the output of the umake step (e.g. another 500GB). (If you prefer the command line, see the sketch after this list.)
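If you would rather script this step, a minimal sketch with the AWS CLI might look like the following; the availability zone (us-east-1a here) is an assumption and must match the zone where you will launch your instance:

 # Sketch: create the three 500GB volumes (the zone below is an assumption -- use your own)
 aws ec2 create-volume --size 500 --availability-zone us-east-1a   # sequence data
 aws ec2 create-volume --size 500 --availability-zone us-east-1a   # aligner output
 aws ec2 create-volume --size 500 --availability-zone us-east-1a   # umake output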

Configure these EBS volumes so they use separate devices f, g and h (e.g. /dev/sdf (probably /dev/xvdf), /dev/sdg (probably /dev/xvdg) and /dev/sdh (probably /dev/xvdh)).
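On the command line, attaching the volumes to these devices might look like this sketch; the volume IDs (vol-...) and the instance ID (i-...) are placeholders for the IDs shown in your own Dashboard:

 # Sketch: placeholder volume/instance IDs -- substitute your own
 aws ec2 attach-volume --volume-id vol-11111111 --instance-id i-00000000 --device /dev/sdf   # sequence data
 aws ec2 attach-volume --volume-id vol-22222222 --instance-id i-00000000 --device /dev/sdg   # aligner output
 aws ec2 attach-volume --volume-id vol-33333333 --instance-id i-00000000 --device /dev/sdh   # umake output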

Launch your instance and log in as explained in the AWS documentation. The first time, you will need to prepare the disks (create a filesystem on each new volume and mount it).
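As a sketch, preparing and mounting the first disk might look like this (run as root); the ext4 filesystem and the /mnt/seq mount point are assumptions, not requirements of the Pipeline:

 mkfs.ext4 /dev/xvdf        # create a filesystem on the new volume (erases anything on it)
 mkdir -p /mnt/seq          # hypothetical mount point for the sequence data
 mount /dev/xvdf /mnt/seq   # mount it; repeat for /dev/xvdg and /dev/xvdh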