Difference between revisions of "StarCluster"
Revision as of 17:18, 30 October 2014
This page is in the process of being updated...(10/29/14)
If you have access to your own cluster, your task will be much simpler.
Install the GotCloud software (GotCloud: Source Releases)
and run it as described on the same pages.
For those who do not have access to a cluster, Amazon Web Services (AWS) provides an alternative. You may run the GotCloud software on a cluster created in AWS. One tool that simplifies creating a cluster of instances from AMIs (Amazon Machine Images) is StarCluster (see http://star.mit.edu/cluster/).
The following shows an example of how you might use StarCluster to create an AWS cluster and set it up to run GotCloud. Setting up StarCluster involves many details, and this page does not try to explain every variation you might choose, but it should give you a working example.
Tasks to be completed
- Install the ec2 tools package (ec2-api-tools for Ubuntu) on your machine (optional)
- Install and configure starcluster on your machine (required)
- Note: GotCloud requires a 64-bit machine
- Please use
NODE_IMAGE_ID = ami-765b3e1f
- Create an EBS volume based on the GotCloud snapshot
- Configure StarCluster to use the volume just created
- Create an AWS cluster
- Create storage for your sequence data and make it available for the software
- Run the GotCloud software
Getting Started With StarCluster
StarCluster provides lots of documentation.
To install and set up StarCluster for the first time, you can follow the QuickStart instructions: http://star.mit.edu/cluster/docs/latest/quickstart.html
- Includes installation instructions
- See http://star.mit.edu/cluster/docs/latest/installation.html for more detailed StarCluster installation instructions if the QuickStart instructions are not enough (especially if running on Windows)
- Includes setting up a basic StarCluster configuration file
- You will need your AWS Credentials to set up the configuration file
- If you need help setting up your AWS credentials, see: AWS Credentials
You can skip the cluster start section if you want.
Troubleshooting: When I tried this, the starcluster start mycluster step failed with an error similar to:
- http://star.mit.edu/cluster/mlarchives/2425.html
- So I followed the suggestions there and at https://github.com/jtriley/StarCluster/issues/455:
$ sudo pip uninstall boto
$ sudo easy_install boto==2.32.0
- I was having trouble with pip install, but found that easy_install worked
- I had to force terminate mycluster after a failed start:
starcluster terminate -f mycluster
- Then I was able to successfully start my cluster
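To confirm which boto version is actually active after the downgrade, a small Python check along these lines may help. This is only a sketch: the helper names `boto_matches` and `installed_boto_version` are made up for illustration, and 2.32.0 is simply the version pinned in the workaround above.

```python
# Sketch: check whether the installed boto matches the version pinned above.
PINNED_BOTO = "2.32.0"  # version used in the workaround above


def boto_matches(installed, pinned=PINNED_BOTO):
    """Return True when the installed version string equals the pinned one."""
    return installed is not None and installed.strip() == pinned


def installed_boto_version():
    """Return boto's version string, or None if boto cannot be imported."""
    try:
        import boto
    except ImportError:
        return None
    return getattr(boto, "__version__", None)
```

Calling `boto_matches(installed_boto_version())` in the same Python environment that StarCluster uses returns False if boto is missing or if a different version is still installed.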
Don't forget to terminate your cluster:
starcluster terminate mycluster
StarCluster and GotCloud
By default, StarCluster expects a configuration file in ~/.starcluster/config.
- StarCluster will create a model file for you
Ensure your StarCluster configuration file is set for your usage.
- General AWS Settings:
[aws info]
aws_access_key_id = #your aws access key id here
aws_secret_access_key = #your secret aws access key here
aws_user_id = #your 12-digit aws user id here
- You should have set these in Getting Started With StarCluster above (quickstart guide and AWS Credentials).
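One way to catch placeholders that were never filled in is to parse the file and look for empty values in the [aws info] section. The sketch below uses Python's standard configparser rather than StarCluster's own loader, so it is only an approximation: the function name `missing_aws_keys` is hypothetical, and treating `#` as an inline comment marker is an assumption of this sketch.

```python
import configparser

# The three settings listed in the [aws info] section above.
REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key", "aws_user_id")


def missing_aws_keys(config_text):
    """Return the [aws info] keys that are absent or left empty."""
    parser = configparser.ConfigParser(
        inline_comment_prefixes=("#",),  # assumption: '#' starts a comment
        interpolation=None,              # secret keys may contain '%'
    )
    parser.read_string(config_text)
    if not parser.has_section("aws info"):
        return list(REQUIRED_KEYS)
    return [
        key for key in REQUIRED_KEYS
        if not parser.get("aws info", key, fallback="").strip()
    ]
```

To check a real file, read the contents of ~/.starcluster/config and pass the text to `missing_aws_keys`; an empty list means all three values are present. A line that still reads `aws_user_id = #your 12-digit aws user id here` is reported as missing, because everything after the `#` is treated as a comment.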
GotCloud settings:
- You may want to create a new cluster description for running GotCloud (or you can use smallcluster)
- Use the GotCloud AMIs:
MASTER_IMAGE_ID = ami-6ae65e02
NODE_IMAGE_ID = ami-3393a45a
- We do not recommend running GotCloud on machines with less than 4GB of memory
Run GotCloud Demo Using StarCluster
Be sure to set:
MASTER_IMAGE_ID = ami-6ae65e02
NODE_IMAGE_ID = ami-3393a45a
- Create a new cluster section in your configuration file: ~/.starcluster/config
- Add the following to the end of the configuration file:
[cluster gccluster]
KEYNAME = mykey
CLUSTER_SIZE = 4
CLUSTER_USER = sgeadmin
CLUSTER_SHELL = bash
MASTER_IMAGE_ID = ami-6ae65e02
NODE_IMAGE_ID = ami-3393a45a
NODE_INSTANCE_TYPE = m3.large
- Start the cluster:
starcluster start -c gccluster mycluster
- Alternatively, you can change the default template in the [global] section at the start of the configuration file to gccluster: DEFAULT_TEMPLATE=gccluster
- Log on to the cluster as ubuntu:
starcluster sshmaster -u ubuntu mycluster
- Run GotCloud snpcall
gotcloud snpcall --conf example/test.conf --outdir output --numjobs 8 --batchtype sgei
- Run GotCloud indel
gotcloud indel --conf example/test.conf --outdir output --numjobs 8 --batchtype sgei
- Terminate the cluster
starcluster terminate mycluster
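Because a forgotten cluster keeps running on AWS, it can help to wrap the local StarCluster steps so that termination always happens. The following Python sketch shows one way to do that. The function names `demo_commands` and `run_demo` are hypothetical, the commands are the ones listed in the steps above, and force-terminating after a failed start is the workaround from the troubleshooting notes rather than documented StarCluster behavior.

```python
import subprocess


def demo_commands(cluster="mycluster", template="gccluster"):
    """Return the local StarCluster commands for the demo, in order."""
    start = ["starcluster", "start", "-c", template, cluster]
    login = ["starcluster", "sshmaster", "-u", "ubuntu", cluster]
    terminate = ["starcluster", "terminate", cluster]
    return start, login, terminate


def run_demo(cluster="mycluster", template="gccluster", runner=subprocess.call):
    """Start the cluster, open a session on the master, always terminate."""
    start, login, terminate = demo_commands(cluster, template)
    if runner(start) != 0:
        # A failed start can leave a half-built cluster behind.
        runner(["starcluster", "terminate", "-f", cluster])
        return False
    try:
        runner(login)  # interactive session: run the gotcloud steps here
    finally:
        runner(terminate)  # runs even if the session ends with an error
    return True
```

The `runner` parameter exists so the sequence can be tried with a stand-in function before any real instances are launched. The gotcloud commands themselves are typed inside the sshmaster session on the master node, not on your local machine.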
Old Instructions
StarCluster Configuration Example
StarCluster creates a model configuration file in ~/.starcluster/config and you are instructed to edit this and set the correct values for the variables. Here is a highly simplified example of a config file that should work. Please note there are many things you might want to choose, so craft the config file with care. You'll need to specify nodes with 4GB of memory (type m1.medium) and make sure each node has access to the input and output data for the step being run.
####################################
## StarCluster Configuration File ##
####################################
[global]
DEFAULT_TEMPLATE=myexample
#############################################
## AWS Credentials Settings
#############################################
[aws info]
AWS_ACCESS_KEY_ID = AKImyexample8FHJJF2Q
AWS_SECRET_ACCESS_KEY = fthis_was_my_example_secretMqkMIkJjFCIGf
AWS_USER_ID=199998888709
AWS_REGION_NAME = us-east-1 # Choose your own region
AWS_REGION_HOST = ec2.us-east-1.amazonaws.com
AWS_S3_HOST = s3-us-east-1.amazonaws.com
###########################
## EC2 Keypairs
###########################
[key east1_starcluster]
KEY_LOCATION = ~/.ssh/AWS/east1_starcluster_key.rsa # Same region
###########################################
## Define Cluster
## starcluster start -c myexample nameichose4cluster
###########################################
[cluster myexample] # Name of this cluster definition
KEYNAME = east1_starcluster # Name of keys I need
CLUSTER_SIZE = 4 # Number of nodes
CLUSTER_SHELL = bash
# Choose the base AMI using starcluster listpublic
# (http://star.mit.edu/cluster/docs/0.93.3/faq.html)
NODE_IMAGE_ID = ami-765b3e1f
AVAILABILITY_ZONE = us-east-1 # Region again!
NODE_INSTANCE_TYPE = m1.medium # 4G memory is the minimum for GotCloud
VOLUMES = gotcloud, mydata
[volume mydata]
VOLUME_ID = vol-6e729657
MOUNT_PATH = /mydata
[volume gotcloud]
VOLUME_ID = vol-56071570
MOUNT_PATH = /gotcloud
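A common mistake in a file like this is listing a volume under VOLUMES without defining a matching [volume ...] section. The Python sketch below checks for that using the standard configparser module, not StarCluster's own parser. The function name `undefined_volumes` is hypothetical, and stripping `#` inline comments is an assumption of this sketch.

```python
import configparser


def undefined_volumes(config_text, cluster):
    """Return volume names listed for a cluster that have no [volume] section."""
    parser = configparser.ConfigParser(
        inline_comment_prefixes=("#",),  # assumption: '#' starts a comment
        interpolation=None,
    )
    parser.read_string(config_text)
    listed = parser.get("cluster " + cluster, "volumes", fallback="")
    names = [name.strip() for name in listed.split(",") if name.strip()]
    return [name for name in names if not parser.has_section("volume " + name)]
```

For the example file above, `undefined_volumes(text, "myexample")` should return an empty list, since both gotcloud and mydata have their own sections.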
Create Your Cluster
starcluster start -c myexample myseq-example
StarCluster - (http://web.mit.edu/starcluster) (v. 0.93.3)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu
>>> Validating cluster template settings...
>>> Cluster template settings are valid
>>> Starting cluster...
[lines deleted]
>>> Mounting EBS volume vol-32273514 on /gotcloud...
>>> Mounting EBS volume vol-36788522 on /mydata...
[lines deleted]
When this completes, you are ready to run the GotCloud software on your data. Make sure you have defined and mounted volumes for your sequence data and the output steps of the aligner and umake. These volumes (as well as /gotcloud) should be available on each node.
starcluster sshmaster myseq-example
StarCluster - (http://web.mit.edu/starcluster) (v. 0.93.3)
Software Tools for Academics and Researchers (STAR)
[lines deleted]
df -h
ssh node001 df -h
If your data is visible on each node, you're ready to run the software as described in GotCloud.
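If you would rather not read the df -h output by eye on every node, the check can be scripted. This Python sketch takes the text printed by df -h and reports which expected mount points are absent. The function name `missing_mounts` is hypothetical, the default paths are the MOUNT_PATH values from the example configuration, and the parsing assumes mount points contain no spaces.

```python
def missing_mounts(df_output, required=("/gotcloud", "/mydata")):
    """Return the required mount points that do not appear in df output."""
    mounted = set()
    for line in df_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields:
            mounted.add(fields[-1])  # mount point is the last column
    return [path for path in required if path not in mounted]
```

Capture the output of df -h on the master and of ssh node001 df -h for each node, then pass each one to `missing_mounts`; an empty list means both volumes are visible there.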
Running GotCloud on StarCluster
To tell GotCloud to run data on the StarCluster you have set up, specify the following on your gotcloud command line:
--batchtype sgei
Alternatively, you can set the following in your configuration file:
BATCH_TYPE = sgei
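When scripting several GotCloud runs, building the argument list in one place keeps the batch type consistent. A minimal Python sketch, assuming the same options used in the demo commands earlier on this page (--conf, --outdir, --numjobs, --batchtype); the function name `gotcloud_command` is hypothetical.

```python
def gotcloud_command(pipeline, conf, outdir, numjobs, batchtype="sgei"):
    """Build a gotcloud argument list that submits jobs through StarCluster."""
    return [
        "gotcloud", pipeline,
        "--conf", conf,
        "--outdir", outdir,
        "--numjobs", str(numjobs),
        "--batchtype", batchtype,  # sgei is the batch type named above
    ]
```

For example, `gotcloud_command("snpcall", "example/test.conf", "output", 8)` reproduces the snpcall command from the demo as a list suitable for `subprocess.call` on the master node.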