Difference between revisions of "StarCluster"

From Genome Analysis Wiki

Revision as of 12:56, 29 October 2012

Back to the beginning [1]

If you have access to your own cluster, your task will be much simpler. Install the Pipeline software (links at [2]) and run it as described on the same pages.

For those without access to a cluster, AWS provides an alternative: you can run the Pipeline software on a cluster created in AWS. One tool that simplifies creating a cluster of AWS instances from an AMI (Amazon Machine Image) is StarCluster (see http://star.mit.edu/cluster/).

The following example shows how you might use starcluster to create an AWS cluster and set it up to run the Pipeline.

We will use starcluster to launch a set of AWS instances. Setting up starcluster involves many details, and this page does not try to explain every variation you might choose, but it should give you a working example.

The tasks to be completed are:

  • Install and configure starcluster on a machine you use
  • Create an AWS cluster
  • Install the Pipeline software on the master node
  • Create storage for your sequence data and make it available for the software
  • Run the Pipeline software

Installing and configuring starcluster on your machine is described at http://star.mit.edu/cluster/. Only the second step will be covered here, as the others are described at [3].


StarCluster Configuration Example

StarCluster creates a template configuration file at ~/.starcluster/config and instructs you to edit it and set the correct values for the variables. Here is a highly simplified example of a config file that should work. There are many settings you might want to change, so craft your starcluster config file with care.

####################################
## StarCluster Configuration File ##
####################################
[global]
DEFAULT_TEMPLATE=myexample

#############################################
## AWS Credentials Settings
#############################################
[aws info]
AWS_ACCESS_KEY_ID = AKImyexample8FHJJF2Q
AWS_SECRET_ACCESS_KEY = fthis_was_my_example_secretMqkMIkJjFCIGf
AWS_USER_ID=199998888709 

AWS_REGION_NAME = us-west-2                 # Choose your own region
AWS_REGION_HOST = ec2.us-west-2.amazonaws.com
AWS_S3_HOST = s3-us-west-2.amazonaws.com

###########################
## EC2 Keypairs
###########################
[key west2_starcluster]
KEY_LOCATION = ~/.ssh/AWS/west2_starcluster_key.rsa   # Same region

###########################################
## Define Cluster
##   starcluster start -c myexample  nameichose4cluster
###########################################
[cluster myexample]
KEYNAME = west2_starcluster                 # Name I chose
CLUSTER_SIZE = 3                            # Number of nodes (the run below used 3)
CLUSTER_SHELL = bash

#  Choose the base AMI:  starcluster listpublic
#   (http://star.mit.edu/cluster/docs/0.93.3/faq.html)
NODE_IMAGE_ID = ami-c6bd30f6
AVAILABILITY_ZONE = us-west-2a              # Region again!
NODE_INSTANCE_TYPE = m1.medium              # 4G memory should work for Pipeline
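
The region shows up in several settings above (AWS_REGION_NAME, the key pair and AVAILABILITY_ZONE), so a quick consistency check before launching may catch a mismatch. The following is a minimal sketch and not part of StarCluster: the function names cfg_value, zone_matches_region and check_starcluster_config are hypothetical, and it assumes a zone name is the region name followed by one lowercase letter, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: print the first value of a KEY from an INI-style
# file, dropping any trailing "# comment" and surrounding whitespace.
cfg_value() {
    # $1 = file, $2 = key name
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" |
        sed 's/[[:space:]]*#.*$//; s/[[:space:]]*$//' | head -n 1
}

# Succeeds when the zone is the region plus one lowercase letter
# (an assumption based on the us-west-2 / us-west-2a example).
zone_matches_region() {
    # $1 = region (e.g. us-west-2), $2 = zone (e.g. us-west-2a)
    [ -n "$1" ] || return 1
    case "$2" in
        "$1"[a-z]) return 0 ;;
        *) return 1 ;;
    esac
}

# Check one config file; prints a message and returns non-zero on mismatch.
check_starcluster_config() {
    cfg=${1:-$HOME/.starcluster/config}
    region=$(cfg_value "$cfg" AWS_REGION_NAME)
    zone=$(cfg_value "$cfg" AVAILABILITY_ZONE)
    if zone_matches_region "$region" "$zone"; then
        echo "ok: zone $zone is in region $region"
    else
        echo "mismatch: zone '$zone' is not in region '$region'" >&2
        return 1
    fi
}
```

After sourcing these functions, running check_starcluster_config with no argument reads ~/.starcluster/config; pass a path to check a different file. It only compares the two settings and does not contact AWS.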


Create Your Cluster

 starcluster start -c myexample myseq-example
 StarCluster - (http://web.mit.edu/starcluster) (v. 0.93.3)
 Software Tools for Academics and Researchers (STAR)
 Please submit bug reports to starcluster@mit.edu

 >>> Validating cluster template settings...
 >>> Cluster template settings are valid
 >>> Starting cluster...
 >>> Launching a 3-node cluster...
 >>> Creating security group @sc-myseq-example...
 Reservation:r-c3b4f6f0
 >>> Waiting for cluster to come up... (updating every 30s)
 >>> Waiting for all nodes to be in a 'running' state...
 3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Waiting for SSH to come up on all nodes...
 3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Waiting for cluster to come up took 1.282 mins
 >>> The master node is ec2-50-112-230-67.us-west-2.compute.amazonaws.com
 >>> Setting up the cluster...
 >>> Configuring hostnames...
 3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Creating cluster user: None (uid: 1001, gid: 1001)
 3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Configuring scratch space for user(s): sgeadmin
 3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Configuring /etc/hosts on each node
 3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Starting NFS server on master
 >>> Configuring NFS exports path(s):
 /home
 >>> Mounting all NFS export path(s) on 2 worker node(s)
 2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Setting up NFS took 0.096 mins
 >>> Configuring passwordless ssh for root
 >>> Configuring passwordless ssh for sgeadmin
 >>> Shutting down threads...
 20/20 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Configuring SGE...
 >>> Configuring NFS exports path(s):
 /opt/sge6
 >>> Mounting all NFS export path(s) on 2 worker node(s)
 2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Setting up NFS took 0.048 mins
 >>> Installing Sun Grid Engine...
 2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Creating SGE parallel environment 'orte'
 3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Adding parallel environment 'orte' to queue 'all.q'
 >>> Shutting down threads...
 20/20 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Running plugin createusers
 >>> Creating 2 cluster users
 3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Configuring passwordless ssh for 2 cluster users
 2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Configuring scratch space for user(s): mktrost, tpg
 3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
 >>> Tarring all SSH keys for cluster users...
 >>> Copying cluster users SSH keys to: /tmp/myseq-example-us-west-2.tar.gz
 /tmp/myseq-example-us-west-2.tar.gz 100% |||||||||||| Time: 00:00:00   0.00 B/s
 >>> Configuring cluster took 1.583 mins
 >>> Starting cluster took 2.895 mins
 The cluster is now ready to use. To login to the master node
 as root, run:
     $ starcluster sshmaster myseq-example
 If you're having issues with the cluster you can reboot the
 instances and completely reconfigure the cluster from
 scratch using:
     $ starcluster restart myseq-example
 When you're finished using the cluster and wish to terminate
 it and stop paying for service:
     $ starcluster terminate myseq-example
 Alternatively, if the cluster uses EBS instances, you can
 use the 'stop' command to shutdown all nodes and put them
 into a 'stopped' state preserving the EBS volumes backing
 the nodes:
     $ starcluster stop myseq-example
 WARNING: Any data stored in ephemeral storage (usually /mnt)
 will be lost!
 You can activate a 'stopped' cluster by passing the -x
 option to the 'start' command:
     $ starcluster start -x myseq-example
 This will start all 'stopped' nodes and reconfigure the
 cluster.
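
The lifecycle commands listed at the end of that output (sshmaster, restart, stop, start -x, terminate) could be wrapped in a small helper so the difference between pausing and destroying a cluster is harder to get wrong. This is only a sketch: the function name cluster_action, the action names and the DRY_RUN variable are hypothetical, and by default it prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical wrapper around the starcluster commands shown in the log.
# With DRY_RUN=1 (the default here) it only prints what it would run.
DRY_RUN=${DRY_RUN:-1}

cluster_action() {
    # $1 = action (login|restart|pause|resume|destroy), $2 = cluster tag
    case "$1" in
        login)   set -- sshmaster "$2" ;;   # log in to the master as root
        restart) set -- restart "$2" ;;     # reboot and reconfigure
        pause)   set -- stop "$2" ;;        # EBS-backed nodes only
        resume)  set -- start -x "$2" ;;    # restart a stopped cluster
        destroy) set -- terminate "$2" ;;   # stop paying for service
        *) echo "unknown action: $1" >&2; return 2 ;;
    esac
    if [ "$DRY_RUN" = 1 ]; then
        echo "starcluster $*"
    else
        starcluster "$@"
    fi
}
```

For example, cluster_action pause myseq-example prints the stop command for review; setting DRY_RUN=0 runs it for real. As the log warns, data in ephemeral storage is lost when the nodes are stopped.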


Set Up Master Node, Login as root

 starcluster sshmaster myseq-example
 StarCluster - (http://web.mit.edu/starcluster) (v. 0.93.3)
 Software Tools for Academics and Researchers (STAR)
   [lines deleted]
 #   Install Pipeline software just like says in 
 mkdir debs
 cd debs
 wget ftp://share.sph.umich.edu/biopipe/current-align.deb
   [lines deleted]
 wget ftp://share.sph.umich.edu/biopipe/current-umake.deb
   [lines deleted]
 dpkg -i current-align.deb current-umake.deb
   [lines deleted]
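
The download and install steps above could be collected into a short script to run as root on the master node. This is a sketch under assumptions: the URL and the two package names are taken from the transcript, while the names BASE_URL, deb_url and install_pipeline are illustrative, and it assumes wget saves each file under the name given in the URL.

```shell
#!/bin/sh
# Sketch only: BASE_URL and the package names come from the transcript;
# the function names and the debs/ layout are illustrative.
BASE_URL=ftp://share.sph.umich.edu/biopipe

deb_url() {
    # $1 = package short name (align or umake)
    echo "$BASE_URL/current-$1.deb"
}

install_pipeline() {
    mkdir -p debs || return 1
    for pkg in align umake; do
        # -P saves the download into debs/ without changing directory
        wget -P debs "$(deb_url "$pkg")" || return 1
    done
    # Install both packages from the download directory (needs root)
    dpkg -i debs/current-align.deb debs/current-umake.deb
}
```

Calling install_pipeline stops at the first failed download, so dpkg is not run on a partial set of packages.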