Changes

From Genome Analysis Wiki
2,645 bytes added, 14:54, 14 October 2014
Line 4: Line 4:     
The following are notes taken when creating the Amazon Machine Image (AMI) used for the CSG pipeline process.
 +
 
These notes assume you have already created an EC2 account and have the certificates and keys set up properly.
   −
== Launch an instance ==
+
 
 +
== Create new GotCloud AMI from StarCluster AMI ==
 +
=== Launch an instance ===
    
<code>
 
<code>
Line 13: Line 16:     
Pay attention to the region you are using. At least for now, it seems any StarCluster activity must be in '''us-east-1'''.
Launch a new instance which we will use to set up the software and ultimately save it as an AMI.
     −
<code>
+
Launch a new instance starting from a StarCluster AMI.  We will set up the software on this instance and ultimately save it as an AMI.
  EC2 DashBoard -> Launch Instance
+
 
  Class Wizard
+
# <code>EC2 DashBoard -> Launch Instance</code>
  Ubuntu Server 12.04.1 LTS  64 bit
+
# Select: <code>Community AMIs</code>
  Instance type -> Micro, EC2, no preference        # Memory size does not matter
+
## Enter in the search box: <code>starcluster-base-ubuntu</code>
  Advanced Instance Options  (take defaults)
+
## Select: <code>starcluster-base-ubuntu-12.04-x86_64 - ami-765b3e1f</code>
  Storage Device Configuration -> Edit
+
# Select the Instance Type: <code>Compute optimized c3.2xlarge</code>
  Change volume to 30G -> Save -> Continue          # Storage size does not matter
+
#* You can use a smaller/cheaper machine. I originally used t1.micro, but everything runs much faster on a larger machine.
  Key Name = GotCloud 1.06a
+
# Click: <code>Review and Launch</code>
  Create Key/Pair if you need to, Name the PEM and save the pem file for access by ssh
+
## Select: <code>Make General Purpose (SSD) the boot volume for this instance.</code>
  Choose a Security Group  (take default)
+
## Select: <code>Next</code>
  Launch
+
# Scroll down to the <code>Storage</code> section
    No need to Create Status Check Alarms
+
# Click: <code>Edit storage</code>
    No need to Create EBS Volumes
+
## Update the Size: <code>30</code>
</code>
+
##* We use 30 GB, which fits the GotCloud code and reference files.  Make it larger if you want additional space.
 +
## Click: <code>Review and Launch</code>
 +
# Click: <code>Launch</code>
 +
# Select the key pair you want to use and launch the instance
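
The console steps above could also be scripted with the AWS command line tools. The following is only a sketch under stated assumptions: it prints a candidate <code>aws ec2 run-instances</code> command instead of running it. The AMI ID, instance type, region, and 30 GB SSD volume come from these notes; the key pair name <code>my-gotcloud-key</code> and the root device name <code>/dev/sda1</code> are assumptions that you would need to check against your own account and AMI.

```shell
# Sketch only: print (do not run) an AWS CLI command that roughly mirrors
# the console steps. The key pair name and root device name are assumptions.
build_launch_cmd() {
    key_name="$1"                   # hypothetical key pair name
    root_device="${2:-/dev/sda1}"   # assumed root device name; verify for your AMI
    printf 'aws ec2 run-instances --region us-east-1 --image-id ami-765b3e1f'
    printf ' --instance-type c3.2xlarge --key-name %s' "$key_name"
    printf " --block-device-mappings 'DeviceName=%s,Ebs={VolumeSize=30,VolumeType=gp2}'\n" "$root_device"
}

build_launch_cmd my-gotcloud-key
```

Printing the command first makes it easy to review the region and volume size before anything is launched and billed.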
 +
 
 +
=== Set up the instance with GotCloud ===
 +
These steps assume you have already logged in to the instance.
 +
 
 +
# Get the latest version of GotCloud:
 +
#* There are multiple ways to do this; one way is:
 +
#*# <code>sudo git clone https://github.com/statgen/gotcloud.git</code>
 +
# Install cmake (required to build premo)
 +
#*<code>sudo apt-get update</code>
 +
#*<code>sudo apt-get upgrade</code>  (takes a while; you may be able to skip this step)
 +
#*<code>sudo apt-get install cmake</code>
 +
## Build the source (if you obtained the source code).
 +
### <code>cd gotcloud/src</code>
 +
### <code>sudo make</code>
 +
###* If your instance has more than one CPU, specify <code>-j #</code>, where <code>#</code> is the number of CPUs
 +
### <code>cd</code>
 +
# Get the reference files
 +
## <code>wget ftp://anonymous@share.sph.umich.edu/gotcloud/ref/h37-db135-v3.tgz</code>
 +
# Untar: <code>tar xvf h37-db135-v3.tgz</code>
 +
# Move reference to gotcloud directory: <code>sudo mv gotcloud.ref gotcloud</code>
 +
# Remove tar file: <code>rm h37-db135-v3.tgz</code>
 +
# Set the paths, by updating .profile: <code>vi .profile</code>
 +
#* <code>i</code>
 +
#: <pre>if [ -d "$HOME/gotcloud" ] ; then&#10;    PATH="$HOME/gotcloud:$PATH"&#10;fi&#10;if [ -d "$HOME/gotcloud/bin" ] ; then&#10;    PATH="$HOME/gotcloud/bin:$PATH"&#10;fi&#10;if [ -d "$HOME/gotcloud/scripts" ] ; then&#10;    PATH="$HOME/gotcloud/scripts:$PATH"&#10;fi</pre>
 +
#* <code>ESC</code>
 +
#* <code>:wq</code> (write the file and quit; a plain <code>:q</code> would not save the change)
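
As an alternative to editing <code>.profile</code> by hand in vi, the same three PATH entries could be appended from the shell. This is a minimal sketch, not part of the original procedure: the function name <code>add_gotcloud_paths</code> and the marker comment it writes are hypothetical, and the demonstration runs against a temporary file rather than your real <code>.profile</code>.

```shell
# Append the GotCloud PATH entries to a profile file, unless a previous
# run already added them (detected through a hypothetical marker comment).
add_gotcloud_paths() {
    profile="${1:-$HOME/.profile}"
    marker='# GotCloud paths'
    if [ -f "$profile" ] && grep -qF "$marker" "$profile"; then
        return 0
    fi
    {
        echo "$marker"
        for dir in gotcloud gotcloud/bin gotcloud/scripts; do
            printf 'if [ -d "$HOME/%s" ] ; then\n' "$dir"
            printf '    PATH="$HOME/%s:$PATH"\n' "$dir"
            printf 'fi\n'
        done
    } >> "$profile"
}

# Demonstration on a scratch file; the second call is a no-op.
demo_profile="$(mktemp)"
add_gotcloud_paths "$demo_profile"
add_gotcloud_paths "$demo_profile"
cat "$demo_profile"
rm -f "$demo_profile"
```

To apply it for real you would call <code>add_gotcloud_paths "$HOME/.profile"</code> and log in again so the new PATH takes effect.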
   −
== Set Up Swap Space ==
+
=== Set Up Swap Space ===
    
Issue the command '''swapon -s''' to see if there is swap space.
Line 51: Line 82:  
</code>
 
</code>
   −
== Install the Software ==
+
=== Clean up the instance for creating an AMI ===
 +
# Go to: [[#Cleanup Instance for AMI Creation|Cleanup Instance for AMI Creation]]
 +
 
 +
=== Create the AMI ===
 +
# Go to: [[#Create the AMI|Create the AMI]]
 +
 
 +
 
 +
== Update the GotCloud AMI ==
 +
# Start an instance of the current GotCloud AMI
 +
#* Choose an instance type with multiple CPUs so you can parallelize the <code>make</code> call.
 +
# Log in as <code>ubuntu</code>
 +
# <code>cd gotcloud</code>
 +
# <code>sudo git pull</code>
 +
# <code>cd src</code>
 +
# <code>sudo make</code>
 +
#* Specify <code>-j #</code>, where <code>#</code> is the number of CPUs your instance has
 +
# <code>cd</code>
 +
# Go to: [[#Create the AMI|Create the AMI]]
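
The <code>-j #</code> value for the build step above can be derived from the machine instead of typed by hand. The following is a small sketch; the helper name <code>make_jobs</code> is hypothetical. It tries <code>nproc</code>, then <code>getconf _NPROCESSORS_ONLN</code>, and falls back to 1 if neither is available.

```shell
# Print the number of CPUs to pass to "make -j", falling back to 1.
make_jobs() {
    n="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
    case "$n" in
        ''|*[!0-9]*) n=1 ;;   # anything that is not a plain number becomes 1
    esac
    echo "$n"
}

# Show the build command that would be run in the src directory.
echo "sudo make -j $(make_jobs)"
```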
 +
 
 +
 
 +
==Cleanup Instance for AMI Creation==
 +
The first time you create an AMI from a generic/StarCluster AMI:
 +
# Disable password-based logins for root
 +
## Open /etc/ssh/sshd_config
 +
## Change <code>PermitRootLogin yes</code> to <code>PermitRootLogin without-password</code>
 +
# Disable root access
 +
## <code>sudo passwd -l root</code>
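
The <code>sshd_config</code> change above can also be made non-interactively. This is a sketch only, and the function name <code>disable_root_password_login</code> is hypothetical. It rewrites a line reading <code>PermitRootLogin yes</code> and leaves every other line alone; the demonstration uses a scratch file, since editing the real <code>/etc/ssh/sshd_config</code> requires root.

```shell
# Rewrite "PermitRootLogin yes" to "PermitRootLogin without-password"
# in the given file (default: /etc/ssh/sshd_config).
disable_root_password_login() {
    config="${1:-/etc/ssh/sshd_config}"
    tmp="$(mktemp)" || return 1
    sed 's/^[[:space:]]*PermitRootLogin[[:space:]][[:space:]]*yes[[:space:]]*$/PermitRootLogin without-password/' \
        "$config" > "$tmp" &&
        cat "$tmp" > "$config"   # cat keeps the original file's permissions
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demonstration on a scratch copy.
demo_config="$(mktemp)"
printf 'Port 22\nPermitRootLogin yes\n' > "$demo_config"
disable_root_password_login "$demo_config"
cat "$demo_config"
rm -f "$demo_config"
```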
 +
 
 +
 
 +
Each time we generate a new AMI, run:
 +
<pre>sudo shred -u /etc/ssh/*_key /etc/ssh/*_key.pub
 +
sudo find / -name "authorized_keys" -exec rm -f {} \;
 +
rm -rf ~/.ssh
 +
shred -u ~/.*history
 +
sudo find /root/.*history /home/*/.*history -exec rm -f {} \;
 +
history -c
 +
history -w
 +
</pre>
 +
These commands do the following:
 +
# Remove SSH host key pairs
 +
# Remove SSH authorized keys
 +
# Remove the current user's <code>~/.ssh</code> directory
 +
# Delete shell history
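
Before creating the image it can be worth confirming that the cleanup actually removed everything. The sketch below is an assumption-laden check, not part of the original notes: the helper name <code>list_leftovers</code> is hypothetical, and it simply lists any SSH host keys, <code>authorized_keys</code> files, or shell history files still present under a given root directory.

```shell
# List files under ROOT that the cleanup commands are meant to remove.
# Prints nothing when the tree is clean.
list_leftovers() {
    root="${1:-/}"
    find "$root" \( -name 'authorized_keys' \
        -o -path '*/etc/ssh/*_key' \
        -o -path '*/etc/ssh/*_key.pub' \
        -o -name '.*history' \) -type f 2>/dev/null
}

# Demonstration on a scratch tree that still contains a history file.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/home/ubuntu"
: > "$demo_root/home/ubuntu/.bash_history"
list_leftovers "$demo_root"
rm -rf "$demo_root"
```

On a real instance you would run <code>sudo</code> with the function's <code>find</code> command against <code>/</code>, which takes longer because it walks the whole filesystem.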
 +
 
 +
== Create the AMI ==
 +
 
 +
Once your instance is set up with everything you want, create the AMI.
 +
 
 +
In your browser at the EC2 Management Console do the following:
 +
# Select the running instance
 +
# Right-click and select <code>Create Image</code>
 +
# Enter a name and description
 +
# Ensure volume size is correct
 +
# Mark the volume to be deleted on termination
 +
#:This will take several minutes to complete.
 +
#:In the EC2 Dashboard, you can monitor the progress.
 +
#:When it is done, you'll see a new AMI under the list of AMIs.
 +
# When completed, terminate your old instance
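
The same image can be requested from the AWS command line tools instead of the console. The sketch below only prints a candidate <code>aws ec2 create-image</code> command so it can be reviewed first. The instance ID, image name, and description shown are placeholders, not values from these notes, and the function name <code>build_create_image_cmd</code> is hypothetical.

```shell
# Sketch only: print (do not run) an AWS CLI command that creates an AMI
# from a running instance. All three arguments are placeholders.
build_create_image_cmd() {
    instance_id="$1"    # ID of the running instance, as shown in the console
    image_name="$2"     # name for the new AMI
    description="$3"    # free-text description
    printf "aws ec2 create-image --region us-east-1 --instance-id %s --name '%s' --description '%s'\n" \
        "$instance_id" "$image_name" "$description"
}

build_create_image_cmd i-EXAMPLE 'GotCloud example' 'Built from the StarCluster base AMI'
```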
 +
 
 +
 
 +
== Older/Additional Instructions ==
 +
=== Install the Software ===
    
'''(1)''' There are a number of additional Debian packages that you may well need, so we make
Line 130: Line 222:  
Run the tests to be sure everything is OK.
   −
== Configure the Host to be Usable ==
+
=== Configure the Host to be Usable ===
    
It is useful to configure /etc/rc.local to do most things you need at boot time.
Line 140: Line 232:  
<code>
 
<code>
 
ubuntu@ip-10-254-60-210:~$ sudo more /etc/rc.local
 
ubuntu@ip-10-254-60-210:~$ sudo more /etc/rc.local
  #!/bin/sh  
+
#!/bin/sh  
  USER=ubuntu
+
#
  THOUSANDG=/mnt/1000g
+
# rc.local
  FILES3=passwd-s3fs
+
#
  S3ERR=/tmp/s3fs.err
+
# This script is executed at the end of each multiuser runlevel.
 +
# Make sure that the script will "exit 0" on success or any other
 +
# value on error.
 +
#
 +
# In order to enable or disable this script just change the execution
 +
# bits.
 +
#
 +
# By default this script does nothing.
 +
USER=ubuntu
 +
THOUSANDG=/mnt/1000g
 +
FILES3=/etc/passwd-s3fs     # Where s3fs access info will live
 +
S3ERR=/tmp/s3fs.err
 +
#  These are needed for s3fs access
 +
AWSACCESSKEYID=AKIAxxxxxxZ3YCZF2Q
 +
AWSSECRETACCESSKEY=ft1eJa3WxxxxxxxNlbA08x/G8iMqkMIkJjFCIGf
 
   
 
   
  #    Set up for GotCloud    Assumes /dev/xvdf has reference files for GotCloud
  −
  mkdir -p /gotcloud
  −
  mount /dev/xvdf /gotcloud
  −
  if [ ! -d /gotcloud/gotcloud.ref ]; then
  −
    echo "#######################################################"
  −
    echo "#  GotCloud is not set up on /gotcloud"
  −
    echo "#######################################################"
  −
  fi
   
   
 
   
  #    Setup 1000g access by s3fs
+
#    Check that we have swap set up
  usermod -aG fuse $USER
+
  a=`swapon -s | grep -v File`
  echo 'AKIAxxxxxxZ3YCZF2Q:ft1eJa3WxxxxxxxNlbA08x/G8iMqkMIkJjFCIGf' > /etc/$FILES3
+
if [ "$a" = "" ]; then
  chown root.root /etc/$FILES3
+
  echo "#######################################################"
  chmod 640 /etc/$FILES3
+
  echo "#  You have no SWAP file set up"
  mkdir -p $THOUSANDG
+
  echo ""
  chown $USER.$USER $THOUSANDG
+
  echo "#  swap=/mnt/swapfile"
  #  It is tempting to use caching with -o use_cache=/tmp 1000genomes
+
  echo '#  sudo dd if=/dev/zero of=$swap bs=1073741824 count=20'
  #  But s3fs cache is exceedingly dumb and does not use a least recently used
+
  echo '#  sudo chown root:root $swap'
  #  mechanism -- which will guarantee your root volume will fill up
+
  echo '#  sudo mkswap $swap'
  /usr/local/bin/s3fs -o allow_other 1000genomes $THOUSANDG > $S3ERR 2>&1
+
  echo '#  sudo chmod 0600 $swap'
  if [ ! -r $THOUSANDG/alignment.index ]; then
+
  echo '#  sudo swapon $swap'
    echo "#######################################################" >> $S3ERR
+
  echo ""
    echo "#  1000genomes is not set up on $THOUSANDG" >> $S3ERR
+
  echo "#  If need be, add to /etc/fstab"
    echo "#######################################################" >> $S3ERR
+
  echo '#  echo "$swap  none swap sw  0  0" >> /etc/fstab'
  fi
+
  echo "#######################################################"
  df -h
+
fi
 
   
 
   
  #   Make sure we have a swap file
+
#   Set up for GotCloud
  a=`swapon -s | grep -v Filename'
+
gc=/gotcloud.mnt
  if [ "$a" = "" ]; then
+
if [ ! -r $gc/release_version.txt ]; then
    echo "#######################################################"
+
  mkdir -p $gc
    echo "#  You have no SWAP file set up"
+
  mount /dev/xvdg $gc
    echo "#"
+
  if [ -d $gc/gotcloud.ref ]; then
    echo "swap=/mnt/swapfile"
+
    echo "#######################################################"
    echo "#  sudo dd if=/dev/zero of=$swap bs=1073741824 count=20"
+
    echo "#  GotCloud is set up on $gc"
    echo "# sudo chown root:root $swap"
+
    echo "#######################################################"
    echo "# sudo mkswap $swap"
+
  fi
    echo "sudo chmod 0600 $swap"
+
fi
    echo "# sudo swapon $swap"
+
    echo "#"
+
#   Set up access to S3 storage as normal filesystem
    echo "# If need be, add to /etc/fstab"
+
  echo "${AWSACCESSKEYID}:$AWSSECRETACCESSKEY" > $FILES3
    echo "# echo "$swap  none swap sw  0  0" >> /etc/fstab"
+
  chown root.root $FILES3
    echo "#######################################################"
+
chmod 640 $FILES3
  fi
+
 
+
  usermod -aG fuse $USER
  exit 0
+
</code>
+
#   Setup 1000genomes
 
+
  mkdir -p $THOUSANDG
== Create the AMI ==
+
  if [ ! -r $THOUSANDG/release ]; then
 
+
  chown $USER.$USER $THOUSANDG
Once your instance is all ready with the files you want, swap space etc, then create the AMI.
+
  /usr/local/bin/s3fs -o allow_other 1000genomes $THOUSANDG > $S3ERR 2>&1
In your browser at the EC2 Management Console do the following:
+
  if [ ! -r $THOUSANDG/alignment.index ]; then
 
+
    echo "#######################################################"
<code>
+
    echo "#   1000genomes is not set up on $THOUSANDG"
  Create Image
+
    echo "#   See S3FS errors in $S3ERR"
    Image Name  GotCLoud 1.06
+
    echo "#######################################################"
    Image Description: From CSG at University of Michigan
+
  fi
    Volume Size: 30GB
+
  df -h
    Take defaults otherwise
+
  fi
</code>
+
  exit 0
 
  −
This will take several minutes to complete.
  −
In the EC2 Dashboard, you can monitor the progress.
  −
When it is done, you'll see a new AMI under the list of AMIs.
  −
 
  −
Your new AMI should look pretty much like this:
  −
 
  −
<code>
  −
  AMI: Ubuntu Cloud Guest AMI ID ami-3d4ff254 (x86_64)
  −
  Name: Ubuntu Server 12.04.1 LTS
  −
  Description: Ubuntu Server 12.04.1 LTS with support available from Canonical (http://www.ubuntu.com/cloud/services).
  −
  Number of Instances: 1
  −
  Availability Zone: No Preference
  −
  Instance Type: Micro (t1.micro)
  −
  Instance Class: On Demand Edit Instance Details
  −
  EBS-Optimized: No
  −
  Monitoring: Disabled Termination Protection: Disabled
  −
  Tenancy: Default
  −
  Kernel ID: Use Default Shutdown Behavior: Stop
  −
  RAM Disk ID: Use Default
  −
  Network Interfaces:
  −
  Secondary IP Addresses:
  −
  User Data:
  −
  IAM Role: Edit Advanced Details
  −
  Key Pair Name: CSG Edit Key Pair
  −
  Security Group(s): sg-a098e9c8 Edit Firewall
   
</code>
 
</code>
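
The swap check in the <code>rc.local</code> script above relies on filtering the header line out of the <code>swapon -s</code> output. The following sketch isolates that test in a helper, hypothetically named <code>has_swap</code>, which reads <code>swapon -s</code>-style text on standard input and succeeds only if at least one swap area is listed. The sample input lines are illustrative, not captured from a real instance.

```shell
# Succeed if the swapon -s output on stdin lists at least one swap area,
# i.e. there is a non-blank line besides the "Filename ..." header.
has_swap() {
    grep -v '^Filename' | grep -q '[^[:space:]]'
}

# Illustrative input: header only, then header plus one swap file.
if printf 'Filename\tType\tSize\tUsed\tPriority\n' | has_swap; then
    echo "swap is configured"
else
    echo "no swap configured"
fi
if printf 'Filename\tType\tSize\tUsed\tPriority\n/mnt/swapfile\tfile\t20971516\t0\t-1\n' | has_swap; then
    echo "swap is configured"
else
    echo "no swap configured"
fi
```

On a live instance the equivalent call would be <code>swapon -s | has_swap</code>.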
   −
== Test the new AMI ==
+
=== Test the new AMI ===
    
Launch a new instance from the AMI and check that the files are in the correct places.
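
One way to make this check repeatable is a short script run on the new instance. The sketch below is an assumption based on the setup steps on this page: it expects <code>bin</code>, <code>scripts</code>, and <code>gotcloud.ref</code> to exist under the <code>gotcloud</code> directory in the login user's home. The helper name <code>check_gotcloud_layout</code> is hypothetical, and the list of paths should be adjusted to whatever your AMI is supposed to contain.

```shell
# Report any expected GotCloud paths that are missing under BASE.
# Returns 0 when everything is present, 1 otherwise.
check_gotcloud_layout() {
    base="${1:-$HOME/gotcloud}"
    missing=0
    for path in bin scripts gotcloud.ref; do
        if [ ! -e "$base/$path" ]; then
            echo "missing: $base/$path"
            missing=1
        fi
    done
    return "$missing"
}

if check_gotcloud_layout "$HOME/gotcloud"; then
    echo "GotCloud layout looks complete"
else
    echo "GotCloud layout is incomplete"
fi
```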
