Difference between revisions of "Creating an AMI on EC2"

From Genome Analysis Wiki
 
(12 intermediate revisions by the same user not shown)
Line 7: Line 7:
 
These notes assume you have already created an EC2 account and have the certificates and keys set up properly.
 
These notes assume you have already created an EC2 account and have the certificates and keys set up properly.
  
== Launch an instance ==
+
 
 +
== Create new GotCloud AMI from StarCluster AMI ==
 +
=== Launch an instance ===
  
 
<code>
 
<code>
Line 21: Line 23:
 
## Enter in the search box: <code>starcluster-base-ubuntu</code>
 
## Enter in the search box: <code>starcluster-base-ubuntu</code>
 
## Select: <code>starcluster-base-ubuntu-12.04-x86_64 - ami-765b3e1f</code>
 
## Select: <code>starcluster-base-ubuntu-12.04-x86_64 - ami-765b3e1f</code>
# Select the Instance Type: <code>Micro t1.micro</code>
+
# Select the Instance Type: <code>Compute optimized c3.2xlarge</code>
#* Since we are just using this instance to setup the system, pick the smallest/cheapest machine.
+
#* You can use a smaller/cheaper machine. I originally used t1.micro, but things go much faster with a larger machine.
 
# Click: <code>Review and Launch</code>
 
# Click: <code>Review and Launch</code>
 
## Select: <code>Make General Purpose (SSD) the boot volume for this instance.</code>
 
## Select: <code>Make General Purpose (SSD) the boot volume for this instance.</code>
Line 28: Line 30:
 
# Scroll down to the <code>Storage</code> section
 
# Scroll down to the <code>Storage</code> section
 
# Click: <code>Edit storage</code>
 
# Click: <code>Edit storage</code>
## Update the Size: <code>20</code>
+
## Update the Size: <code>30</code>
##* We use 20G to fit the GotCloud code and reference files.  Make it larger if you want additional space.
+
##* We use 30G to fit the GotCloud code and reference files.  Make it larger if you want additional space.
 
## Click: <code>Review and Launch</code>
 
## Click: <code>Review and Launch</code>
 
# Click: <code>Launch</code>
 
# Click: <code>Launch</code>
 
# Select the key/pair you want to use & Launch
 
# Select the key/pair you want to use & Launch
  
== Setup the instance with GotCloud ==
+
=== Setup the instance with GotCloud ===
 
This assumes you have already logged onto the instance.
 
This assumes you have already logged onto the instance.
  
 
# Get the latest version of GotCloud:
 
# Get the latest version of GotCloud:
 
#* There are multiple ways to do this; one way is:
 
#* There are multiple ways to do this; one way is:
#*# <code>git clone https://github.com/statgen/gotcloud.git gc_src</code>
+
#*# <code>sudo git clone https://github.com/statgen/gotcloud.git</code>
 
# Download cmake (required to build premo)
 
# Download cmake (required to build premo)
 
#*<code>sudo apt-get update</code>
 
#*<code>sudo apt-get update</code>
Line 45: Line 47:
 
#*<code>sudo apt-get install cmake</code>
 
#*<code>sudo apt-get install cmake</code>
 
## Build the source (if you obtained the source code).
 
## Build the source (if you obtained the source code).
### <code>cd gc_src/src</code>
+
### <code>cd gotcloud/src</code>
### <code>make</code>
+
### <code>sudo make</code>
### <code>cd ..</code>
+
###* Specify <code>-j #</code> based on the number of CPUs your instance has, if more than 1
## Generate the installation.
 
### <code>cd gc_src/</code>
 
### <code>./debian/makedeb.sh bin =</code>
 
 
### <code>cd</code>
 
### <code>cd</code>
# Install: <code>sudo dpkg -i gc_src/gotcloud-bin_1.14.3_amd64.deb</code>
 
# Move installation to home so will be on all nodes: <code>sudo mv /usr/local/gotcloud .</code>
 
# Remove source: <code>rm -rf gc_src/</code> (required since not enough storage space.)
 
 
# Get the reference files
 
# Get the reference files
 
## wget ftp://anonymous@share.sph.umich.edu/gotcloud/ref/h37-db135-v3.tgz
 
## wget ftp://anonymous@share.sph.umich.edu/gotcloud/ref/h37-db135-v3.tgz
 
# Untar: <code>tar xvf h37-db135-v3.tgz</code>
 
# Untar: <code>tar xvf h37-db135-v3.tgz</code>
 +
# Move reference to gotcloud directory: <code>sudo mv gotcloud.ref gotcloud</code>
 +
# Remove tar file: <code>rm h37-db135-v3.tgz</code>
 +
# Set the paths, by updating .profile: <code>vi .profile</code>
 +
#* <code>i</code>
 +
#: <pre>if [ -d "$HOME/gotcloud" ] ; then&#10;    PATH="$HOME/gotcloud:$PATH"&#10;fi&#10;if [ -d "$HOME/gotcloud/bin" ] ; then&#10;    PATH="$HOME/gotcloud/bin:$PATH"&#10;fi&#10;if [ -d "$HOME/gotcloud/scripts" ] ; then&#10;    PATH="$HOME/gotcloud/scripts:$PATH"&#10;fi</pre>
 +
#* <code>ESC</code>
 +
#* <code>:wq</code>
  
 
+
=== Set Up Swap Space ===
 
 
=== Install GotCloud in ubuntu home directory ===
 
 
 
Set aliases in .bashrc
 
 
 
== Create Image ==
 
 
 
== Set Up Swap Space ==
 
  
 
Issue the command '''swapon -s''' to see if there is swap space.
 
Issue the command '''swapon -s''' to see if there is swap space.
Line 87: Line 82:
 
</code>
 
</code>
  
== Install the Software ==
+
=== Cleanup the instance for creating an AMI ===
 +
# Go to : [[#Cleanup Instance for AMI Creation|Cleanup Instance for AMI Creation]]
 +
 
 +
=== Create the AMI ===
 +
# Go to : [[#Create the AMI|Create the AMI]]
 +
 
 +
 
 +
== Update the GotCloud AMI ==
 +
# Start an instance of the current GotCloud AMI
 +
#* Suggest an instance with some CPU so you can parallelize the "make" call.
 +
# Login as ubuntu
 +
# <code>cd gotcloud</code>
 +
# <code>sudo git pull</code>
 +
# <code>cd src</code>
 +
# <code>sudo make</code>
 +
#* Specify <code>-j #</code> based on the number of CPUs your instance has
 +
# <code>cd</code>
 +
# Go to : [[#Create the AMI|Create the AMI]]
 +
 
 +
 
 +
==Cleanup Instance for AMI Creation==
 +
First time from generic/starcluster AMI
 +
# Disable password-based logins for root
 +
## Open /etc/ssh/sshd_config
 +
## Change <code>PermitRootLogin yes</code> to <code>PermitRootLogin without-password</code>
 +
# Disable root access
 +
## <code> sudo passwd -l root</code>
 +
 
 +
 
 +
Each time we generate a new AMI, run:
 +
<pre>sudo shred -u /etc/ssh/*_key /etc/ssh/*_key.pub
 +
sudo find / -name "authorized_keys" -exec rm -f {} \;
 +
rm -rf ~/.ssh
 +
shred -u ~/.*history
 +
sudo find /root/.*history /home/*/.*history -exec rm -f {} \;
 +
history -c
 +
history -w
 +
</pre>
 +
These commands do the following:
 +
# Remove SSH host key pairs
 +
# Remove SSH authorized keys
 +
# Remove ssh
 +
# Delete shell history
 +
 
 +
== Create the AMI ==
 +
 
 +
Once your instance is all ready with everything you want, create the AMI.
 +
 
 +
In your browser at the EC2 Management Console do the following:
 +
# Select the running instance
 +
# Right click, <code>Create Image</code>
 +
# Enter name & Description
 +
# Ensure volume size is correct
 +
# Mark delete on terminate
 +
#:This will take several minutes to complete.
 +
#:In the EC2 Dashboard, you can monitor the progress.
 +
#:When it is done, you'll see a new AMI under the list of AMIs.
 +
# When completed, terminate your old instance
 +
 
 +
 
 +
== Older/Additional Instructions ==
 +
=== Install the Software ===
  
 
'''(1)''' There are a number of additional Debian packages that you may well need, so we make
 
'''(1)''' There are a number of additional Debian packages that you may well need, so we make
Line 166: Line 222:
 
Run the tests to be sure everything is OK.
 
Run the tests to be sure everything is OK.
  
== Configure the Host to be Usable ==
+
=== Configure the Host to be Usable ===
  
 
It is useful to configure /etc/rc.local to do most things you need at boot time.
 
It is useful to configure /etc/rc.local to do most things you need at boot time.
Line 250: Line 306:
 
</code>
 
</code>
  
== Create the AMI ==
+
=== Test the new AMI ===
 
 
Once your instance is all ready with the files you want, swap space etc, then create the AMI.
 
In your browser at the EC2 Management Console do the following:
 
 
 
<code>
 
  Create Image
 
    Image Name  GotCLoud 1.06
 
    Image Description:  From CSG at University of Michigan
 
    Volume Size:  30GB
 
    Take defaults otherwise
 
</code>
 
 
 
This will take several minutes to complete.
 
In the EC2 Dashboard, you can monitor the progress.
 
When it is done, you'll see a new AMI under the list of AMIs.
 
 
 
Your new AMI should look pretty much like this:
 
 
 
<code>
 
  AMI: Ubuntu Cloud Guest AMI ID ami-3d4ff254 (x86_64)
 
  Name: Ubuntu Server 12.04.1 LTS
 
  Description: Ubuntu Server 12.04.1 LTS with support available from Canonical (http://www.ubuntu.com/cloud/services).
 
  Number of Instances: 1
 
  Availability Zone: No Preference
 
  Instance Type: Micro (t1.micro)
 
  Instance Class: On Demand Edit Instance Details
 
  EBS-Optimized: No
 
  Monitoring: Disabled Termination Protection: Disabled
 
  Tenancy: Default
 
  Kernel ID: Use Default Shutdown Behavior: Stop
 
  RAM Disk ID: Use Default
 
  Network Interfaces:
 
  Secondary IP Addresses:
 
  User Data:
 
  IAM Role: Edit Advanced Details
 
  Key Pair Name: CSG Edit Key Pair
 
  Security Group(s): sg-a098e9c8 Edit Firewall
 
</code>
 
 
 
== Test the new AMI ==
 
  
 
Launch a new AMI instance and check that files are in the correct places.
 
Launch a new AMI instance and check that files are in the correct places.

Latest revision as of 14:54, 14 October 2014

Notes About Creating a New EC2 AMI


The following are notes taken when creating the Amazon Machine Image (AMI) used for the CSG pipeline process.

These notes assume you have already created an EC2 account and have the certificates and keys set up properly.


Create new GotCloud AMI from StarCluster AMI

Launch an instance

 Login to https://console.aws.amazon.com/ec2       # EC2 Management Console

Pay attention to the region you are using. At least for now, it seems any StarCluster activity must be in us-east-1.

Launch a new instance starting from a StarCluster AMI. We will set up the software on this instance and ultimately save it as an AMI.

  1. EC2 DashBoard -> Launch Instance
  2. Select: Community AMIs
    1. Enter in the search box: starcluster-base-ubuntu
    2. Select: starcluster-base-ubuntu-12.04-x86_64 - ami-765b3e1f
  3. Select the Instance Type: Compute optimized c3.2xlarge
    • You can use a smaller/cheaper machine. I originally used t1.micro, but things go much faster with a larger machine.
  4. Click: Review and Launch
    1. Select: Make General Purpose (SSD) the boot volume for this instance.
    2. Select: Next
  5. Scroll down to the Storage section
  6. Click: Edit storage
    1. Update the Size: 30
      • We use 30G to fit the GotCloud code and reference files. Make it larger if you want additional space.
    2. Click: Review and Launch
  7. Click: Launch
  8. Select the key/pair you want to use & Launch

Setup the instance with GotCloud

This assumes you have already logged onto the instance.

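  1. Get the latest version of GotCloud:
    • There are multiple ways to do this; one way is:
      1. sudo git clone https://github.com/statgen/gotcloud.git
  2. Install cmake (required to build premo)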
    • sudo apt-get update
    • sudo apt-get upgrade (takes a while, may be able to skip this step)
    • sudo apt-get install cmake
    1. Build the source (if you obtained the source code).
      1. cd gotcloud/src
      2. sudo make
        • Specify -j # based on the number of CPUs your instance has, if more than 1
      3. cd
  3. Get the reference files
    1. wget ftp://anonymous@share.sph.umich.edu/gotcloud/ref/h37-db135-v3.tgz
  4. Untar: tar xvf h37-db135-v3.tgz
  5. Move reference to gotcloud directory: sudo mv gotcloud.ref gotcloud
  6. Remove tar file: rm h37-db135-v3.tgz
  7. Set the paths, by updating .profile: vi .profile
    • i
    if [ -d "$HOME/gotcloud" ] ; then
        PATH="$HOME/gotcloud:$PATH"
    fi
    if [ -d "$HOME/gotcloud/bin" ] ; then
        PATH="$HOME/gotcloud/bin:$PATH"
    fi
    if [ -d "$HOME/gotcloud/scripts" ] ; then
        PATH="$HOME/gotcloud/scripts:$PATH"
    fi
    • ESC
    • :wq (write the file and quit)
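
The same PATH block can also be added without opening an editor. The following is only a sketch: the function name add_gotcloud_paths is hypothetical, and it assumes GotCloud lives in $HOME/gotcloud as in the steps above.

```shell
#!/bin/sh
# Hypothetical helper (not part of GotCloud): append the GotCloud PATH
# entries to a profile file (default ~/.profile) unless already present.
add_gotcloud_paths() {
    profile="${1:-$HOME/.profile}"
    # Skip if an earlier run already added the block.
    if grep -q 'HOME/gotcloud/scripts' "$profile" 2>/dev/null; then
        return 0
    fi
    # The quoted 'EOF' keeps $HOME and $PATH unexpanded in the file.
    cat >> "$profile" <<'EOF'
if [ -d "$HOME/gotcloud" ] ; then
    PATH="$HOME/gotcloud:$PATH"
fi
if [ -d "$HOME/gotcloud/bin" ] ; then
    PATH="$HOME/gotcloud/bin:$PATH"
fi
if [ -d "$HOME/gotcloud/scripts" ] ; then
    PATH="$HOME/gotcloud/scripts:$PATH"
fi
EOF
}
```

Calling add_gotcloud_paths with no argument edits ~/.profile; passing another file name lets you try it on a copy first.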

Set Up Swap Space

Issue the command swapon -s to see if there is swap space. If there is only a header line, you need to add a swap file like this:

 df -h          # Be sure there's enough space, decide on swap size
 #  Create a file /swap to use (assuming / is large enough)
 sudo bash      # Run these commands as root
 swap=/swap
 dd if=/dev/zero of=$swap bs=524288 count=16384     # 8GB swap on t1.micro; for 15GB use bs=1073741824 count=15
 chown root:root $swap
 mkswap $swap
 chmod 0600 $swap
 swapon $swap
 echo "$swap  none swap sw  0  0" >> /etc/fstab

 swapon -s       # Should show the swap device
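
The count passed to dd is the swap size divided by the block size. As a small sketch of that arithmetic (the function name swap_block_count is hypothetical, not part of any tool used above):

```shell
#!/bin/sh
# Hypothetical helper: number of dd blocks needed for a swap file of
# SIZE_GIB gibibytes at a given block size (default 524288 bytes,
# the block size used in the dd command above).
swap_block_count() {
    size_gib="$1"
    block_size="${2:-524288}"
    echo $(( size_gib * 1024 * 1024 * 1024 / block_size ))
}

swap_block_count 8               # prints 16384
swap_block_count 15 1073741824   # prints 15
```

These match the two bs/count pairs given in the dd comment above.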

Cleanup the instance for creating an AMI

  1. Go to : Cleanup Instance for AMI Creation

Create the AMI

  1. Go to : Create the AMI


Update the GotCloud AMI

  1. Start an instance of the current GotCloud AMI
    • Suggest an instance with some CPU so you can parallelize the "make" call.
  2. Login as ubuntu
  3. cd gotcloud
  4. sudo git pull
  5. cd src
  6. sudo make
    • Specify -j # based on the number of CPUs your instance has
  7. cd
  8. Go to : Create the AMI


Cleanup Instance for AMI Creation

First time from generic/starcluster AMI

  1. Disable password-based logins for root
    1. Open /etc/ssh/sshd_config
    2. Change PermitRootLogin yes to PermitRootLogin without-password
  2. Disable root access
    1. sudo passwd -l root
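
The sshd_config change can also be scripted. This is a minimal sketch, assuming the file contains an uncommented PermitRootLogin yes line; the function name is hypothetical, and a backup copy is kept next to the file.

```shell
#!/bin/sh
# Hypothetical helper: switch "PermitRootLogin yes" to
# "PermitRootLogin without-password" in an sshd_config file.
# A backup is written to <file>.bak before the file is rewritten.
disable_root_password_login() {
    config="${1:-/etc/ssh/sshd_config}"
    cp "$config" "$config.bak" &&
    sed -E 's/^[[:space:]]*PermitRootLogin[[:space:]]+yes[[:space:]]*$/PermitRootLogin without-password/' \
        "$config.bak" > "$config"
}
```

Run it from a root shell (for example after sudo bash) when editing the real /etc/ssh/sshd_config, or pass the path of a copy to check the result first.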


Each time we generate a new AMI, run:

sudo shred -u /etc/ssh/*_key /etc/ssh/*_key.pub
sudo find / -name "authorized_keys" -exec rm -f {} \;
rm -rf ~/.ssh
shred -u ~/.*history
sudo find /root/.*history /home/*/.*history -exec rm -f {} \;
history -c
history -w

These commands do the following:

  1. Remove SSH host key pairs
  2. Remove SSH authorized keys
  3. Remove the current user's ~/.ssh directory
  4. Delete shell history
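
Because these commands are destructive, it can help to list what they would touch before running them. The following is a sketch only: list_cleanup_targets is a hypothetical name, it deletes nothing, and the optional argument (a directory to treat as the filesystem root) exists only so the function can be tried on a test tree.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the files the cleanup commands
# above target, without removing anything.
list_cleanup_targets() {
    root="${1:-}"
    # SSH host key pairs in /etc/ssh
    find "$root/etc/ssh" -maxdepth 1 -type f \
        \( -name '*_key' -o -name '*_key.pub' \) 2>/dev/null
    # authorized_keys files anywhere under the root
    find "$root/" -type f -name 'authorized_keys' 2>/dev/null
    # Shell history files up to two levels below /root and /home
    find "$root/root" "$root/home" -maxdepth 2 -type f \
        -name '.*history' 2>/dev/null
    return 0
}
```

Run list_cleanup_targets from a root shell to see every match; as a regular user, unreadable directories are silently skipped.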

Create the AMI

Once your instance is all ready with everything you want, create the AMI.

In your browser at the EC2 Management Console do the following:

  1. Select the running instance
  2. Right-click and choose Create Image
  3. Enter a name and description
  4. Ensure the volume size is correct
  5. Check Delete on Termination
    This will take several minutes to complete.
    In the EC2 Dashboard, you can monitor the progress.
    When it is done, you'll see a new AMI under the list of AMIs.
  6. When completed, terminate your old instance
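
The console steps above have a command-line counterpart in the AWS CLI (aws ec2 create-image). The wrapper below is a sketch under assumptions: it requires the AWS CLI to be installed and configured, the function name and the AWS_CMD override are hypothetical, and the instance ID and names in the usage line are placeholders.

```shell
#!/bin/sh
# Hypothetical wrapper around "aws ec2 create-image".
# Set AWS_CMD=echo to print the arguments instead of calling AWS.
create_gotcloud_image() {
    instance_id="$1"
    name="$2"
    description="$3"
    ${AWS_CMD:-aws} ec2 create-image \
        --instance-id "$instance_id" \
        --name "$name" \
        --description "$description"
}
```

For a dry run, AWS_CMD=echo create_gotcloud_image i-0123 gotcloud-test demo prints the arguments that would be passed to aws.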


Older/Additional Instructions

Install the Software

(1) There are a number of additional Debian packages that you may well need, so we make sure they are all installed.

 sudo apt-get update
 sudo apt-get upgrade           # Apply maintenance

 sudo apt-get install java-common default-jre make libssl0.9.8 
 sudo apt-get install libnet-amazon-ec2-perl s3cmd
 sudo apt-get install make g++ libcurl4-openssl-dev libssl-dev libxml2-dev libfuse-dev

(2) S3fs allows one to access S3 storage as a conventional file system. This can be quite handy if it is set up properly. Our recent experience is that the 1000 Genomes data has many files with incorrect permissions. Still, if you're lucky, your data will be useful. Install the software like this:

 mkdir -p ~/src
 cd ~/src
 wget  http://s3fs.googlecode.com/files/s3fs-1.68.tar.gz
 tar xzvf s3fs-1.68.tar.gz
 cd s3fs*
 ./configure
 sudo make install

(3) Configure s3cmd. This will ask for your AWS ID and Secret Key. It creates the file ~/.s3cfg.

 s3cmd --configure

 Enter new values or accept defaults in brackets with Enter.
 Refer to user manual for detailed description of all options.

 Access key and Secret key are your identifiers for Amazon S3
 Access Key: AKI1234QEUWZ3YCZF2Q
 Secret Key: ft1eJa1234NE8iitNlbA08x/G8iMqkMI1234IGf

 Encryption password is used to protect your files from reading
 by unauthorized persons while in transfer to S3
 Encryption password: password_you_do_not_need_to_know
 Path to GPG program [/usr/bin/gpg]: 

 When using secure HTTPS protocol all communication with Amazon S3
 servers is protected from 3rd party eavesdropping. This method is
 slower than plain HTTP and can't be used if you're behind a proxy
 Use HTTPS protocol [No]: 

 On some networks all internet access must go through a HTTP proxy.
 Try setting it here if you can't conect to S3 directly
 HTTP Proxy server name: 

 New settings:
   Access Key: AKI1234QEUWZ3YCZF2Q
   Secret Key: ft1eJa1234NE8iitNlbA08x/G8iMqkMI1234IGf
   Encryption password: password_you_do_not_need_to_know
   Path to GPG program: /usr/bin/gpg
   Use HTTPS protocol: False
   HTTP Proxy server name: 
   HTTP Proxy server port: 0

 Test access with supplied credentials? [Y/n] 
 Please wait...
 Success. Your access key and secret key worked fine :-)

 Now verifying that encryption works...
 Success. Encryption and decryption worked fine :-)

 Save settings? [y/N] y
 Configuration saved to '/home/ubuntu/.s3cfg'

(4) Follow the instructions to install the GotCloud Debian packages. Run the tests to be sure everything is OK.

Configure the Host to be Usable

It is useful to configure /etc/rc.local to do most things you need at boot time. There are many other ways to do this, but here's one simple way - create the file /etc/rc.local (as root). The following example sets up access details for s3cmd and s3fs (use your own credentials).

ubuntu@ip-10-254-60-210:~$ sudo more /etc/rc.local

#!/bin/sh 
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.
USER=ubuntu
THOUSANDG=/mnt/1000g
FILES3=/etc/passwd-s3fs     # Where s3fs access info will live
S3ERR=/tmp/s3fs.err
#   These are needed for s3fs access
AWSACCESSKEYID=AKIAxxxxxxZ3YCZF2Q
AWSSECRETACCESSKEY=ft1eJa3WxxxxxxxNlbA08x/G8iMqkMIkJjFCIGf


#    Check that we have swap set up
a=`swapon -s | grep -v File`
if [ "$a" = "" ]; then
  echo "#######################################################"
  echo "#   You have no SWAP file set up"
  echo ""
  echo '#  swap=/mnt/swapfile'
  echo '#  sudo dd if=/dev/zero of=$swap bs=1073741824 count=20'
  echo '#  sudo chown root:root $swap'
  echo '#  sudo mkswap $swap'
  echo '#  sudo chmod 0600 $swap'
  echo '#  sudo swapon $swap'
  echo ""
  echo "#  If need be, add to /etc/fstab"
  echo '#  echo "$swap  none swap sw  0  0" >> /etc/fstab'
  echo "#######################################################"
fi

#    Set up for GotCloud
gc=/gotcloud.mnt
if [ ! -r $gc/release_version.txt ]; then
  mkdir -p $gc
  mount /dev/xvdg $gc
  if [ -d $gc/gotcloud.ref ]; then
    echo "#######################################################"
    echo "#   GotCloud is set up on $gc"
    echo "#######################################################"
  fi
fi

#    Set up access to S3 storage as normal filesystem 
echo "${AWSACCESSKEYID}:$AWSSECRETACCESSKEY" > $FILES3
chown root:root $FILES3
chmod 640 $FILES3

usermod -aG fuse $USER

#    Setup 1000genomes
mkdir -p $THOUSANDG
if [ ! -r $THOUSANDG/release ]; then
  chown $USER:$USER $THOUSANDG
  /usr/local/bin/s3fs -o allow_other 1000genomes $THOUSANDG > $S3ERR 2>&1
  if [ ! -r $THOUSANDG/alignment.index ]; then
    echo "#######################################################"
    echo "#   1000genomes is not set up on $THOUSANDG"
    echo "#   See S3FS errors in $S3ERR"
    echo "#######################################################"
  fi
  df -h
fi
exit 0

Test the new AMI

Launch a new AMI instance and check that files are in the correct places. In the EC2 Management Console do:

 EC2 DashBoard -> AMIs -> Select CSG instance -> Launch Instance
 Launch Instances  (take defaults)
 Advanced Instance Options  (take defaults)
 Storage Device Configuration -> Edit
 Change volume to 30G or larger -> Continue     # Defaults are OK
 Instance Details
   Key Name = test of instance
 Create Key/Pair if you need to, most likely you can use one you have created
 Choose a Security Group -> sg-a098e9c8 - quick-start-1
 Review -> Launch