Line 4: |
Line 4: |
| | | |
| The following are notes taken when creating the Amazon Machine Image (AMI) used for the CSG pipeline process. | | The following are notes taken when creating the Amazon Machine Image (AMI) used for the CSG pipeline process. |
| + | |
| These notes assume you have already created an EC2 account and have the certificates and keys set up properly. | | These notes assume you have already created an EC2 account and have the certificates and keys set up properly. |
| | | |
− | '''Launch an instance'''
| + | |
| + | == Create new GotCloud AMI from StarCluster AMI == |
| + | === Launch an instance === |
| | | |
| <code> | | <code> |
Line 13: |
Line 16: |
| | | |
| Pay attention to the region you are using; at least for now, it seems any StarCluster activity must be in '''us-east-1'''. | | Pay attention to the region you are using; at least for now, it seems any StarCluster activity must be in '''us-east-1'''. |
− | Launch a new instance which we will use to set up the software and ultimately save it as an AMI. | + | |
| + | Launch a new instance starting from a StarCluster AMI. We will set up the software on this instance and ultimately save it as an AMI. |
| + | |
| + | # <code>EC2 DashBoard -> Launch Instance</code> |
| + | # Select: <code>Community AMIs</code> |
| + | ## Enter in the search box: <code>starcluster-base-ubuntu</code> |
| + | ## Select: <code>starcluster-base-ubuntu-12.04-x86_64 - ami-765b3e1f</code> |
| + | # Select the Instance Type: <code>Compute optimized c3.2xlarge</code> |
| + | #* You can use a smaller/cheaper machine. I originally used t1.micro, but I found things go much faster with a larger machine. |
| + | # Click: <code>Review and Launch</code> |
| + | ## Select: <code>Make General Purpose (SSD) the boot volume for this instance.</code> |
| + | ## Select: <code>Next</code> |
| + | # Scroll down to the <code>Storage</code> section |
| + | # Click: <code>Edit storage</code> |
| + | ## Update the Size: <code>30</code> |
| + | ##* We use 30G to fit the GotCloud code and reference files. Make it larger if you want additional space. |
| + | ## Click: <code>Review and Launch</code> |
| + | # Click: <code>Launch</code> |
| + | # Select the key/pair you want to use & Launch |
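The console steps above could also be scripted with the AWS CLI. The following is only a sketch under assumptions: it assumes the AWS CLI is installed and configured, and the key pair name <code>my-keypair</code> and the root device name <code>/dev/sda1</code> are hypothetical placeholders that do not come from these notes. The function prints the command for review instead of running it.

```shell
# Print (do not run) an AWS CLI command roughly equivalent to the console steps.
# Assumptions: the key pair name and the root device name are placeholders.
launch_cmd() {
    key_name=$1        # hypothetical key pair name
    size_gb=${2:-30}   # root volume size in GB (30 as used in these notes)
    printf '%s\n' "aws ec2 run-instances --region us-east-1 --image-id ami-765b3e1f --instance-type c3.2xlarge --key-name $key_name --block-device-mappings 'DeviceName=/dev/sda1,Ebs={VolumeSize=$size_gb,VolumeType=gp2}'"
}

launch_cmd my-keypair
```

Review the printed command and adjust the placeholders before running it yourself.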
| + | |
| + | === Setup the instance with GotCloud === |
| + | This assumes you have already logged onto the instance. |
| + | |
| + | # Get the latest version of GotCloud: |
| + | #* There are multiple ways to do this; one way is: |
| + | #*# <code>sudo git clone https://github.com/statgen/gotcloud.git</code> |
| + | # Install cmake (required to build premo) |
| + | #*<code>sudo apt-get update</code> |
| + | #*<code>sudo apt-get upgrade</code> (takes a while, may be able to skip this step) |
| + | #*<code>sudo apt-get install cmake</code> |
| + | ## Build the source (if you obtained the source code). |
| + | ### <code>cd gotcloud/src</code> |
| + | ### <code>sudo make</code> |
| + | ###* Specify <code>-j #</code> based on the number of CPUs your instance has, if more than 1 |
| + | ### <code>cd</code> |
| + | # Get the reference files |
| + | ## <code>wget ftp://anonymous@share.sph.umich.edu/gotcloud/ref/h37-db135-v3.tgz</code> |
| + | # Untar: <code>tar xvf h37-db135-v3.tgz</code> |
| + | # Move reference to gotcloud directory: <code>sudo mv gotcloud.ref gotcloud</code> |
| + | # Remove tar file: <code>rm h37-db135-v3.tgz</code> |
| + | # Set the paths, by updating .profile: <code>vi .profile</code> |
| + | #* <code>i</code> |
| + | #: <pre>if [ -d "$HOME/gotcloud" ] ; then |
| + |     PATH="$HOME/gotcloud:$PATH" |
| + | fi |
| + | if [ -d "$HOME/gotcloud/bin" ] ; then |
| + |     PATH="$HOME/gotcloud/bin:$PATH" |
| + | fi |
| + | if [ -d "$HOME/gotcloud/scripts" ] ; then |
| + |     PATH="$HOME/gotcloud/scripts:$PATH" |
| + | fi</pre> |
| + | #* <code>ESC</code> |
| + | #* <code>:wq</code> (write the file and quit) |
| + | |
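As an alternative to repeating the same <code>if</code> block three times in <code>.profile</code>, a small helper could do the check once. This is a sketch, not what the AMI actually uses; the function name <code>path_prepend</code> is made up for illustration. It also skips directories that are already on the PATH, so sourcing the file twice does not add duplicates.

```shell
# Prepend a directory to PATH only if it exists and is not already listed.
path_prepend() {
    dir=$1
    [ -d "$dir" ] || return 0        # skip directories that do not exist
    case ":$PATH:" in
        *":$dir:"*) ;;               # already present, do nothing
        *) PATH="$dir:$PATH" ;;
    esac
}

# Same three directories as in the .profile snippet above.
for d in "$HOME/gotcloud" "$HOME/gotcloud/bin" "$HOME/gotcloud/scripts"; do
    path_prepend "$d"
done
export PATH
```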
| + | === Set Up Swap Space === |
| + | |
| + | Issue the command '''swapon -s''' to see if there is swap space. |
| + | If there is only a header line, you need to add a swap file like this: |
| | | |
| <code> | | <code> |
− | EC2 DashBoard -> Launch Instance | + | df -h # Be sure there's enough space, decide on swap size |
− | Class Wizard | + | # Create a file /swap to use (assuming / is large enough) |
− | Ubuntu Server 12.04.1 LTS 64 bit | + | sudo bash # Run these commands as root |
− | Instance type -> Micro, EC2, no preference # Memory size does not matter | + | swap=/swap |
− | Advanced Instance Options (take defaults) | + | dd if=/dev/zero of=$swap bs=524288 count=16384 # 8GB swap on t1.micro; for 15GB use bs=1073741824 count=15 |
− | Storage Device Configuration -> Edit | + | chown root:root $swap |
− | Change volume to 30G -> Save -> Continue # Storage size does not matter | + | mkswap $swap |
− | Key Name = GotCloud 1.06a | + | chmod 0600 $swap |
− | Create Key/Pair if you need to, Name the PEM and save the pem file for access by ssh | + | swapon $swap |
− | Choose a Security Group (take default)
| + | echo "$swap none swap sw 0 0" >> /etc/fstab |
− | Launch | + | |
− | No need to Create Status Check Alarms
| + | swapon -s # Should show the swap device |
− | No need to Create EBS Volumes
| |
| </code> | | </code> |
| | | |
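The size of the swap file written by <code>dd</code> is the block size times the block count. The helper names below are made up for illustration; the sketch only does the arithmetic for the two <code>bs</code>/<code>count</code> combinations mentioned above and does not write any file.

```shell
# Bytes written by dd = block size * block count.
dd_total_bytes() {
    echo $(( $1 * $2 ))
}

# Convert bytes to whole GiB (1 GiB = 1073741824 bytes).
bytes_to_gib() {
    echo $(( $1 / 1073741824 ))
}

bytes_to_gib "$(dd_total_bytes 524288 16384)"     # bs=524288 count=16384
bytes_to_gib "$(dd_total_bytes 1073741824 15)"    # bs=1073741824 count=15
```

To choose a different swap size, keep <code>bs</code> fixed and scale <code>count</code> accordingly.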
| + | === Cleanup the instance for creating an AMI === |
| + | # Go to : [[#Cleanup Instance for AMI Creation|Cleanup Instance for AMI Creation]] |
| + | |
| + | === Create the AMI === |
| + | # Go to : [[#Create the AMI|Create the AMI]] |
| | | |
− | '''Install Software'''
| |
| | | |
− | There are a number of additional Debian packages that you may well need, so we make | + | == Update the GotCloud AMI == |
| + | # Start an instance of the current GotCloud AMI |
| + | #* Use an instance with several CPUs so you can parallelize the "make" call. |
| + | # Login as ubuntu |
| + | # <code>cd gotcloud</code> |
| + | # <code>sudo git pull</code> |
| + | # <code>cd src</code> |
| + | # <code>sudo make</code> |
| + | #* Specify <code>-j #</code> based on the number of CPUs your instance has |
| + | # <code>cd</code> |
| + | # Go to : [[#Create the AMI|Create the AMI]] |
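The <code>-j #</code> value can be derived from the instance instead of being typed by hand. A minimal sketch, assuming <code>nproc</code> (GNU coreutils) or <code>getconf</code> is available; the function name <code>make_jobs</code> is made up here. It prints the make command rather than running it.

```shell
# Pick a job count for "make -j": number of online CPUs, falling back to 1.
make_jobs() {
    n=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    case $n in
        ''|*[!0-9]*) n=1 ;;   # anything non-numeric falls back to 1
    esac
    echo "$n"
}

echo "sudo make -j $(make_jobs)"
```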
| + | |
| + | |
| + | ==Cleanup Instance for AMI Creation== |
| + | The first time, when starting from a generic/StarCluster AMI: |
| + | # Disable password-based logins for root |
| + | ## Open /etc/ssh/sshd_config |
| + | ## Change <code>PermitRootLogin yes</code> to <code>PermitRootLogin without-password</code> |
| + | # Disable root access |
| + | ## <code> sudo passwd -l root</code> |
| + | |
| + | |
| + | Each time we generate a new AMI, run: |
| + | <pre>sudo shred -u /etc/ssh/*_key /etc/ssh/*_key.pub |
| + | sudo find / -name "authorized_keys" -exec rm -f {} \; |
| + | rm -rf ~/.ssh |
| + | shred -u ~/.*history |
| + | sudo find /root/.*history /home/*/.*history -exec rm -f {} \; |
| + | history -w |
| + | history -c |
| + | </pre> |
| + | These commands do the following: |
| + | # Remove SSH host key pairs |
| + | # Remove SSH authorized keys |
| + | # Remove the current user's ~/.ssh directory |
| + | # Delete shell history |
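Before creating the image, it can be worth checking that nothing was missed. The following sketch lists any files the cleanup commands should have removed; the function name <code>leftover_secrets</code> is made up for illustration, and the file name patterns are taken from the commands above.

```shell
# List files under a root directory that the cleanup should have removed.
# Prints one path per line; prints nothing when the tree is clean.
leftover_secrets() {
    root=$1
    find "$root" \( -name 'authorized_keys' -o -name '*_key' \
        -o -name '*_key.pub' -o -name '.*history' \) -type f 2>/dev/null
}
```

For example, <code>leftover_secrets /etc/ssh</code> should print nothing after the cleanup. Run it with <code>sudo</code> for directories that your user cannot read.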
| + | |
| + | == Create the AMI == |
| + | |
| + | Once your instance is all ready with everything you want, create the AMI. |
| + | |
| + | In your browser at the EC2 Management Console do the following: |
| + | # Select the running instance |
| + | # Right click, <code>Create Image</code> |
| + | # Enter name & Description |
| + | # Ensure volume size is correct |
| + | # Mark delete on terminate |
| + | #:This will take several minutes to complete. |
| + | #:In the EC2 Dashboard, you can monitor the progress. |
| + | #:When it is done, you'll see a new AMI under the list of AMIs. |
| + | # When completed, terminate your old instance |
| + | |
| + | |
| + | == Older/Additional Instructions == |
| + | === Install the Software === |
| + | |
| + | '''(1)''' There are a number of additional Debian packages that you may well need, so we make |
| sure they are all installed. | | sure they are all installed. |
| | | |
| <code> | | <code> |
| + | sudo apt-get update |
| + | sudo apt-get upgrade # Apply maintenance |
| + | |
| sudo apt-get install java-common default-jre make libssl0.9.8 | | sudo apt-get install java-common default-jre make libssl0.9.8 |
− | sudo apt-get install libnet-amazon-ec2-perl | + | sudo apt-get install libnet-amazon-ec2-perl s3cmd |
| sudo apt-get install make g++ libcurl4-openssl-dev libssl-dev libxml2-dev libfuse-dev | | sudo apt-get install make g++ libcurl4-openssl-dev libssl-dev libxml2-dev libfuse-dev |
| </code> | | </code> |
| | | |
− | '''Install the GotCloud Software''' | + | '''(2)''' '''S3fs''' allows one to access S3 storage as a conventional file system. |
| + | This can be quite handy, if it is set up properly. |
| + | Our recent experience is that the 1000 Genomes data has many files with incorrect permissions. |
| + | Still if you're lucky, your data will be useful. |
| + | Install the software like this: |
| | | |
− | Follow the instructions to install a Debian package [[Pipeline Debian Package|debian package]]
| + | <code> |
− | Run the tests to be sure everything is OK.
| + | mkdir -p ~/src |
| + | cd ~/src |
| + | wget http://s3fs.googlecode.com/files/s3fs-1.68.tar.gz |
| + | tar xzvf s3fs-1.68.tar.gz |
| + | cd s3fs* |
| + | ./configure |
| + | sudo make install |
| + | </code> |
| | | |
− | '''Create the AMI''' | + | '''(3)''' Configure s3cmd. This will ask for your AWS ID and Secret Key. It creates the file ~/.s3cfg |
− | | |
− | In your browser at the EC2 Management Console do the following:
| |
| | | |
| <code> | | <code> |
− | Create Image | + | s3cmd --configure |
− | Image Name csg-biopipe_instance | + | |
− | Image Description: Image for CSG Biopipe instance
| + | Enter new values or accept defaults in brackets with Enter. |
− | Volume Size: 30GB
| + | Refer to user manual for detailed description of all options. |
− | Take defaults otherwise
| + | |
| + | Access key and Secret key are your identifiers for Amazon S3 |
| + | Access Key: AKI1234QEUWZ3YCZF2Q |
| + | Secret Key: ft1eJa1234NE8iitNlbA08x/G8iMqkMI1234IGf |
| + | |
| + | Encryption password is used to protect your files from reading |
| + | by unauthorized persons while in transfer to S3 |
| + | Encryption password: password_you_do_not_need_to_know |
| + | Path to GPG program [/usr/bin/gpg]: |
| + | |
| + | When using secure HTTPS protocol all communication with Amazon S3 |
| + | servers is protected from 3rd party eavesdropping. This method is |
| + | slower than plain HTTP and can't be used if you're behind a proxy |
| + | Use HTTPS protocol [No]: |
| + | |
| + | On some networks all internet access must go through a HTTP proxy. |
| + | Try setting it here if you can't conect to S3 directly |
| + | HTTP Proxy server name: |
| + | |
| + | New settings: |
| + | Access Key: AKI1234QEUWZ3YCZF2Q |
| + | Secret Key: ft1eJa1234NE8iitNlbA08x/G8iMqkMI1234IGf |
| + | Encryption password: password_you_do_not_need_to_know |
| + | Path to GPG program: /usr/bin/gpg |
| + | Use HTTPS protocol: False |
| + | HTTP Proxy server name: |
| + | HTTP Proxy server port: 0 |
| + | |
| + | Test access with supplied credentials? [Y/n] |
| + | Please wait... |
| + | Success. Your access key and secret key worked fine :-) |
| + | |
| + | Now verifying that encryption works... |
| + | Success. Encryption and decryption worked fine :-) |
| + | |
| + | Save settings? [y/N] y |
| + | Configuration saved to '/home/ubuntu/.s3cfg' |
| </code> | | </code> |
| | | |
− | This will take several minutes to complete.
| + | '''(4)''' Follow the instructions to install the [[Pipeline Debian Package|'''GotCloud Debian packages''']] |
− | In the EC2 Dashboard, you can monitor the progress.
| + | Run the tests to be sure everything is OK. |
− | When it is done, you'll see a new AMI under the list of AMIs.
| + | |
| + | === Configure the Host to be Usable === |
| | | |
− | Your new AMI should look pretty much like this:
| + | It is useful to configure /etc/rc.local to do most things you need at boot time. |
| + | There are many other ways to do this, but here's one simple way - create the |
| + | file /etc/rc.local (as root). |
| + | The following example sets up access details for '''s3cmd''' and '''s3fs''' |
| + | (use your own credentials). |
| | | |
| <code> | | <code> |
− | AMI: Ubuntu Cloud Guest AMI ID ami-3d4ff254 (x86_64)
| + | ubuntu@ip-10-254-60-210:~$ sudo more /etc/rc.local |
− | Name: Ubuntu Server 12.04.1 LTS
| + | #!/bin/sh |
− | Description: Ubuntu Server 12.04.1 LTS with support available from Canonical (http://www.ubuntu.com/cloud/services).
| + | # |
− | Number of Instances: 1 | + | # rc.local |
− | Availability Zone: No Preference
| + | # |
− | Instance Type: Micro (t1.micro)
| + | # This script is executed at the end of each multiuser runlevel. |
− | Instance Class: On Demand Edit Instance Details
| + | # Make sure that the script will "exit 0" on success or any other |
− | EBS-Optimized: No
| + | # value on error. |
− | Monitoring: Disabled Termination Protection: Disabled
| + | # |
− | Tenancy: Default
| + | # In order to enable or disable this script just change the execution |
− | Kernel ID: Use Default Shutdown Behavior: Stop
| + | # bits. |
− | RAM Disk ID: Use Default
| + | # |
− | Network Interfaces:
| + | # By default this script does nothing. |
− | Secondary IP Addresses: | + | USER=ubuntu |
− | User Data: | + | THOUSANDG=/mnt/1000g |
− | IAM Role: Edit Advanced Details
| + | FILES3=/etc/passwd-s3fs # Where s3fs access info will live |
− | Key Pair Name: CSG Edit Key Pair
| + | S3ERR=/tmp/s3fs.err |
− | Security Group(s): sg-a098e9c8 Edit Firewall
| + | # These are needed for s3fs access |
| + | AWSACCESSKEYID=AKIAxxxxxxZ3YCZF2Q |
| + | AWSSECRETACCESSKEY=ft1eJa3WxxxxxxxNlbA08x/G8iMqkMIkJjFCIGf |
| + | |
| + | |
| + | # Check that we have swap set up |
| + | a=`swapon -s | grep -v File` |
| + | if [ "$a" = "" ]; then |
| + | echo "#######################################################" |
| + | echo "# You have no SWAP file set up" |
| + | echo "" |
| + | echo "# swap=/mnt/swapfile" |
| + | echo '# sudo dd if=/dev/zero of=$swap bs=1073741824 count=20' |
| + | echo '# sudo chown root:root $swap' |
| + | echo '# sudo mkswap $swap' |
| + | echo '# sudo chmod 0600 $swap' |
| + | echo '# sudo swapon $swap' |
| + | echo "" |
| + | echo "# If need be, add to /etc/fstab" |
| + | echo '# echo "$swap none swap sw 0 0" >> /etc/fstab' |
| + | echo "#######################################################" |
| + | fi |
| + | |
| + | # Set up for GotCloud |
| + | gc=/gotcloud.mnt |
| + | if [ ! -r $gc/release_version.txt ]; then |
| + | mkdir -p $gc |
| + | mount /dev/xvdg $gc |
| + | if [ -d $gc/gotcloud.ref ]; then |
| + | echo "#######################################################" |
| + | echo "# GotCloud is set up on $gc" |
| + | echo "#######################################################" |
| + | fi |
| + | fi |
| + | |
| + | # Set up access to S3 storage as normal filesystem |
| + | echo "${AWSACCESSKEYID}:$AWSSECRETACCESSKEY" > $FILES3 |
| + | chown root:root $FILES3 |
| + | chmod 640 $FILES3 |
| + | |
| + | usermod -aG fuse $USER |
| + | |
| + | # Setup 1000genomes |
| + | mkdir -p $THOUSANDG |
| + | if [ ! -r $THOUSANDG/release ]; then |
| + | chown $USER:$USER $THOUSANDG |
| + | /usr/local/bin/s3fs -o allow_other 1000genomes $THOUSANDG > $S3ERR 2>&1 |
| + | if [ ! -r $THOUSANDG/alignment.index ]; then |
| + | echo "#######################################################" |
| + | echo "# 1000genomes is not set up on $THOUSANDG" |
| + | echo "# See S3FS errors in $S3ERR" |
| + | echo "#######################################################" |
| + | fi |
| + | df -h |
| + | fi |
| + | exit 0 |
| </code> | | </code> |
| | | |
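The swap check in the script above depends on <code>swapon -s</code> printing only a header line when no swap is configured. That test can be isolated into a small function, sketched here; the name <code>has_swap</code> is made up for illustration. It reads the <code>swapon -s</code> output on standard input, so it can be tried with sample text.

```shell
# Succeeds when the "swapon -s" style text on stdin lists at least one
# swap device, i.e. has a non-blank line other than the "Filename" header.
has_swap() {
    grep -v '^Filename' | grep -q '[^[:space:]]'
}

if swapon -s 2>/dev/null | has_swap; then
    echo "swap is configured"
else
    echo "no swap configured"
fi
```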
− | '''Test the new AMI'''
| + | === Test the new AMI === |
| | | |
| Launch a new AMI instance and check that files are in the correct places. | | Launch a new AMI instance and check that files are in the correct places. |
Line 97: |
Line 316: |
| Advanced Instance Options (take defaults) | | Advanced Instance Options (take defaults) |
| Storage Device Configuration -> Edit | | Storage Device Configuration -> Edit |
− | Change volume to 30G or whatever -> Continue # Defaults are OK | + | Change volume to 30G or larger -> Continue # Defaults are OK |
| Instance Details | | Instance Details |
| Key Name = test of instance | | Key Name = test of instance |