Managing Multi-Node EC2 Deployments with SVN, Ant and bash

I’ve recently been doing quite a bit of work with Amazon Web Services. Over the past few weeks I have been developing a simple way to manage multi-node applications on EC2. Much of this is project-specific, but the methodology could easily be applied to a wide variety of deployments. The techniques described here were inspired by this blog post. This article is intended for people who are already somewhat familiar with Amazon Web Services; for a brief introduction to S3 and EC2, see my previous blog post. If you’re ready for it, read on.

One of the most frustrating things I encountered working with EC2 was the monotony of bundling and uploading the AMI each time I made changes to my server configuration. Originally, my plan was to store the application code as part of the AMI so that when the server fired up, it was ready to go with the latest version. I quickly realized that this was going to get out of hand, especially once we decided to deploy our beta application on EC2 and it required weekly updates. My next step was to store the application code on S3 and download it manually when needed. Then I read the CodeWord blog post and implemented the following init script.

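#! /bin/sh
#
# chkconfig: 2345 90 30
# description: EC2 bootstrapping process
#
RETVAL=0
case "$1" in
    start)
        # Fetch the latest deployment, then start only if that succeeded.
        # The status is saved before the "if" so a failed reload is reported.
        "$0" reload
        RETVAL=$?
        if [ $RETVAL -eq 0 ]; then
            /opt/myapp/scripts/start.sh
            RETVAL=$?
        fi
        ;;
    stop)
        /opt/myapp/scripts/stop.sh
        RETVAL=$?
        ;;
    restart)
        "$0" stop
        "$0" start
        RETVAL=$?
        ;;
    reload)
        # Run a copy from /tmp: the original is overwritten during unpacking.
        cp /opt/myapp/scripts/download.sh /tmp/download.sh
        RETVAL=$?
        if [ $RETVAL -eq 0 ]; then
            /tmp/download.sh
            RETVAL=$?
        fi
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|reload}"
        exit 1
        ;;
esac
exit $RETVAL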

This init script was added to the appropriate runlevels with chkconfig and then bundled into the AMI, along with the environment setup (Java, Tomcat, Apache, whatever the case may be). The only other piece of code needed at bundle time is the download.sh script, which tells the instance where to get the deployment code and where to unpack it. The init script copies download.sh to a temp directory before running it because the tarball is unpacked into /opt/myapp, and I don’t want to overwrite the script while it is running. The download script is built on s3cmd and looks something like this:

#!/bin/bash
# Fetch the deployment tarball for this server type from S3 and unpack it.
deploy="web"
deploydir="/opt/myapp"
ext="tar.gz"
s3file="s3://myapp-deployments/$deploy.$ext"
ts=$(date +%s)
tmpfile="/tmp/$ts.$ext"

# Exit non-zero on any failure so the init script does not start stale code.
/usr/bin/s3cmd -c /root/.s3cfg get "$s3file" "$tmpfile" || exit 1
tar -xzpf "$tmpfile" -C "$deploydir/" --overwrite || exit 1
rm "$tmpfile"
chmod 744 "$deploydir"/scripts/*
echo "downloaded $s3file as $ts.$ext and extracted to $deploydir"

This allows me to pack up a quiver of deployment tarballs, one for each of my server types, and stick them in a private bucket on S3. (I could even improve this by using EC2’s user-data to specify a server type so that I can use one multi-purpose image. In my case, the instances are specific enough that it makes sense to maintain separate image types, with the deploy variable hard-coded.) Then I can use this download script to grab the latest code, unpack it into the myapp directory, and run the freshly downloaded start.sh script. That script lets me do things like start various services, create cron tasks, obtain a Dynamic DNS hostname, or whatever else I need to accomplish at startup time. The original author checked out his packages from Subversion. I opted to go the S3 route so that I didn’t have to open up our internal SVN repository to the public, and I found that it was much quicker to grab the code from S3 than from anywhere else, meaning it was ready to go as fast as possible.
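A minimal sketch of the user-data idea follows, assuming the instance is launched with user-data that contains a line such as deploy=web. The function names and the list of allowed server types (web, worker, db) are hypothetical and only for illustration; the URL is the standard EC2 instance metadata endpoint for user-data.

```shell
#!/bin/bash
# Sketch: choose the deployment tarball from EC2 user-data instead of
# hard-coding it. Function names and allowed types are hypothetical.

# Standard EC2 instance metadata endpoint for user-data.
USER_DATA_URL="http://169.254.169.254/latest/user-data"

# Print the raw user-data; fails if the endpoint is unreachable.
fetch_user_data() {
    curl --silent --fail --max-time 5 "$USER_DATA_URL"
}

# Extract the value of the first "deploy=<type>" line from the text in $1.
# Prints the type and returns 0 only for a known server type.
parse_deploy_type() {
    local value
    value=$(printf '%s\n' "$1" | sed -n 's/^deploy=//p' | head -n 1)
    case "$value" in
        web|worker|db) echo "$value" ;;
        *) return 1 ;;
    esac
}

# Build the S3 location for a server type, mirroring download.sh.
s3file_for() {
    echo "s3://myapp-deployments/$1.tar.gz"
}
```

In download.sh, the hard-coded deploy="web" line could then become something like deploy=$(parse_deploy_type "$(fetch_user_data)") || exit 1, so that an instance with missing or unrecognized user-data refuses to deploy rather than guessing.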

The tarballs described above contain the necessary compiled Java classes and JAR files, any worthwhile bash scripts, cron task definitions, even application config files. To manage the creation of this install directory, I’m using an Ant target to throw together all the necessary files. This is certainly nothing revolutionary; an excerpt from my build.xml file is shown below.

<target name="package-web" depends="package">
    <!-- build scripts directory -->
    <echo message="Adding service scripts..."/>
    <copy todir="${basedir}/dist/web/scripts">
        <fileset dir="${basedir}/scripts/service/web">
            <include name="*"/>
        </fileset>
    </copy>
    <!-- copy necessary JAR files -->
    <echo message="Adding JAR files..."/>
    <copy todir="${basedir}/dist/web/lib">
        <fileset dir="${basedir}/dist/java"/>
    </copy>
    <!-- copy all the cron tasks -->
    <echo message="Adding cron jobs..."/>
    <copy todir="${basedir}/dist/web/cron">
        <fileset dir="${basedir}/setup/cron/web"/>
    </copy>
    <!-- copy the bashrc file -->
    <echo message="Copying bashrc file"/>
    <copy file="${basedir}/setup/bashrc/web" tofile="${basedir}/dist/web/bashrc"/>
    <!-- bundle everything into the deployment tarball -->
    <echo message="tar/gzipping deployment..."/>
    <tar basedir="${basedir}/dist/web" destfile="${basedir}/build/web.tar.gz" compression="gzip"/>
</target>
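For illustration, here is a minimal sketch of how a start.sh might consume the layout this target produces (scripts/, lib/, cron/ and bashrc, unpacked under /opt/myapp). The real start.sh is project-specific and not shown here, so everything below is an assumption: the function names, the /etc/cron.d and /root/.bashrc destinations, and the tomcat service name are all hypothetical.

```shell
#!/bin/bash
# Sketch of a start.sh that consumes the unpacked tarball layout.
# Destination paths and the tomcat service name are assumptions.

deploydir="${DEPLOYDIR:-/opt/myapp}"

# Copy every regular file in $1 (cron definitions) into directory $2.
install_cron_jobs() {
    local src="$1" dest="$2" f
    [ -d "$src" ] || return 1
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        cp "$f" "$dest/" || return 1
        chmod 644 "$dest/$(basename "$f")"
    done
}

# Install the packaged bashrc ($1) as the shell config file ($2).
install_bashrc() {
    cp "$1" "$2"
}

# Full startup sequence: cron jobs, shell config, then the service.
main() {
    install_cron_jobs "$deploydir/cron" /etc/cron.d || return 1
    install_bashrc "$deploydir/bashrc" /root/.bashrc || return 1
    /sbin/service tomcat start
}

# Act only when explicitly asked to, so the helpers can be tried in isolation.
if [ "${1:-}" = "run" ]; then
    main
fi
```

Keeping each step in its own small function makes it easy to try a step against a scratch directory before letting it touch /etc on a live instance.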

I’m sure that if you’ve made it this far you can guess how Subversion fits into the picture. All of the application code, scripts, config files, etc. are stored in the SVN repository. That means not only do I have a complete revision history for all of my code, but I also have that protection for my configuration files and my “glue” code, a handful of Perl and bash scripts. One thing that I do, for posterity’s sake, is create a tag of my entire repository whenever I deploy a new version to S3. The tag name contains the revision number and deployment date so that I can see all the previously released versions at a glance and easily roll back to a stable version should we (god forbid) release some buggy code.
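The release step described above might be scripted along these lines. This is a sketch under assumptions rather than a definitive implementation: the repository URL, the release-r<revision>-<date> tag format, and the function names are hypothetical. The svn copy and s3cmd put commands themselves are the standard ways to create a tag and upload a file.

```shell
#!/bin/bash
# Sketch: upload a freshly built tarball to S3 and tag the release in SVN.
# REPO_URL, the bucket layout and the tag format are illustrative assumptions.

REPO_URL="${REPO_URL:-https://svn.example.com/myapp}"
BUCKET="${BUCKET:-s3://myapp-deployments}"

# Build a tag name from a revision number ($1) and a date ($2, YYYY-MM-DD).
release_tag_name() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject anything that is not a plain number
    esac
    echo "release-r$1-$2"
}

# Upload build/<type>.tar.gz for server type $1.
upload_tarball() {
    s3cmd -c "$HOME/.s3cfg" put "build/$1.tar.gz" "$BUCKET/$1.tar.gz"
}

# Create the tag as a server-side copy of trunk at revision $1.
tag_release() {
    local rev="$1" tag
    tag=$(release_tag_name "$rev" "$(date +%Y-%m-%d)") || return 1
    svn copy -r "$rev" "$REPO_URL/trunk" "$REPO_URL/tags/$tag" \
        -m "Tag revision $rev as deployed to S3"
}
```

A release would then be something like ant package-web && upload_tarball web && tag_release <revision>. Rolling back amounts to checking out the relevant tag, rebuilding the tarball, and uploading it again.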

So, this is a glimpse at how I’ve been able to stay organized while using EC2 as the server infrastructure for a distributed web application. It’s amazing how useful EC2 is, and I really think it will change the web hosting market drastically. I hope to share a bit more of my architecture with you, including some monitoring tools that I’ve developed and a few other fun things, but that will have to wait until next time.
