Continuous Delivery in the Cloud: Deployment Automation (Part 5 of 6)

In part 1 of this series, I introduced the Continuous Delivery (CD) pipeline for the Manatee Tracking application. In part 2, I went over how we use this CD pipeline to deliver software from check-in to production. In part 3, we focused on how CloudFormation is used to script the virtual AWS components that make up the Manatee infrastructure. Then in part 4, we focused on a “property file less” environment by dynamically setting and retrieving properties. The topics for each article are summarized below:
Part 1: Introduction – Introduction to continuous delivery in the cloud and the rest of the articles;
Part 2: CD Pipeline – In-depth look at the CD Pipeline;
Part 3: CloudFormation – Scripted virtual resource provisioning;
Part 4: Dynamic Configuration – “Property file less” infrastructure;
Part 5: Deployment Automation – What you’re reading now;
Part 6: Infrastructure Automation – Scripted environment provisioning.
In this part of the series, I am going to show how we use Capistrano to script our deployments to target environments.
What is Capistrano?
Capistrano is an open source Ruby tool for deploying web applications to one or more servers. A deployment can include steps such as placing a WAR file on a target server, applying database changes, and starting services.
A Capistrano script has several major parts:

Namespaces: Namespaces in Capistrano differentiate tasks from other tasks with the same name. This matters if you create a library out of your Capistrano deployment configuration, because you will want to make sure your task names are unique. For instance, a typical name for a task is setup. You need to make sure that your setup task does not interfere with another user’s custom setup task. By using namespaces, you avoid this conflict.
Tasks: Tasks perform specific operations. An example is a setup task, which generally prepares the server so that subsequent steps execute successfully, for instance by deleting the current version of the application.
Variables: Variables in Capistrano are named with Ruby symbols (for example, :deploy_to). They are set early in the script and referenced later on.
Order of execution: Capistrano allows you to define the order in which tasks run. You do this with Capistrano’s built-in after hook, which declares that one task should run once another has finished.
Templates: Templates are files with embedded Ruby snippets. They are used for dynamically building configuration files.
Roles: Roles define what part each server in your infrastructure plays in the deployment. Typical roles consist of db, web and app. Roles are then referenced inside your tasks to determine which server the task is run against.
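To make these parts concrete, here is a minimal sketch of how they might fit together in a Capistrano 2 config/deploy.rb. It is a configuration fragment meant to be loaded by cap, not a standalone Ruby program, and the names in it (the example namespace, the greeting variable, the host name) are placeholders invented for illustration rather than taken from the Manatee script.

```ruby
# config/deploy.rb (Capistrano 2 syntax); loaded by `cap`, not run directly.

# Variable: set once, read later with fetch(:greeting) or by bare name.
set :greeting, "hello"            # placeholder value

# Role: which server plays which part in the deployment.
role :app, "app.example.com"      # placeholder host

# Namespace: keeps task names from colliding with other recipes.
namespace :example do
  # Task: one unit of work, limited here to servers in the :app role.
  task :setup, :roles => :app do
    run "echo #{greeting} from setup"
  end

  task :finish, :roles => :app do
    run "echo finished"
  end
end

# Order of execution: run example:finish once example:setup completes.
after "example:setup", "example:finish"
```

With a fragment like this in place, invoking the example:setup task would run both tasks in turn on the :app server.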

Since Capistrano is a Ruby-based tool, you can use Ruby methods and operations to extend it. In our deployment we use Ruby to return property values from SimpleDB – as discussed in part 4 of this series, Dynamic Configuration. This enables us to deploy dynamically to target servers.
How do you install Capistrano?
1. Capistrano is available as a RubyGem. Type gem install capistrano on your Linux machine (assuming you have Ruby and RubyGems installed).
2. Type capify . in your project directory. This creates a Capfile, the main file Capistrano needs, and a config/deploy.rb file, which is your actual Capistrano deployment script.
How do you run a Capistrano script?
You run Capistrano from the command line. From the same directory as your Capfile, type cap namespace:task, where namespace and task are the namespace and task defined in your deploy.rb script. This starts your Capistrano deployment.
Why do we use Capistrano?
We use Capistrano in order to have a fully scripted, versioned deployment. Every step in our application deployment is scripted and fully automated – which reduces errors when deploying. This gives us complete control over our deployment and the ability to deploy whenever we are ready.
Capistrano for Manatees
In the Manatee deployment, we use Capistrano to deploy our Manatee tracking application to our target environment. I am going to go through each part of the deploy.rb and explain its use and purpose. The deployment itself runs as a stage of the CD pipeline – discussed in part 2 of the series, CD Pipeline. I’ll first go through a high-level summary of the deployment and then dive into more detail in the next section.
1. Variables are set, which includes returning several properties from SimpleDB
2. Roles are set: db, web and app are all set to the ip_address variable configured dynamically in Step #1
3. The order of execution is set by chaining the tasks together with after hooks
4. Tasks are executed
5. If all tasks are executed successfully, Capistrano signals deployment success.
Now that we know at a high level what’s being done during the deployment, let’s take a deeper look inside the script. The actual script can be found here: deploy.rb

Variables

Command line set
stack – Passed into SimpleDB to return the dynamically set property values
ssh_key – Used by the ssh_options variable to SSH into the target environment
Dynamically set
domain – Used by the application variable
artifact_bucket – Used to build the artifact_url variable
ip_address – Used to define the IP address of the target environment to SSH into
dataSourceUsername – Returns a value that is written into the wildtracks-config.properties file
dataSourcePassword – Returns a value that is written into the wildtracks-config.properties file
dataStorageFtpUsername – Returns a value that is written into the wildtracks-config.properties file
dataStorageFtpPassword – Returns a value that is written into the wildtracks-config.properties file
Hardcoded
user – The user to SSH into the target box as
use_sudo – Define whether to prepend every command with sudo or not
deploy_to – Defines the deployment directory on the target environment
artifact – The artifact to deploy to the target server
artifact_url – The URL for downloading the artifact
ssh_options – Specialized SSH configuration
application – Used to set the domain that the application runs on
liquibase_jar – Location of the liquibase.jar on the deployment server
postgres_jar – Location of the postgres.jar on the deployment server
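The dynamically set variables rely on a Capistrano 2 behavior worth spelling out: when set is given a block, the block is not run until the variable is first read, and the result is then reused. The following is a toy model of that behavior in plain Ruby, not Capistrano’s implementation. The LazySettings class, the counter, and the sample values are all invented for illustration.

```ruby
# Toy model of block-valued variables: evaluate on first read, then cache.
class LazySettings
  def initialize
    @values = {}
  end

  # Store either a plain value or a block to be evaluated later.
  def set(name, value = nil, &block)
    @values[name] = block || value
  end

  # Return the value, calling a stored block once and keeping its result.
  def fetch(name)
    value = @values.fetch(name)
    value = @values[name] = value.call if value.respond_to?(:call)
    value
  end
end

lookups = 0
settings = LazySettings.new
settings.set(:user, "ec2-user")    # plain value (placeholder)
settings.set(:ip_address) do
  lookups += 1                     # stands in for a remote SimpleDB query
  "10.0.0.5"                       # made-up address
end

puts lookups                       # the block has not run yet
puts settings.fetch(:ip_address)   # first read runs the block
puts settings.fetch(:ip_address)   # second read reuses the stored result
puts lookups
```

The practical consequence is that a block-valued variable costs nothing until a task actually needs it, and the remote lookup happens at most once per run.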

Roles

Since the app server, web server, and database all coexist in the same environment, we set each of these roles to the same variable, ip_address.
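In Capistrano 2 syntax, that mapping might look like the fragment below. It is a sketch of the idea rather than a copy of the Manatee script, and like any deploy.rb fragment it is meant to be loaded by cap rather than run as standalone Ruby.

```ruby
# All three roles resolve to the same dynamically looked-up host.
role :web, ip_address
role :app, ip_address
role :db,  ip_address
```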

Namespaces

Deploy: We use deploy as our namespace. Since we aren’t distributing this set of deployment tasks, we don’t need a unique namespace. In fact, we could remove the namespace altogether, but we wanted to show it being used.
namespace :deploy do
  # the tasks described below are defined here
end

Execution

We define our execution order at the bottom of the script using after. This determines the order in which tasks run during the deployment.
after "deploy:setup", "deploy:wildtracks_config"
after "deploy:wildtracks_config", "deploy:httpd_conf"
after "deploy:httpd_conf", "deploy:deploy"
after "deploy:deploy", "deploy:liquibase"
after "deploy:deploy", "deploy:restart"
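Two hooks are attached to deploy:deploy, so it helps to trace the resulting sequence. The toy model below is plain Ruby, not Capistrano’s implementation. It assumes that hooks on the same task fire in the order they were declared, and that each hooked task fires its own hooks before control returns, which is my understanding of how Capistrano 2 behaves. AFTER_HOOKS and execution_order are names invented for this sketch.

```ruby
# Hooks from the deploy script, in declaration order: task => tasks run after it.
AFTER_HOOKS = {
  "deploy:setup"             => ["deploy:wildtracks_config"],
  "deploy:wildtracks_config" => ["deploy:httpd_conf"],
  "deploy:httpd_conf"        => ["deploy:deploy"],
  "deploy:deploy"            => ["deploy:liquibase", "deploy:restart"]
}.freeze

# Walk the hooks depth-first: run a task, then each of its after-hooks in turn.
def execution_order(task, hooks = AFTER_HOOKS)
  [task] + hooks.fetch(task, []).flat_map { |next_task| execution_order(next_task, hooks) }
end

puts execution_order("deploy:setup")
```

Starting from deploy:setup, the model yields setup, wildtracks_config, httpd_conf, deploy, liquibase, and restart, in that order.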

Tasks

Setup: The setup task is our initial task. It makes sure the ownership of our deployment directory is set to the tomcat user. It then stops httpd and tomcat to get ready for the deployment.
task :setup do
  run "sudo chown -R tomcat:tomcat #{deploy_to}"
  run "sudo service httpd stop"
  run "sudo service tomcat6 stop"
end

wildtracks_config: The wildtracks_config task is the second task to run. It dynamically creates the wildtracks-config.properties file using a template and the variables set previously in the script. It then places the wildtracks-config.properties file on the target environment.
task :wildtracks_config, :roles => :app do
  set :dataSourceUsername do
    item = sdb.domains["stacks"].items["wildtracks-config"]
    item.attributes['dataSourceUsername'].values[0].to_s.chomp
  end
  set :dataSourcePassword do
    item = sdb.domains["stacks"].items["wildtracks-config"]
    item.attributes['dataSourcePassword'].values[0].to_s.chomp
  end
  set :dataStorageFtpUsername do
    item = sdb.domains["stacks"].items["wildtracks-config"]
    item.attributes['dataStorageFtpUsername'].values[0].to_s.chomp
  end
  set :dataStorageFtpPassword do
    item = sdb.domains["stacks"].items["wildtracks-config"]
    item.attributes['dataStorageFtpPassword'].values[0].to_s.chomp
  end

  set :dataSourceUrl, "jdbc:postgresql://localhost:5432/manatees_wildtrack"
  set :dataStorageWorkDir, "/var/tmp/manatees_wildtracks_workdir"
  set :dataStorageFtpUrl, "ftp.wildtracks.org"
  set :databaseBackupScriptFile, "/usr/share/tomcat6/.sarvatix/manatees/wildtracks/database_backups/script/db_backup.sh"

  config_content = from_template("config/templates/wildtracks-config.properties.erb")
  put config_content, "/home/ec2-user/wildtracks-config.properties"

  run "sudo mv /home/ec2-user/wildtracks-config.properties /usr/share/tomcat6/.sarvatix/manatees/wildtracks/wildtracks-config.properties"
  run "sudo chown -R tomcat:tomcat /usr/share/tomcat6/.sarvatix/manatees/wildtracks/wildtracks-config.properties"
  run "sudo chmod 777 /usr/share/tomcat6/.sarvatix/manatees/wildtracks/wildtracks-config.properties"
end
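The from_template helper is not shown in the task above. A common approach is to read an .erb file and render it with Ruby’s standard ERB library. The sketch below performs just the rendering step, using an inline template so that it runs on its own. The render_template helper, the property keys in the template, and the sample user name are all invented for illustration; the real template lives in config/templates/wildtracks-config.properties.erb.

```ruby
require "erb"

# Made-up stand-in for the .erb file; the real property keys may differ.
TEMPLATE = <<~PROPERTIES
  dataSource.username=<%= dataSourceUsername %>
  dataSource.url=<%= dataSourceUrl %>
PROPERTIES

# Render ERB source, exposing each entry of `vars` as a local variable.
# ERB#result_with_hash requires Ruby 2.5 or later.
def render_template(source, vars)
  ERB.new(source).result_with_hash(vars)
end

config_content = render_template(
  TEMPLATE,
  dataSourceUsername: "manatee_user",   # placeholder value
  dataSourceUrl: "jdbc:postgresql://localhost:5432/manatees_wildtrack"
)
puts config_content
```

Inside a Capistrano task, the rendered string would then be handed to put, as the task above does with config_content.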

httpd_conf: The httpd_conf task is third in the sequence and performs a similar function to the wildtracks_config task, but with the httpd.conf configuration file.
task :httpd_conf, :roles => :app do
  config_content = from_template("config/templates/httpd.conf.erb")
  put config_content, "/home/ec2-user/httpd.conf"
  run "sudo mv /home/ec2-user/httpd.conf /etc/httpd/conf/httpd.conf"
end

Deploy: The deploy task is where the actual deployment of the application code is done. This task removes the current version of the application and downloads the latest.
task :deploy do
  run "cd #{deploy_to} && sudo rm -rf wildtracks* && sudo wget #{artifact_url}"
end

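Liquibase: The liquibase task applies our database changes. It copies the Liquibase migration files into the Jenkins workspace and then runs Liquibase’s update command against the PostgreSQL database using changelog.xml. Both commands are issued with Ruby’s system call, so they run locally on the deployment server rather than over SSH on the target environment.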
task :liquibase, :roles => :db do
  db_username = fetch(:dataSourceUsername)
  db_password = fetch(:dataSourcePassword)
  private_ip_address = fetch(:private_ip_address)

  set :liquibase_jar, "/usr/share/tomcat6/.grails/1.3.7/projects/Build/plugins/liquibase-1.9.3.6/lib/liquibase-1.9.3.jar"
  set :postgres_jar, "/usr/share/tomcat6/.ivy2/cache/postgresql/postgresql/jars/postgresql-8.4-701.jdbc3.jar"

  system("cp -rf /usr/share/tomcat6/.jenkins/workspace/DeployManateeApplication/grails-app/migrations/* /usr/share/tomcat6/.jenkins/workspace/DeployManateeApplication/")
  system("java -jar #{liquibase_jar} --classpath=#{postgres_jar} --changeLogFile=changelog.xml --username=#{db_username} --password=#{db_password} --url=jdbc:postgresql://#{private_ip_address}:5432/manatees_wildtrack update")
end
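Because the Liquibase command is assembled by string interpolation, a password containing spaces or shell metacharacters would break it. One way to harden this, sketched below, is to build the argument list first and escape it with Ruby’s standard Shellwords module. The liquibase_command helper and the sample values are invented for this sketch; the flags are the ones already used in the task above.

```ruby
require "shellwords"

# Build the Liquibase update command line with every argument shell-escaped.
def liquibase_command(liquibase_jar:, postgres_jar:, username:, password:, host:)
  args = [
    "java", "-jar", liquibase_jar,
    "--classpath=#{postgres_jar}",
    "--changeLogFile=changelog.xml",
    "--username=#{username}",
    "--password=#{password}",
    "--url=jdbc:postgresql://#{host}:5432/manatees_wildtrack",
    "update"
  ]
  Shellwords.join(args)
end

# Placeholder paths and credentials, chosen to include an awkward password.
puts liquibase_command(
  liquibase_jar: "liquibase.jar",
  postgres_jar:  "postgresql.jar",
  username:      "manatee_user",
  password:      "p@ss word",
  host:          "10.0.0.5"
)
```

The resulting string can be passed to system as before. Alternatively, passing the arguments to system as separate parameters avoids the shell entirely, which removes the need for escaping.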

Restart: Lastly, the restart task brings the httpd and tomcat services back up.
task :restart, :roles => :app do
  run "sudo service httpd restart"
  run "sudo service tomcat6 restart"
end

Now that we’ve gone through the deployment, we need to test it. For testing our deployments, we use Cucumber. Cucumber enables us to do acceptance testing on our deployment. We verify that the application is up and available, the correct services are started and the property files are stored in the right locations.
Capistrano allows us to completely script and version our deployments, so they can be run at any time. Capistrano’s automation, combined with Cucumber’s acceptance testing, gives us a high level of confidence that when a deployment runs, the application will be deployed successfully.
In the next and last part of our series – Infrastructure Automation – we’ll go through scripting environments using an industry standard infrastructure automation tool, Puppet.