03-02-2014

Creating a Secure Deployment Pipeline in Amazon Web Services

Many organizations require a secure infrastructure. I’ve yet to meet a customer that says that security isn’t a concern. But, the decision on “how secure?” should be closely associated with a risk analysis for your organization.

Since Amazon Web Services (AWS) is often referred to as a “public cloud”, people sometimes infer that “public” must mean it’s “out in the public” for all to see. I’ve always seen “public/private clouds” as an unfortunate use of terms. In this context, public means more like “public utility”. People often interpret “private clouds” to be inherently more secure. Assuming that “public cloud” = less secure and “private cloud” = more secure couldn’t be further from the truth. Like most things, it’s all about how you architect your infrastructure. While you can define your infrastructure to have open access, AWS provides many tools to create a truly secure infrastructure while restricting access to authorized users only.

I’ve created an initial list of many of the practices we use. We don’t employ all these practices in all situations, as it often depends on our customers’ particular security requirements. But, if someone asked me “How do I create a secure AWS infrastructure using a Deployment Pipeline?”, I’d offer some of these practices in the solution. I’ll be expanding these over the next few weeks, but I want to start with some of our practices.

AWS Security

* After initial AWS account creation and login, configure IAM so that there’s no need to use the AWS root account
* Apply least privilege to all IAM accounts. Be very careful about who gets Administrator access.
* Enable all IAM password rules
* Enable MFA for all users
* Secure all data at rest
* Secure all data in transit
* Put all AWS resources in a Virtual Private Cloud (VPC).
* No EC2 Key Pairs should be shared with others. Same goes for Access Keys.
* Only open required ports to the Internet. For example, with the exception of, say, port 80, no security groups should have a CIDR source of 0.0.0.0/0. The bastion host might have access to port 22 (SSH), but you should use CIDR rules to limit access to specific subnets. Using a VPC is part of a solution to eliminate Internet access. No canonical environments should have SSH/RDP access. (A minimal sketch of restricting security group ingress appears after this list.)
* Use IAM to limit access to specific AWS resources and/or remove/limit AWS console access
* Apply a bastion host configuration to reduce your attack profile
* Use IAM Roles so that there’s no need to configure Access Keys on the instances
* Use resource-level permissions in EC2 and RDS
* Use SSE to secure objects in S3 buckets
* Share initial IAM credentials with others through a secure mechanism (e.g. AES-256 encryption)
* Use and monitor AWS CloudTrail logs
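
As a concrete illustration of the security group guidance above, here is a minimal sketch using the AWS SDK for Ruby (v1). The group name, VPC ID and CIDR blocks are illustrative assumptions, not values from an actual environment.

require 'rubygems'
require 'aws-sdk'  # AWS SDK for Ruby v1

ec2 = AWS::EC2.new

# Create a security group inside an existing VPC (the VPC ID is a placeholder)
group = ec2.security_groups.create('web-tier',
  :description => 'Web tier with restricted ingress',
  :vpc         => 'vpc-12345678')

# Port 80 may be open to the Internet...
group.authorize_ingress(:tcp, 80, '0.0.0.0/0')

# ...but SSH is limited to a specific internal subnet (e.g. the bastion host's)
group.authorize_ingress(:tcp, 22, '10.0.1.0/24')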

Deployment Pipeline

A deployment pipeline is a staged process in which the complete software system is built and tested with every change. Team members receive feedback as it completes each stage. With most customers, we usually construct between 4-7 deployment pipeline stages and the pipeline only goes to the next stage if the previous stages were successful. If a stage fails, the whole pipeline instance fails. The first stage (often referred to as the “Commit Stage”) will usually take no more than 10 minutes to complete. Other stages may take longer than this. Most stages require no human intervention as the software system goes through more extensive testing on its way to production. With a deployment pipeline, software systems can be released at any time the business chooses to do so. Here are some of the security-based practices we employ in constructing a deployment pipeline.

* Automate everything: Networking (VPC, Route 53), Compute (EC2), Storage, etc. All AWS automation should be defined in CloudFormation. All environment configuration should be defined using infrastructure automation scripts – such as Chef, Puppet, etc.
* Version Everything: Application Code, Configuration, Infrastructure and Data
* Manage your binary dependencies. Be specific about binary version numbers. Ensure you have control over these binaries.
* Lock down pipeline environments. Do not allow SSH/RDP access to any environment in the deployment pipeline
* For projects that require it, use permissions on the CI server or deployment application to limit who can run deployments in certain environments – such as QA, Pre-Production and Production. When you have a policy in which all changes are applied through automation and environments are locked down, this usually becomes less of a concern. But, it can still be a requirement for some teams.
* Use the Disposable Environments pattern – instances are terminated once every few days. This approach reduces the attack profile
* Log everything outside of the EC2 instances (so that logs can be accessed later). Ensure these log files are encrypted (e.g. stored securely in S3 with server-side encryption)
* All canonical changes are only applied through automation that is part of the deployment pipeline. This includes application, configuration, infrastructure and data changes. Infrastructure patch management would be a part of the pipeline just like any other software system change.
* No one has access to nor can make direct changes to pipeline environments
* Create high-availability systems using Multi-AZ, Auto Scaling, Elastic Load Balancing and Route 53
* For non-Admin AWS users, only provide access to AWS through a secure Continuous Integration (CI) server or a self-service application
* Use Self-Service Deployments and give developers full SSH/RDP access to their self-service deployment. Only their particular EC2 Key Pair can access the instance(s) associated with the deployment. Self-Service Deployments can be defined in the CI server or a lightweight self-service application.
* Provide capability for any authorized user to perform a self-service deployment with full SSH/RDP access to the environment they created (while eliminating outside access)
* Run two active environments – We’ve yet to do this for customers, but if you want to eliminate all access to the canonical production environment, you might choose to run two active environments at once. Engineers can then access the non-production environment, which has the exact same configuration and data, so they can troubleshoot accurately.
* Run automated infrastructure tests to test for security vulnerabilities (e.g. cross-site scripting, SQL injection, etc.) with every change committed to the version-control repository as part of the deployment pipeline (a minimal sketch of one such check follows this list).
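
To make that last practice more concrete, below is a minimal Ruby sketch of the kind of automated check a pipeline stage might run against a freshly provisioned environment. The host name and the port expectations are illustrative assumptions, not our actual test suite.

require 'socket'
require 'timeout'

# Hypothetical target; in a pipeline this would come from the provisioning stage
host = ENV.fetch('TARGET_HOST', 'target.example.com')

def port_open?(host, port, seconds = 3)
  Timeout.timeout(seconds) { TCPSocket.new(host, port).close }
  true
rescue Timeout::Error, SystemCallError
  false
end

# Only port 80 should be reachable from outside; SSH should not be
raise 'Port 22 is open to this network segment' if port_open?(host, 22)
raise 'Port 80 is not reachable' unless port_open?(host, 80)
puts 'Port exposure checks passed'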

FAQ

* What is a canonical environment? It’s your system of record. You want your canonical environment to be solely defined in source code and versioned. If someone makes a change to the canonical system and it affects everyone it should only be done through automation. While you can use a self-service deployment to get a copy of the canonical system, any direct change you make to the environment is isolated and never made part of the canonical system unless code is committed to the version-control repository.
* How can I troubleshoot if I cannot directly access canonical environments? Using a self-service deployment, you can usually determine the cause of the problem. If it’s a data-specific problem, you might import a copy of the production database. If this isn’t possible for time or security reasons, you might run multiple versions of the application at once.
* Why should we dispose of environments regularly? Two primary reasons. The first is to reduce your attack profile (i.e. if environments are always going up and down, it’s more difficult to home in on specific resources). The second reason is that it ensures all team members are used to applying all canonical changes through automation rather than relying on environments to always be up and running somewhere.
* Why should we lock down environments? To prevent people from making disruptive environment changes that don’t go through the version-control repository.

02-03-2014

How we use AWS OpsWorks

Amazon Web Services (AWS) OpsWorks was released one year ago this month. In the past year, we’ve used OpsWorks on several Cloud Delivery projects at Stelligent and at some of our customers. This article describes what’s worked for us and our customers. One of our core aims with any customer is to create a fully repeatable process for delivering software. To us, this translates into several more specific objectives. For each process we automate, the process must be fully documented, tested, scripted, versioned and continuous. This article describes how we achieved each of these five objectives in delivering OpsWorks solutions to our customers. In creating any solution, we version any and every asset required to create the software system. With the exception of certain binary packages, the entire software system gets described in code. This includes the application code, configuration, infrastructure and data.

As a note, we’ve developed other AWS solutions without OpsWorks using CloudFormation, Chef, Puppet and some of the other tools mentioned here, but the purpose of this is to describe our approach when using OpsWorks.

AWS Tools

AWS has over 30 services and we use a majority of these services when creating deployment pipelines for continuous delivery and automating infrastructure. However, we typically use only a few services directly when building this infrastructure. For instance, when creating infrastructure with OpsWorks, we’ll use the AWS Ruby SDK to provision the OpsWorks resources and CloudFormation for the resources we cannot provision through OpsWorks. We use these three services to access services such as EC2, Route 53, VPC, S3, Elastic Load Balancing, Auto Scaling, etc. These three services are described below.


AWS OpsWorks – OpsWorks is an infrastructure orchestration and event modeling service for provisioning infrastructure resources. It also enables you to call out to Chef cookbooks (more on Chef later). The OpsWorks model logically defines infrastructure in terms of stacks, layers and apps. Within stacks, you can define layers; within layers you can define applications and within applications, you can run deployments. An event model automatically triggers events against these stacks (e.g. Setup, Configure, Deploy, Undeploy, Shutdown). As mentioned, we use the AWS API (through the Ruby SDK) to script the provisioning of all OpsWorks behavior. We never manually make changes to OpsWorks through the console (we make these changes to the versioned AWS API scripts).

CloudFormation – We use CloudFormation to automatically provision resources that we cannot provision directly through OpsWorks. For example, while OpsWorks connects with Virtual Private Clouds (VPC)s and Elastic Load Balancer (ELB)s, you cannot provision VPC or ELB directly through OpsWorks. Since we choose to script all infrastructure provisioning and workflow, we wrote CloudFormation templates for defining VPCs, ELBs, Relational Database Service (RDS) and Elasticache. We orchestrate the workflow in Jenkins so that these resources are automatically provisioned prior to provisioning the OpsWorks stacks. This way, the OpsWorks stacks can consume these resources that were provisioned in the CloudFormation templates. As with any other program, these templates are version-controlled.

AWS API (using Ruby SDK) – We use the AWS Ruby SDK to script the provisioning of OpsWorks stacks. While we avoid using the SDK directly for most other AWS services (because we can use CloudFormation), we chose to use the SDK for scripting OpsWorks because CloudFormation does not currently support OpsWorks. Everything that you might do using the OpsWorks dashboard – creating stacks, JSON configuration, calling out to Chef, deployments – are all written in Ruby programs that utilize the OpsWorks portion of the AWS API.
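
As a rough sketch of what this looks like, the snippet below uses the OpsWorks client from the AWS SDK for Ruby (v1) to create a stack, a layer and an app. The names, ARNs and repository URL are illustrative assumptions, not our customers’ actual values.

require 'rubygems'
require 'aws-sdk'  # AWS SDK for Ruby v1

opsworks = AWS::OpsWorks.new.client

# Create the stack (the ARNs below are placeholders)
stack = opsworks.create_stack(
  :name                         => 'example-stack',
  :region                       => 'us-east-1',
  :service_role_arn             => 'arn:aws:iam::123456789012:role/aws-opsworks-service-role',
  :default_instance_profile_arn => 'arn:aws:iam::123456789012:instance-profile/aws-opsworks-ec2-role',
  :default_os                   => 'Ubuntu 12.04 LTS',
  :use_custom_cookbooks         => true,
  :custom_cookbooks_source      => { :type => 'git', :url => 'git://github.com/example/cookbooks.git' })

# Add a custom layer and an app to the stack
opsworks.create_layer(
  :stack_id  => stack[:stack_id],
  :type      => 'custom',
  :name      => 'app-server',
  :shortname => 'app')

opsworks.create_app(
  :stack_id => stack[:stack_id],
  :name     => 'example-app',
  :type     => 'other')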

Infrastructure Automation

There are other non-AWS specific tools that we use in automating infrastructure. One of them is the infrastructure automation tool, Chef. Chef Solo is called from OpsWorks. We use infrastructure automation tools both to script and to document the process of provisioning infrastructure.

Chef – OpsWorks is designed to run Chef cookbooks (i.e. scripts/programs). Ultimately, Chef is where a bulk of the behavior for provisioning environments is defined – particularly once the EC2 instance is up and running. In Chef, we write recipes (logically stored in cookbooks) to install and configure web servers such as Apache and Nginx or application servers such as Rails and Tomcat. All of these Chef recipes are version-controlled and called from OpsWorks or CloudFormation.
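
For readers unfamiliar with what a recipe looks like, here is a minimal sketch of a Chef recipe that installs and configures Apache. The cookbook layout and template name are illustrative, not the actual recipes from these projects.

# cookbooks/apache/recipes/default.rb (illustrative)

# Install the Apache web server package
package 'httpd' do
  action :install
end

# Render the main configuration file from an ERB template in the cookbook
template '/etc/httpd/conf/httpd.conf' do
  source 'httpd.conf.erb'
  owner  'root'
  group  'root'
  mode   '0644'
  notifies :restart, 'service[httpd]'
end

# Make sure the service is enabled and running
service 'httpd' do
  action [:enable, :start]
end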

Ubuntu – When using OpsWorks and there’s no specific operating system flavor requirement from our customer, we choose to use Ubuntu 12.04 LTS. We do this for two reasons. The first is that, at the time of this writing, OpsWorks supports two Linux flavors: Amazon Linux and Ubuntu 12.04 LTS. The second is that Ubuntu allows us to use Vagrant (more on Vagrant later). Vagrant provides us a way to test our Chef infrastructure automation scripts locally – increasing our infrastructure development speed.

Supporting Tools

Other supporting tools such as Jenkins, Vagrant and Cucumber help with Continuous Integration, local infrastructure development and testing. Each is described below.

Jenkins – Jenkins is a Continuous Integration server, but we also use it to orchestrate the coarse-grained workflow for the Cloud Delivery system and infrastructure for our customers. We use Jenkins fairly regularly in creating Cloud Delivery solutions for our customers. We configure Jenkins to run Cucumber features, build scripts, automated tests, static analysis, AWS Ruby SDK programs, CloudFormation templates and many more activities. Since Jenkins is an infrastructure component as well, we’ve automated its creation with OpsWorks and Chef, and it also runs the Cucumber features that we’ve written. These scripts and configuration are stored in Git as well, and we can simply type a single command to get the Jenkins environment up and running. Any canonical changes to the Jenkins server are made by modifying the programs or configuration stored in Git.

Vagrant – Vagrant runs a virtualized environment on your desktop and comes with support for certain OS flavors and environments. As mentioned, we use Vagrant to run and test our infrastructure automation scripts locally to increase the speed of development. In many cases, Chef cookbooks that take 30-40 minutes to run against AWS take only 4-5 minutes to run locally in Vagrant – significantly increasing our infrastructure development productivity.
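
A minimal Vagrantfile for this kind of local testing might look like the sketch below; the box name and cookbook path are assumptions for illustration, not our actual configuration.

# Vagrantfile (illustrative)
Vagrant.configure("2") do |config|
  # An Ubuntu 12.04 LTS base box added beforehand, matching the OS we use with OpsWorks
  config.vm.box = "precise64"

  # Run the same Chef cookbooks locally with chef-solo
  config.vm.provision :chef_solo do |chef|
    chef.cookbooks_path = "cookbooks"
    chef.add_recipe "apache"   # hypothetical recipe name
  end
end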

Cucumber – We use Cucumber to write infrastructure specifications in code called features. This provides executable documented specifications that get run with each Jenkins build. Before we write any Chef, OpsWorks or CloudFormation code, we write Cucumber features. When completed, these features are run automatically after the Chef, OpsWorks and/or CloudFormation scripts provision the infrastructure to ensure the infrastructure is meeting the specifications described in the features. At first, these features are written without step definitions (i.e. they don’t actually verify behavior against the infrastructure), but then we iterate through a process of writing programs to automate the infrastructure provisioning while adding step definitions and refining the Cucumber features. Once all of this is hooked up to the Jenkins Continuous Integration server, it provisions the infrastructure and then runs the infrastructure tests/features written in Cucumber. Just like writing XUnit tests for the application code, this approach ensures our infrastructure behaves as designed and provides a set of regression tests that are run with every change to any part of the software system. So, Cucumber helps us document the feature as well as automate infrastructure tests. We also write usage and architecture documentation in READMEs, wikis, etc.
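
As a simplified illustration, an infrastructure feature and a matching step definition might look like the sketch below. The service name and connection details are assumptions; the real features are more extensive.

# features/infrastructure.feature (Gherkin, shown here as a comment):
#
#   Scenario: Web server is running on the provisioned instance
#     Then the "httpd" service should be running
#
# features/step_definitions/infrastructure_steps.rb:

Then(/^the "([^"]*)" service should be running$/) do |service|
  # ENV['HOST'] and ENV['SSH_KEY'] would be passed in by the Jenkins job
  output = `ssh -i #{ENV['SSH_KEY']} -o StrictHostKeyChecking=no ec2-user@#{ENV['HOST']} "sudo service #{service} status"`
  raise "#{service} does not appear to be running" unless output =~ /running/
end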

10-08-2012

Continuous Delivery in the Cloud: Infrastructure Automation (Part 6 of 6)

In part 1 of this series, I introduced the Continuous Delivery (CD) pipeline for the Manatee Tracking application. In part 2, I went over how we use this CD pipeline to deliver software from checkin to production. In part 3, we focused on how CloudFormation is used to script the virtual AWS components that create the Manatee infrastructure. Then in part 4, we focused on a “property file less” environment by dynamically setting and retrieving properties. Part 5 explained how we use Capistrano for scripting our deployment. A list of topics for each of the articles is summarized below:

Part 1: Introduction – Introduction to continuous delivery in the cloud and the rest of the articles;
Part 2: CD Pipeline – In-depth look at the CD Pipeline;
Part 3: CloudFormation – Scripted virtual resource provisioning;
Part 4: Dynamic Configuration – “Property file less” infrastructure;
Part 5: Deployment Automation – Scripted deployment orchestration;
Part 6: Infrastructure Automation – What you’re reading now;

In this part of the series, I am going to show how we use Puppet in combination with CloudFormation to script our target environment infrastructure, preparing it for a Manatee application deployment.

What is Puppet?

Puppet is a Ruby-based infrastructure automation tool. Puppet is primarily used for provisioning environments and managing configuration. Puppet is made to support multiple operating systems, making your infrastructure automation cross-platform.

How does Puppet work?

Puppet uses a library called Facter which collects facts about your system. Facter returns details such as the operating system, architecture, IP address, etc. Puppet uses these facts to make decisions for provisioning your environment. Below is an example of the facts returned by Facter.

# Facter
architecture => i386
...
ipaddress => 172.16.182.129
is_virtual => true
kernel => Linux
kernelmajversion => 2.6
...
operatingsystem => CentOS
operatingsystemrelease => 5.5
physicalprocessorcount => 0
processor0 => Intel(R) Core(TM)2 Duo CPU     P8800  @ 2.66GHz
processorcount => 1
productname => VMware Virtual Platform

Puppet uses the operating system fact to decide the service name, as shown below:


case $operatingsystem {
  centos, redhat: {
    $service_name = 'ntpd'
    $conf_file    = 'ntp.conf.el'
  }
}

With this case statement, if the operating system is either centos or redhat, the service name ntpd and the configuration file ntp.conf.el are used.

Puppet is declarative by nature. Inside a Puppet module you define the end state of the environment after the Puppet run. Puppet enforces this state during the run. If at any point the environment does not conform to the desired state, the Puppet run fails.

Anatomy of a Puppet Module

To script the infrastructure, Puppet uses modules to organize related code that performs a specific task. A Puppet module has multiple sub directories that contain resources for performing the intended task. These are described below:

manifests/: Contains the manifest class files for defining how to perform the intended task
files/: Contains static files that the node can download during the installation
lib/: Contains plugins
templates/: Contains templates which can be used by the module’s manifests
tests/: Contains tests for the module

Puppet also uses a manifest, site.pp, to manage multiple modules together, and another manifest, default.pp, to define what to install on each node.

How to run Puppet

Puppet can be run using either a master agent configuration or a solo installation (puppet apply).

Master Agent: With a master agent installation, you configure one main master Puppet node which manages and configures all of your agent nodes (target environments). The master initiates the installation of the agent and manages it throughout its lifecycle. This model enables you to push infrastructure changes to your agents in parallel by controlling the master node.

Solo: In a solo Puppet run, it’s up to the user to place the desired Puppet module on the target environment. Once the module is on the target environment, the user needs to run puppet apply --modulepath=/path/to/modules/ /path/to/site.pp. Puppet will then provision the server with the provided modules and site.pp without relying on another node.

Why do we use Puppet?

We use Puppet to script and automate our infrastructure — making our environment provisioning repeatable, fully automated, and less error prone. Furthermore, scripting our environments gives us complete control over our infrastructure and the ability to terminate and recreate environments as often as we choose.

Puppet for Manatees

In the Manatee infrastructure, we use Puppet for provisioning our target environments. I am going to go through our manifests and modules while explaining their use and purpose. In our Manatee infrastructure, we create a new target environment as part of the CD pipeline – discussed in part 2 of the series, CD Pipeline. Below I provide a high-level summary of the environment provisioning process:

1. CloudFormation dynamically creates a params.pp manifest with AWS variables
2. CloudFormation runs puppet apply as part of UserData
3. Puppet runs the modules defined in hosts/default.pp.
4. Cucumber acceptance tests are run to verify the infrastructure was provisioned correctly.

Now that we know at a high level what’s being done during the environment provisioning, let’s take a deeper look at the scripts. The actual scripts can be found here: Puppet

First we will start off with the manifests.

The site.pp (shown below) serves two purposes. It loads the other manifests (default.pp and params.pp) and also sets the stages pre, main and post.


import "hosts/*"
import "classes/*"

stage { [pre, post]: }
Stage[pre] -> Stage[main] -> Stage[post]

These stages are used to define the order in which Puppet modules should be run. If a Puppet module is defined as pre, it will run before Puppet modules defined as main or post. Moreover, if stages aren’t defined, Puppet will determine the order of execution. The default.pp (referenced below) shows how staging is defined for executing Puppet modules.


node default {
  class { "params": stage => pre }
  class { "java": stage => pre }
  class { "system": stage => pre }
  class { "tomcat6": stage => main }
  class { "postgresql": stage => main }
  class { "subversion": stage => main }
  class { "httpd": stage => main }
  class { "groovy": stage => main }
}

The default.pp manifest also defines which Puppet modules to use for provisioning the target environment.

params.pp (shown below), loaded from site.pp, is dynamically created using CloudFormation. params.pp is used for setting AWS property values that are used later in the Puppet modules.


class params {
  $s3_bucket = ''
  $application_name = ''
  $hosted_zone = ''
  $access_key = ''
  $secret_access_key = ''
  $jenkins_internal_ip = ''
}

Now that we have an overview of the manifests used, let’s take a look at the Puppet modules themselves.

In our java module, which is run in the pre stage, we are running a simple installation using packages. This is easily dealt with in Puppet by using the package resource. This relies on Puppet’s knowledge of the operating system and the package manager. Puppet simply installs the package that is declared.


class java {
  package { "java-1.6.0-openjdk": ensure => "installed" }
}

The next module we’ll discuss is system. System is also run during the pre stage and is used for the setup of all the extra operations that don’t necessarily need their own module. These actions include setting up general packages (gcc, make, etc.), installing ruby gems (AWS sdk, bundler, etc.), and downloading custom scripts used on the target environment.


class system {

  include params

  $access_key = $params::access_key
  $secret_access_key = $params::secret_access_key

  Exec { path => '/usr/bin:/bin:/usr/sbin:/sbin' }

  package { "gcc": ensure => "installed" }
  package { "mod_proxy_html": ensure => "installed" }
  package { "perl": ensure => "installed" }
  package { "libxslt-devel": ensure => "installed" }
  package { "libxml2-devel": ensure => "installed" }
  package { "make": ensure => "installed" }

  package {"bundler":
    ensure => "1.1.4",
    provider => gem
  }

  package {"trollop":
    ensure => "2.0",
    provider => gem
  }

  package {"aws-sdk":
    ensure => "1.5.6",
    provider => gem,
    require => [
      Package["gcc"],
      Package["make"]
    ]
  }

  file { "/home/ec2-user/aws.config":
    content => template("system/aws.config.erb"),
    owner => 'ec2-user',
    group => 'ec2-user',
    mode => '500',
  }

  define download_file($site="",$cwd="",$creates=""){
    exec { $name:
      command => "wget ${site}/${name}",
      cwd => $cwd,
      creates => "${cwd}/${name}"
    }
  }

  download_file {"database_update.rb":
    site => "https://s3.amazonaws.com/sea2shore",
    cwd => "/home/ec2-user",
    creates => "/home/ec2-user/database_update.rb",
  }

  download_file {"id_rsa.pub":
    site => "https://s3.amazonaws.com/sea2shore/private",
    cwd => "/tmp",
    creates => "/tmp/id_rsa.pub"
  }

  exec {"authorized_keys":
    command => "cat /tmp/id_rsa.pub >> /home/ec2-user/.ssh/authorized_keys",
    require => Download_file["id_rsa.pub"]
    }
  }

First I want to point out that at the top we are specifying to include params. This enables the system module to access the params.pp file. This way we can use the properties defined in params.pp.


include params

$access_key = $params::access_key
$secret_access_key = $params::secret_access_key

This enables us to define the parameters in one central location and then reference them from other modules.

As we move through the script we are using the package resource similar to previous modules. For each rubygem we use the package resource and explicitly tell Puppet to use the gem provider. You can specify other providers like rpm and yum.

We use the file resource to create files from templates.


AWS.config(
  :access_key_id => "<%= "#{access_key}" %>",
  :secret_access_key => "<%= "#{secret_access_key}" %>"
)

In the aws.config.erb template (referenced above) we are using the properties defined in params.pp for dynamically creating an aws.config credential file. This file is then used by our database_update.rb script for connecting to S3.

Speaking of the database_update.rb script, we need to get it on the target environment. To do this, we define a download_file resource.


define download_file($site="",$cwd="",$creates=""){
  exec { $name:
    command => "wget ${site}/${name}",
    cwd => $cwd,
    creates => "${cwd}/${name}"
  }
}

This creates a new resource for Puppet to use. Using this we are able to download both the database_update.rb and id_rsa.pub public SSH key.

As a final step for setting up the system, we execute a bash line for copying the id_rsa.pub contents into the authorized_keys file for the ec2-user. This enables clients with the corresponding id_rsa private key to ssh into the target environment as ec2-user.

The Manatee infrastructure uses Apache for the webserver, Tomcat for the app server, and PostgreSQL for its database. Puppet sets these up as part of the main stage, meaning they run after the pre stage modules are run.

In our httpd module, we are performing several steps discussed previously. The httpd package is installed and a new file is created from a template.


class httpd {
  include params

  $application_name = $params::application_name
  $hosted_zone = $params::hosted_zone

  package { 'httpd':
    ensure => installed,
  }

  file { "/etc/httpd/conf/httpd.conf":
    content => template("httpd/httpd.conf.erb"),
    require => Package["httpd"],
    owner => 'ec2-user',
    group => 'ec2-user',
    mode => '664',
  }

  service { 'httpd':
    ensure => running,
    enable => true,
    require => [
      Package["httpd"],
      File["/etc/httpd/conf/httpd.conf"]],
    subscribe => Package['httpd'],
  }
}

The new piece of functionality used in our httpd module is service. service allows us to define the state the httpd service should be in at the end of our run. In this case, we are declaring that it should be running.

The Tomcat module again uses package to define what to install and service to declare the end state of the tomcat service.


class tomcat6 {

  Exec { path => '/usr/bin:/bin:/usr/sbin:/sbin' }

  package { "tomcat6":
    ensure => "installed"
  }

  $backup_directories = [
    "/usr/share/tomcat6/.sarvatix/",
    "/usr/share/tomcat6/.sarvatix/manatees/",
    "/usr/share/tomcat6/.sarvatix/manatees/wildtracks/",
    "/usr/share/tomcat6/.sarvatix/manatees/wildtracks/database_backups/",
    "/usr/share/tomcat6/.sarvatix/manatees/wildtracks/database_backups/backup_archive",
  ]

  file { $backup_directories:
    ensure => "directory",
    owner => "tomcat",
    group => "tomcat",
    mode => 777,
    require => Package["tomcat6"],
  }

  service { "tomcat6":
    enable => true,
    require => [
      File[$backup_directories],
      Package["tomcat6"]],
    ensure => running,
  }
}

The tomcat6 module uses the file resource differently than previous modules: it uses file for creating directories, which is defined using ensure => “directory”.

We are using the package resource for installing PostgreSQL, building files from templates using the file resource, performing bash executions with exec, and declaring the intended state of the PostgreSQL service using the service resource.


class postgresql {

  include params

  $jenkins_internal_ip = $params::jenkins_internal_ip

  Exec { path => '/usr/bin:/bin:/usr/sbin:/sbin' }

  define download_file($site="",$cwd="",$creates=""){
    exec { $name:
      command => "wget ${site}/${name}",
      cwd => $cwd,
      creates => "${cwd}/${name}"
    }
  }

  download_file {"wildtracks.sql":
    site => "https://s3.amazonaws.com/sea2shore",
    cwd => "/tmp",
    creates => "/tmp/wildtracks.sql"
  }

  download_file {"createDbAndOwner.sql":
    site => "https://s3.amazonaws.com/sea2shore",
    cwd => "/tmp",
    creates => "/tmp/createDbAndOwner.sql"
  }

  package { "postgresql8-server":
    ensure => installed,
  }

  exec { "initdb":
    command => "service postgresql initdb",
    require => Package["postgresql8-server"]
  }

  file { "/var/lib/pgsql/data/pg_hba.conf":
    content => template("postgresql/pg_hba.conf.erb"),
    require => Exec["initdb"],
    owner => 'postgres',
    group => 'postgres',
    mode => '600',
  }

  file { "/var/lib/pgsql/data/postgresql.conf":
    content => template("postgresql/postgresql.conf.erb"),
    require => Exec["initdb"],
    owner => 'postgres',
    group => 'postgres',
    mode => '600',
  }

  service { "postgresql":
    enable => true,
    require => [
      Exec["initdb"],
      File["/var/lib/pgsql/data/postgresql.conf"],
      File["/var/lib/pgsql/data/pg_hba.conf"]],
    ensure => running,
  }

  exec { "create-user":
    command => "echo CREATE USER root | psql -U postgres",
    require => Service["postgresql"]
  }

  exec { "create-db-owner":
    require => [
      Download_file["createDbAndOwner.sql"],
      Exec["create-user"],
      Service["postgresql"]],
    command => "psql < /tmp/createDbAndOwner.sql -U postgres"
  }

  exec { "load-database":
    require => [
      Download_file["wildtracks.sql"],
      Exec["create-user"],
      Service["postgresql"],
      Exec["create-db-owner"]],
    command => "psql -U manatee_user -d manatees_wildtrack -f /tmp/wildtracks.sql"
  }
}

In this module we are creating a new user on the PostgreSQL database:


exec { "create-user":
  command => "echo CREATE USER root | psql -U postgres",
  require => Service["postgresql"]
}

In this next section we download the latest Manatee database SQL dump.


download_file {"wildtracks.sql":
  site => "https://s3.amazonaws.com/sea2shore",
  cwd => "/tmp",
  creates => "/tmp/wildtracks.sql"
}

In the section below, we load the database with the SQL file. This builds our target environments with the production database content giving developers an exact replica sandbox to work in.

exec { "load-database":
  require => [
    Download_file["wildtracks.sql"],
    Exec["create-user"],
    Service["postgresql"],
    Exec["create-db-owner"]],
  command => "psql -U manatee_user -d manatees_wildtrack -f /tmp/wildtracks.sql"
  }
}

Lastly in our Puppet run, we install subversion and groovy on the target node. We could have just included these in our system module, but they seemed general purpose enough to create individual modules.

Subversion manifest:

class subversion {
  package { "subversion":
    ensure => "installed"
  }
}

Groovy manifest:

class groovy {
  Exec { path => '/usr/bin:/bin:/usr/sbin:/sbin' }

  define download_file($site="",$cwd="",$creates=""){
    exec { $name:
    command => "wget ${site}/${name}",
    cwd => $cwd,
    creates => "${cwd}/${name}"
    }
  }

  download_file {"groovy-1.8.2.tar.gz":
    site => "https://s3.amazonaws.com/sea2shore/resources/binaries",
    cwd => "/tmp",
    creates => "/tmp/groovy-1.8.2.tar.gz",
  }

  file { "/usr/bin/groovy-1.8.2/":
    ensure => "directory",
    owner => "root",
    group => "root",
    mode => 755,
    require => Download_file["groovy-1.8.2.tar.gz"],
  }

  exec { "extract-groovy":
    command => "tar -C /usr/bin/groovy-1.8.2/ -xvf /tmp/groovy-1.8.2.tar.gz",
    require => File["/usr/bin/groovy-1.8.2/"],
  }
}

The Subversion manifest is relatively straightforward as we are using the package resource. The Groovy manifest is slightly different: we download the Groovy tar, place it on the filesystem, and then extract it.

We’ve gone through how the target environment is provisioned. We do however have one more task: testing. It’s not enough to assume that if Puppet doesn’t error out, everything got installed successfully. For this reason, we use Cucumber to do acceptance testing against our environment. Our tests check that services are running, configuration files are present and the right packages have been installed.

Puppet allows us to completely script and version our target environments. Consequently, this enables us to treat environments as disposable entities. As a practice, we create a new target environment every time our CD pipeline is run. This way we are always deploying against a known state.

As our blog series is coming to a close, let’s recap what we’ve gone through. In the Manatee infrastructure we use a combination of CloudFormation for scripting AWS resources, Puppet for scripting target environments, Capistrano for deployment automation, SimpleDB and CloudFormation for dynamic properties, and Jenkins for coordinating all the resources into one cohesive unit for moving a Manatee application change from check-in to production in just a single click.

10-04-2012

Continuous Delivery in the Cloud: Deployment Automation (Part 5 of 6)

In part 1 of this series, I introduced the Continuous Delivery (CD) pipeline for the Manatee Tracking application. In part 2 I went over how we use this CD pipeline to deliver software from checkin to production. In part 3, we focused on how CloudFormation is used to script the virtual AWS components that create the Manatee infrastructure. Then in part 4, we focused on a “property file less” environment by dynamically setting and retrieving properties. A list of topics for each of the articles is summarized below:
Part 1: Introduction – Introduction to continuous delivery in the cloud and the rest of the articles;
Part 2: CD Pipeline – In-depth look at the CD Pipeline;
Part 3: CloudFormation – Scripted virtual resource provisioning;
Part 4: Dynamic Configuration – “Property file less” infrastructure;
Part 5: Deployment Automation – What you’re reading now;
Part 6: Infrastructure Automation – Scripted environment provisioning (Infrastructure Automation)

In this part of the series, I am going to show how we use Capistrano to script our deployments to target environments.

What is Capistrano?
Capistrano is an open source Ruby tool used for deploying web applications. It automates deploying to one or more servers. These deployments can include procedures like placing a war on a target server, database changes, starting services, etc.

A Capistrano script has several major parts (a minimal skeleton tying them together follows this list):

  • Namespaces: Namespaces in Capistrano are used for differentiating tasks from other tasks with the same name. This is important if you create a library out of your Capistrano deployment configuration; you will want to make sure your tasks are unique. For instance, a typical name for a task is setup. You need to make sure that your setup task does not potentially interfere with another user’s custom setup task. By using namespaces, you won’t have this conflict.
  • Tasks: Tasks are used for performing specific operations. An example task would be setup. Inside the setup task you will generally prepare the server for subsequent steps to execute successfully like deleting the current application.
  • Variables: Variables in Capistrano are defined as ruby symbols. These are set initially and then referenced later on in the script.
  • Order of execution: Capistrano allows you to define the order of deployment execution. You do this with Capistrano’s built-in after feature. With after, you define the order of task execution during your Capistrano deployment.
  • Templates: Templates are files that have injected ruby snippets. These are used for dynamically building configuration files.
  • Roles: Roles define what part each server in your infrastructure plays in the deployment. Typical roles consist of db, web and app. Roles are then referenced inside your tasks to determine which server the task is run against.
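
Here is a minimal deploy.rb skeleton that ties these parts together; the names and host are illustrative, not the Manatee project’s actual configuration (which is covered below).

# config/deploy.rb (illustrative skeleton)

set :user, "ec2-user"                    # variable
set :use_sudo, false                     # variable

role :app, "app.example.com"             # role (hypothetical host)

namespace :example do                    # namespace
  task :setup, :roles => :app do         # task
    run "echo 'preparing the server'"
  end

  task :deploy, :roles => :app do
    run "echo 'deploying the application'"
  end
end

after "example:setup", "example:deploy"  # order of execution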

Since Capistrano is a Ruby-based tool, you can inject Ruby methods and operations to enhance it. In our deployment we use Ruby to return property values from SimpleDB – as we discussed in part 4 of this series, Dynamic Configuration. This enables us to dynamically deploy to target servers.

How do you install Capistrano?

1. Capistrano is available as a rubygem. You simply type gem install capistrano on your Linux machine (assuming you have ruby and rubygems installed)
2. Run capify . in your project directory. This creates a Capfile (the main file that Capistrano needs) and a config/deploy.rb file (your actual Capistrano deployment script).

How do you run a Capistrano script?
You run Capistrano from the command line. From the same directory as your Capfile, type cap namespace:task. namespace and task being your own Namespace and Task defined in your deploy.rb script. This will start your Capistrano deployment.

Why do we use Capistrano?
We use Capistrano in order to have a fully scripted, versioned deployment. Every step in our application deployment is scripted and fully automated – which reduces errors when deploying. This gives us complete control over our deployment and the ability to deploy whenever we are ready.

Capistrano for Manatees
In the Manatee deployment, we use Capistrano for deploying our Manatee tracking application to our target environment. I am going to go through each part of the deploy.rb and explain its use and purpose. In a deployment’s lifecycle, the deployment is run as part of the CD pipeline – discussed in part 2 of the series, CD Pipeline. I’ll first go through a high level summary of the deployment and then dive into more detail in the next section.

1. Variables are set, which includes returning several properties from SimpleDB
2. Roles are set: db, web and app are all set to the ip_address variable configured dynamically in Step #1
3. The order of execution is set to run the tasks in order
4. Tasks are executed
5. If all tasks are executed successfully, Capistrano signals deployment success.

Now that we know at a high level what’s being done during the deployment, lets take a deeper look at the inside of the script. The actual script can be found here: deploy.rb

Variables

Command line set
stack – Passed into SimpleDB to return the dynamically set property values
ssh_key – Used by the ssh_options variable to SSH into the target environment

Dynamically set
domain – Used by the application variable
artifact_bucket – Used to build the artifact_url variable
ip_address – Used to define the IP address of the target environment to SSH into
dataSourceUsername – Returns a value that is part of the wildtracks_config.properties file
dataSourcePassword – Returns a value that is part of the wildtracks_config.properties file
dataStorageFtpUsername – Returns a value that is part of the wildtracks_config.properties file
dataStorageFtpPassword – Returns a value that is part of the wildtracks_config.properties file

Hardcoded
user – The user to SSH into the target box as
use_sudo – Define whether to prepend every command with sudo or not
deploy_to – Defines the deployment directory on the target environment
artifact – The artifact to deploy to the target server
artifact_url – The URL for downloading the artifact
ssh_options – Specialized SSH configuration
application – Used to set the domain that the application runs on
liquibase_jar – Location of the liquibase.jar on the deployment server
postgres_jar – Location of the postgres.jar on the deployment server

Roles

Since the app server, web server, and database all coexist in the same environment, we set each of these to the same variable, ip_address.

Namespaces

Deploy: We use deploy as our namespace. Since we aren’t distributing this set of deployment tasks, we don’t need to make a unique namespace. In fact, we could remove the namespace altogether, but we wanted to show it being used.

namespace :deploy

Execution

We define our execution order at the bottom of the script using after. This coordinates which task should be run during the deployment.

after "deploy:setup", "deploy:wildtracks_config"
after "deploy:wildtracks_config", "deploy:httpd_conf"
after "deploy:httpd_conf", "deploy:deploy"
after "deploy:deploy", "deploy:liquibase"
after "deploy:deploy", "deploy:restart"

Tasks

  • Setup: The setup task is our initial task. It makes sure the ownership of our deployment directory is set to tomcat. It then stops httpd and tomcat to get ready for the deployment.

    task :setup do
      run "sudo chown -R tomcat:tomcat #{deploy_to}"
      run "sudo service httpd stop"
      run "sudo service tomcat6 stop"
    end
  • wildtracks_config: The wildtracks_config task is the second task to run. It dynamically creates the wildtracks-config.properties file using a template and the variables set previously in the script. It then places the wildtracks-config.properties file on the target environment.

    task :wildtracks_config, :roles => :app do

      set :dataSourceUsername do
        item = sdb.domains["stacks"].items["wildtracks-config"]
        item.attributes['dataSourceUsername'].values[0].to_s.chomp
      end
      set :dataSourcePassword do
        item = sdb.domains["stacks"].items["wildtracks-config"]
        item.attributes['dataSourcePassword'].values[0].to_s.chomp
      end
      set :dataStorageFtpUsername do
        item = sdb.domains["stacks"].items["wildtracks-config"]
        item.attributes['dataStorageFtpUsername'].values[0].to_s.chomp
      end
      set :dataStorageFtpPassword do
        item = sdb.domains["stacks"].items["wildtracks-config"]
        item.attributes['dataStorageFtpPassword'].values[0].to_s.chomp
      end

      set :dataSourceUrl, "jdbc:postgresql://localhost:5432/manatees_wildtrack"
      set :dataStorageWorkDir, "/var/tmp/manatees_wildtracks_workdir"
      set :dataStorageFtpUrl, "ftp.wildtracks.org"
      set :databaseBackupScriptFile, "/usr/share/tomcat6/.sarvatix/manatees/wildtracks/database_backups/script/db_backup.sh"

      config_content = from_template("config/templates/wildtracks-config.properties.erb")
      put config_content, "/home/ec2-user/wildtracks-config.properties"

      run "sudo mv /home/ec2-user/wildtracks-config.properties /usr/share/tomcat6/.sarvatix/manatees/wildtracks/wildtracks-config.properties"
      run "sudo chown -R tomcat:tomcat /usr/share/tomcat6/.sarvatix/manatees/wildtracks/wildtracks-config.properties"
      run "sudo chmod 777 /usr/share/tomcat6/.sarvatix/manatees/wildtracks/wildtracks-config.properties"
    end

  • httpd_conf: The httpd_conf task is third on the stack and performs a similar function to the wildtracks_config task, but with the httpd.conf configuration file.

    task :httpd_conf, :roles => :app do

      config_content = from_template("config/templates/httpd.conf.erb")
      put config_content, "/home/ec2-user/httpd.conf"

      run "sudo mv /home/ec2-user/httpd.conf /etc/httpd/conf/httpd.conf"
    end

  • Deploy: The deploy task is where the actual deployment of the application code is done. This task removes the current version of the application and downloads the latest.

    task :deploy do
      run "cd #{deploy_to} && sudo rm -rf wildtracks* && sudo wget #{artifact_url}"
    end
  • Liquibase: The liquibase task sets up and ensures that the database is configured correctly.

    task :liquibase, :roles => :db do

      db_username = fetch(:dataSourceUsername)
      db_password = fetch(:dataSourcePassword)
      private_ip_address = fetch(:private_ip_address)

      set :liquibase_jar, "/usr/share/tomcat6/.grails/1.3.7/projects/Build/plugins/liquibase-1.9.3.6/lib/liquibase-1.9.3.jar"
      set :postgres_jar, "/usr/share/tomcat6/.ivy2/cache/postgresql/postgresql/jars/postgresql-8.4-701.jdbc3.jar"

      system("cp -rf /usr/share/tomcat6/.jenkins/workspace/DeployManateeApplication/grails-app/migrations/* /usr/share/tomcat6/.jenkins/workspace/DeployManateeApplication/")

      system("java -jar #{liquibase_jar}\
                    --classpath=#{postgres_jar}\
                    --changeLogFile=changelog.xml\
                    --username=#{db_username}\
                    --password=#{db_password}\
                    --url=jdbc:postgresql://#{private_ip_address}:5432/manatees_wildtrack\
    update")
    end

  • Restart: Lastly the restart task starts the httpd and tomcat services.

    task :restart, :roles => :app do
      run "sudo service httpd restart"
      run "sudo service tomcat6 restart"
    end

Now that we’ve gone through the deployment, we need to test it. For testing our deployments, we use Cucumber. Cucumber enables us to do acceptance testing on our deployment. We verify that the application is up and available, the correct services are started and the property files are stored in the right locations.

Capistrano allows us to completely script and version our deployments, enabling our deployments to be run at any time. With Capistrano’s automation in conjunction with Cucumber’s acceptance testing, we have a high level of confidence that when our deployments are run, the application will be deployed successfully.

In the next and last part of our series – Infrastructure Automation – we’ll go through scripting environments using an industry-standard infrastructure automation tool, Puppet.

10-03-2012

Continuous Delivery in the Cloud: Dynamic Configuration (Part 4 of 6)

In part 1 of this series, I introduced the Continuous Delivery (CD) pipeline for the Manatee Tracking application. In part 2 I went over how we use this CD pipeline to deliver software from checkin to production. In part 3, we focused on how CloudFormation is used to script the virtual AWS components that create the Manatee infrastructure. A list of topics for each of the articles is summarized below:

Part 1: Introduction – Introduction to continuous delivery in the cloud and the rest of the articles;
Part 2: CD Pipeline – In-depth look at the CD Pipeline;
Part 3: CloudFormation – Scripted virtual resource provisioning;
Part 4: Dynamic Configuration –  What you’re reading now;
Part 5: Deployment Automation – Scripted deployment orchestration;
Part 6: Infrastructure Automation – Scripted environment provisioning (Infrastructure Automation)

In this part of the series, I am going to explain how we dynamically generate our configuration and avoid property files whenever possible. Instead of using property files, we store and retrieve configuration on the fly – as part of the CD pipeline – without predefining these values in a static file (i.e. a properties file) ahead of time. We do this using two methods: AWS SimpleDB and CloudFormation.

SimpleDB is a highly available non-relational data storage service that only stores strings in key value pairs. CloudFormation, as discussed in Part 3 of the series, is a scripting language for allocating and configuring AWS virtual resources.

Using SimpleDB

Throughout the CD pipeline, we often need to manage state across multiple Jenkins jobs. To do this, we use SimpleDB. As the pipeline executes, values that will be needed by subsequent jobs get stored in SimpleDB as properties. When the properties are needed, we use a simple Ruby script to return the key/value pair from SimpleDB and then use it as part of the job. The values being stored and retrieved range from IP addresses and domain names to AMI (Amazon Machine Image) IDs.

So what makes this dynamic? As Jenkins jobs or CloudFormation templates are run, we often end up with properties that need to be used elsewhere. Instead of hard coding all of the values to be used in a property file, we create, store and retrieve them as the pipeline executes.

Below is the CreateTargetEnvironment Jenkins job script that creates a new target environment from a CloudFormation template, production.template.


if [ "$deployToProduction" == "true" ]
then
  SSH_KEY=production
else
  SSH_KEY=development
fi

# Create Cloudformaton Stack
ruby /usr/share/tomcat6/scripts/aws/create_stack.rb ${STACK_NAME} ${WORKSPACE}/production.template ${HOST} ${JENKINSIP} ${SSH_KEY} ${SGID} ${SNS_TOPIC}

# Load SimpleDB Domain with Key/Value Pairs
ruby /usr/share/tomcat6/scripts/aws/load_domain.rb ${STACK_NAME}

# Pull and store variables from SimpleDB
host=`ruby /usr/share/tomcat6/scripts/aws/showback_domain.rb ${STACK_NAME} InstanceIPAddress`

# Run Acceptance Tests
cucumber features/production.feature host=${host} user=ec2-user key=/usr/share/tomcat6/.ssh/id_rsa

Referenced above in the CreateTargetEnvironment code snippet, the load_domain.rb script below iterates over a file and sends key/value pairs to SimpleDB.

require 'rubygems'
require 'aws-sdk'
load File.expand_path('../../config/aws.config', __FILE__)

stackname=ARGV[0]

file = File.open("/tmp/properties", "r")

sdb = AWS::SimpleDB.new

AWS::SimpleDB.consistent_reads do
  domain = sdb.domains["stacks"]
  item = domain.items["#{stackname}"]

  file.each_line do|line|
    key,value = line.split '='
    item.attributes.set(
      "#{key}" => "#{value}")
  end
end

Also referenced in the CreateTargetEnvironment code snippet, the showback_domain.rb script below connects to SimpleDB and returns a key/value pair.

require 'rubygems'
require 'aws-sdk'
load File.expand_path('../../config/aws.config', __FILE__)

item_name=ARGV[0]
key=ARGV[1]

sdb = AWS::SimpleDB.new

AWS::SimpleDB.consistent_reads do
  domain = sdb.domains["stacks"]
  item = domain.items["#{item_name}"]

  item.attributes.each_value do |name, value|
    if name == "#{key}"
      puts "#{value}".chomp
    end
  end
end

In the CreateTargetEnvironment code snippet above, we store the outputs of the CloudFormation stack in a temporary file. We then iterate over the file with the load_domain.rb script and store the key/value pairs in SimpleDB.

Following this, we make a call to SimpleDB with the showback_domain.rb script and return the instance IP address (created in the CloudFormation template) and store it in the host variable. host is then used by cucumber to ssh into the target instance and run the acceptance tests.

Using CloudFormation

In our CloudFormation templates we allocate multiple AWS resources. Every time we run the template, a different resource is created. For example, in our jenkins.template we create a new IAM user; every time we run the template, a different IAM user with different credentials is created. We need a way to reference these resources, and this is where CloudFormation’s Ref function comes in. You can reference resources within other resources throughout the script by defining a reference with the Ref function. Using Ref, you can dynamically refer to values of other resources such as an IP address, domain name, etc.

In the script below we create an IAM user, reference the IAM user to create AWS access keys and then store them in environment variables.


"CfnUser" : {
  "Type" : "AWS::IAM::User",
  "Properties" : {
    "Path": "/",
    "Policies": [{
      "PolicyName": "root",
      "PolicyDocument": {
        "Statement":[{
          "Effect":"Allow",
          "Action":"*",
          "Resource":"*"
        }
      ]}
    }]
  }
},

"HostKeys" : {
  "Type" : "AWS::IAM::AccessKey",
  "Properties" : {
    "UserName" : { "Ref": "CfnUser" }
  }
},

"# Add AWS Credentials to Tomcat\n",
"echo \"AWS_ACCESS_KEY=", { "Ref" : "HostKeys" }, "\" >> /etc/sysconfig/tomcat6\n",
"echo \"AWS_SECRET_ACCESS_KEY=", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}, "\" >> /etc/sysconfig/tomcat6\n",

We can then use these access keys in other scripts by referencing the $AWS_ACCESS_KEY and $AWS_SECRET_ACCESS_KEY environment variables.
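
For example, a Ruby script running on the instance could configure the AWS SDK from those environment variables – a minimal sketch, assuming the variables were exported as shown above:

require 'rubygems'
require 'aws-sdk'  # AWS SDK for Ruby v1

# Configure the SDK from the environment variables written by the UserData script
AWS.config(
  :access_key_id     => ENV['AWS_ACCESS_KEY'],
  :secret_access_key => ENV['AWS_SECRET_ACCESS_KEY'])

# The credentials are now used by any SDK call, e.g. SimpleDB access
sdb = AWS::SimpleDB.new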

How is this different from typical configuration management?

Typically in many organizations, there’s a big property file with hard-coded key/value pairs that gets passed into the pipeline. The pipeline executes using the given parameters and cannot scale or change without a user modifying the property file. Because all of the properties are hard coded, if the property file hard codes the IP of an EC2 instance and that instance goes down for whatever reason, the pipeline doesn’t work until someone fixes the property file. There are more effective ways of doing this when using the cloud. The cloud provides on-demand resources that will constantly be changing. These resources will have different IP addresses, domain names, etc. associated with them every time.

With dynamic configuration, there are no property files; every property is generated as part of the pipeline.

With this dynamic approach, the pipeline values change with every run. As new cloud resources are allocated, the pipeline is able to adjust itself automatically without the need for users to constantly modify property files. This leads to less time spent debugging those cumbersome property file management issues that plague most companies.

In the next part of our series – which is all about Deployment Automation – we’ll go through scripting and testing your deployment using industry-standard tools. In this next article, you’ll see how to orchestrate deployment sequences and configuration using Capistrano.

09-25-2012

Continuous Delivery in the Cloud: CloudFormation (Part 3 of 6)

In part 1 of this series, I introduced the Continuous Delivery (CD) pipeline for the Manatee Tracking application. In part 2 I went over how we use this CD pipeline to deliver software from checkin to production. A list of topics for each of the articles is summarized below.

Part 1: Introduction – introduction to continuous delivery in the cloud and the rest of the articles;
Part 2: CD Pipeline – In-depth look at the CD Pipeline
Part 3: CloudFormation – What you’re reading now
Part 4: Dynamic Configuration – “Property file less” infrastructure;
Part 5: Deployment Automation – Scripted deployment orchestration;
Part 6: Infrastructure Automation – Scripted environment provisioning (Infrastructure Automation)

In this part of the series, I am going to explain how we use CloudFormation to script our AWS infrastructure and provision our Jenkins environment.

What is CloudFormation?
CloudFormation is an AWS offering for scripting AWS virtual resource allocation. A CloudFormation template is a JSON script which references various AWS resources that you want to use. When the template runs, it will allocate the AWS resources accordingly.

A CloudFormation template is split up into four sections:

  1. Parameters: Parameters are values that you define in the template. When creating the stack through the AWS console, you will be prompted to enter values for the Parameters. If the value for a parameter generally stays the same, you can set a default value. Default values can be overridden when creating the stack. The parameter can be used throughout the template by using the “Ref” function.
  2. Mappings: Mappings are for specifying conditional parameter values in your template. For instance, you might want to use a different AMI depending on the region your instance is running in. Mappings enable you to switch AMIs depending on the region the instance is being created in.
  3. Resources: Resources are the most vital part of the CloudFormation template. Inside the resource section, you define and configure your AWS components.
  4. Outputs: After the stack resources are created successfully, you may want to have it return values such as the IP address or the domain of the created instance. You use Outputs for this. Outputs will return the values to the AWS console or command line depending on which medium you use for creating a stack.

CloudFormation parameters and resources can be referenced throughout the template. You do this using intrinsic functions: Ref, Fn::Base64, Fn::FindInMap, Fn::GetAtt, Fn::GetAZs and Fn::Join. These functions enable you to pass properties and resource outputs throughout your template – reducing the need for most hardcoded properties (something I will discuss in part 4 of this series, Dynamic Configuration).

How do you run a CloudFormation template?
You can create a CloudFormation stack using the AWS Console, the CloudFormation CLI tools, or the CloudFormation API.
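For example, with today's AWS CLI (rather than the earlier CloudFormation command line tools referenced above), launching the Jenkins stack might look roughly like this; the parameter values are placeholders, and CAPABILITY_IAM is needed because the template creates an IAM user:

# Launch the Jenkins stack from the template, supplying the Parameters it defines
aws cloudformation create-stack \
  --stack-name jenkins \
  --template-body file://jenkins.template \
  --capabilities CAPABILITY_IAM \
  --parameters ParameterKey=Email,ParameterValue=team@example.com \
               ParameterKey=ApplicationName,ParameterValue=jenkins \
               ParameterKey=HostedZone,ParameterValue=integratebutton.com \
               ParameterKey=KeyName,ParameterValue=my-keypair \
               ParameterKey=InstanceType,ParameterValue=c1.medium \
               ParameterKey=S3Bucket,ParameterValue=my-bucket

# Watch progress until the stack reaches CREATE_COMPLETE
aws cloudformation describe-stacks --stack-name jenkins --query 'Stacks[0].StackStatus'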

Why do we use CloudFormation?
We use CloudFormation in order to have a fully scripted, versioned infrastructure. From the application to the virtual resources, everything is created from a script and is checked into version control. This gives us complete control over our AWS infrastructure which can be recreated whenever necessary.

CloudFormation for Manatees
In the Manatee Infrastructure, we use CloudFormation for setting up the Jenkins CD environment. I am going to go through each part of the jenkins template and explain its use and purpose. In the template's lifecycle, the user launches the stack using the jenkins.template and enters the Parameters. The template then goes to work:

1. IAM User with AWS Access keys is created
2. SNS Topic is created
3. CloudWatch Alarm is created and SNS topic is used for sending alarm notifications
4. Security Group is created
5. Wait Condition created
6. Jenkins EC2 Instance is created with the Security Group from step #4. This security group is used for port configuration. It also uses AWSInstanceType2Arch and AWSRegionArch2AMI to decide what AMI and OS type to use
7. Jenkins EC2 Instance runs UserData script and executes cfn_init.
8. Wait Condition waits for Jenkins EC2 instance to finish UserData script
9. Elastic IP is allocated and associated with Jenkins EC2 instance
10. Route53 domain name created and associated with Jenkins Elastic IP
11. If everything creates successfully, the stack signals complete and outputs are displayed

Now that we know at a high level what is being done, let's take a deeper look at what's going on inside the jenkins.template.

Parameters

  • Email: Email address that SNS notifications will be sent to. When we create or deploy to target environments, we use SNS to notify us of their status.
  • ApplicationName: Name of the A record created by Route53. Inside the template, we dynamically create a domain with an A record for easy access to the instance after creation. Example: in jenkins.integratebutton.com, jenkins is the ApplicationName.
  • HostedZone: Name of the domain used by Route53. Inside the template, we dynamically create a domain with an A record for easy access to the instance after creation. Example: in jenkins.integratebutton.com, integratebutton.com is the HostedZone.
  • KeyName: EC2 SSH keypair to create the instance with. This is the key you use to SSH into the Jenkins instance after creation.
  • InstanceType: Size of the EC2 instance. Example: t1.micro, c1.medium
  • S3Bucket: We use an S3 bucket to contain the resources for the Jenkins template to use; this parameter specifies the name of that bucket.

 

Mappings


"Mappings" : {
  "AWSInstanceType2Arch" : {
    "t1.micro" : { "Arch" : "64" },
    "m1.small" : { "Arch" : "32" },
    "m1.large" : { "Arch" : "64" },
    "m1.xlarge" : { "Arch" : "64" },
    "m2.xlarge" : { "Arch" : "64" },
    "m2.2xlarge" : { "Arch" : "64" },
    "m2.4xlarge" : { "Arch" : "64" },
    "c1.medium" : { "Arch" : "64" },
    "c1.xlarge" : { "Arch" : "64" },
    "cc1.4xlarge" : { "Arch" : "64" }
  },
    "AWSRegionArch2AMI" : {
    "us-east-1" : { "32" : "ami-ed65ba84", "64" : "ami-e565ba8c" }
  }
},

These Mappings are used to define what operating system architecture and AWS AMI (Amazon Machine Image) ID to use based upon the instance size. The instance size is specified using the InstanceType parameter.

The conditional logic to interact with the Mappings is done inside the EC2 instance.

"ImageId" : { "Fn::FindInMap" : [ "AWSRegionArch2AMI", { "Ref" : "AWS::Region" }, { "Fn::FindInMap" : [ "AWSInstanceType2Arch", { "Ref" : "InstanceType" }, "Arch" ] } ] },


Resources

AWS::IAM::User

"CfnUser" : {
  "Type" : "AWS::IAM::User",
  "Properties" : {
    "Path": "/",
    "Policies": [{
      "PolicyName": "root",
      "PolicyDocument": { "Statement":[{
        "Effect":"Allow",
        "Action":"*",
        "Resource":"*"
        }
      ]}
    }]
  }
},

"HostKeys" : {
  "Type" : "AWS::IAM::AccessKey",
  "Properties" : {
    "UserName" : { "Ref": "CfnUser" }
  }
},

We create the AWS IAM user and then create the AWS Access and Secret access keys for the IAM user which are used throughout the rest of the template. Access and Secret access keys are authentication keys used to authenticate to the AWS account.

AWS::SNS::Topic

"MySNSTopic" : {
  "Type" : "AWS::SNS::Topic",
  "Properties" : {
    "Subscription" : [ {
      "Endpoint" : { "Ref": "Email" },
      "Protocol" : "email"
    } ]
  }
},

SNS is a highly available solution for sending notifications. In the Manatee infrastructure it is used for sending notifications to the development team.

AWS::Route53::RecordSetGroup

"JenkinsDNS" : {
  "Type" : "AWS::Route53::RecordSetGroup",
  "Properties" : {
    "HostedZoneName" : { "Fn::Join" : [ "", [ {"Ref" : "HostedZone"}, "." ]]},
    "RecordSets" : [{
      "Name" : { "Fn::Join" : ["", [ { "Ref" : "ApplicationName" }, ".", { "Ref" : "HostedZone" }, "." ]]},
      "Type" : "A",
      "TTL" : "900",
      "ResourceRecords" : [ { "Ref" : "IPAddress" } ]
    }]
  }
},

Route53 is a highly available DNS service. We use Route53 to create domains dynamically using the given HostedZone and ApplicationName parameters. If the parameters are not overridden, the domain jenkins.integratebutton.com will be created. We then reference the Elastic IP and associate it with the created domain. This way the jenkins.integratebutton.com domain will route to the created instance.

AWS::EC2::Instance

EC2 gives access to on-demand compute resources. In this template, we allocate a new EC2 instance and configure it with a Keypair, Security Group, and Image ID (AMI). Then for provisioning the EC2 instance we use the UserData property. Inside UserData we run a set of bash commands along with cfn_init. The UserData script is run during instance creation.

"WebServer": {
  "Type": "AWS::EC2::Instance",
  "Metadata" : {
    "AWS::CloudFormation::Init" : {
      "config" : {
        "packages" : {
          "yum" : {
            "tomcat6" : [],
            "subversion" : [],
            "git" : [],
            "gcc" : [],
            "libxslt-devel" : [],
            "ruby-devel" : [],
            "httpd" : []
          }
        },

        "sources" : {
          "/opt/aws/apitools/cfn" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/resources/aws_tools/cfn-cli.tar.gz"]]},
          "/opt/aws/apitools/sns" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/resources/aws_tools/sns-cli.tar.gz"]]}
        },

        "files" : {
          "/usr/share/tomcat6/webapps/jenkins.war" : {
            "source" : "http://mirrors.jenkins-ci.org/war/1.480/jenkins.war",
            "mode" : "000700",
            "owner" : "tomcat",
            "group" : "tomcat",
            "authentication" : "S3AccessCreds"
          },

          "/usr/share/tomcat6/webapps/nexus.war" : {
            "source" : "http://www.sonatype.org/downloads/nexus-2.0.3.war",
            "mode" : "000700",
            "owner" : "tomcat",
            "group" : "tomcat",
            "authentication" : "S3AccessCreds"
          },

          "/usr/share/tomcat6/.ssh/id_rsa" : {
            "source" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/private/id_rsa"]]},
            "mode" : "000600",
            "owner" : "tomcat",
            "group" : "tomcat",
            "authentication" : "S3AccessCreds"
          },

          "/home/ec2-user/common-step-definitions-1.0.0.gem" : {
            "source" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/gems/common-step-definitions-1.0.0.gem"]]},
            "mode" : "000700",
            "owner" : "root",
            "group" : "root",
            "authentication" : "S3AccessCreds"
          },

          "/etc/cron.hourly/jenkins_backup.sh" : {
            "source" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/jenkins_backup.sh"]]},
            "mode" : "000500",
            "owner" : "root",
            "group" : "root",
            "authentication" : "S3AccessCreds"
          },

          "/etc/tomcat6/server.xml" : {
            "source" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/server.xml"]]},
            "mode" : "000554",
            "owner" : "root",
            "group" : "root",
            "authentication" : "S3AccessCreds"
          },

          "/usr/share/tomcat6/aws_access" : {
            "content" : { "Fn::Join" : ["", [
              "AWSAccessKeyId=", { "Ref" : "HostKeys" }, "\n",
              "AWSSecretKey=", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}
            ]]},
            "mode" : "000400",
            "owner" : "tomcat",
            "group" : "tomcat",
            "authentication" : "S3AccessCreds"
          },

          "/opt/aws/aws.config" : {
            "content" : { "Fn::Join" : ["", [
              "AWS.config(\n",
              ":access_key_id => \"", { "Ref" : "HostKeys" }, "\",\n",
              ":secret_access_key => \"", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}, "\")\n"
            ]]},
            "mode" : "000500",
            "owner" : "tomcat",
            "group" : "tomcat"
          },

          "/etc/httpd/conf/httpd.conf2" : {
            "content" : { "Fn::Join" : ["", [
              "NameVirtualHost *:80\n",
              "\n",
              "ProxyPass /jenkins http://", { "Fn::Join" : ["", [{ "Ref" : "ApplicationName" }, ".", { "Ref" : "HostedZone" }]] }, ":8080/jenkins\n",
              "ProxyPassReverse /jenkins http://", { "Fn::Join" : ["", [{ "Ref" : "ApplicationName" }, ".", { "Ref" : "HostedZone" }]] }, ":8080/jenkins\n",
              "ProxyRequests Off\n",

              "\n",
              "Order deny,allow\n",
              "Allow from all\n",
              "\n",
              "RewriteEngine On\n",
              "RewriteRule ^/$ http://", { "Fn::Join" : ["", [{ "Ref" : "ApplicationName" }, ".", { "Ref" : "HostedZone" }]] }, ":8080/jenkins$1 [NC,P]\n",
""
            ]]},
            "mode" : "000544",
            "owner" : "root",
            "group" : "root"
          },

          "/root/.ssh/config" : {
            "content" : { "Fn::Join" : ["", [
              "Host github.com\n",
              "StrictHostKeyChecking no\n"
            ]]},
            "mode" : "000600",
            "owner" : "root",
            "group" : "root"
          },

          "/usr/share/tomcat6/.route53" : {
            "content" : { "Fn::Join" : ["", [
              "access_key: ", { "Ref" : "HostKeys" }, "\n",
              "secret_key: ", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}, "\n",
              "api: '2012-02-29'\n",
              "endpoint: https://route53.amazonaws.com/\n",
              "default_ttl: '3600'"
            ]]},
            "mode" : "000700",
            "owner" : "tomcat",
            "group" : "tomcat"
          }
        }
      }
    },
    "AWS::CloudFormation::Authentication" : {
      "S3AccessCreds" : {
        "type" : "S3",
        "accessKeyId" : { "Ref" : "HostKeys" },
        "secretKey" : {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]},
        "buckets" : [ { "Ref" : "S3Bucket"} ]
      }
    }
  },
  "Properties": {
    "ImageId" : { "Fn::FindInMap" : [ "AWSRegionArch2AMI", { "Ref" : "AWS::Region" }, { "Fn::FindInMap" : [ "AWSInstanceType2Arch", { "Ref" : "InstanceType" }, "Arch" ] } ] },
    "InstanceType" : { "Ref" : "InstanceType" },
    "SecurityGroups" : [ {"Ref" : "FrontendGroup"} ],
    "KeyName" : { "Ref" : "KeyName" },
    "Tags": [ { "Key": "Name", "Value": "Jenkins" } ],
    "UserData" : { "Fn::Base64" : { "Fn::Join" : ["", [
      "#!/bin/bash -v\n",
      "yum -y install java-1.6.0-openjdk*\n",
      "yum update -y aws-cfn-bootstrap\n",

      "# Install packages\n",
      "/opt/aws/bin/cfn-init -s ", { "Ref" : "AWS::StackName" }, " -r WebServer ",
      " --access-key ", { "Ref" : "HostKeys" },
      " --secret-key ", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]},
      " --region ", { "Ref" : "AWS::Region" }, " || error_exit 'Failed to run cfn-init'\n",

      "# Copy Github credentials to root ssh directory\n",
      "cp /usr/share/tomcat6/.ssh/* /root/.ssh/\n",

      "# Installing Ruby 1.9.3 from RPM\n",
      "wget -P /home/ec2-user/ https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/resources/rpm/ruby-1.9.3p0-2.amzn1.x86_64.rpm\n",
      "rpm -Uvh /home/ec2-user/ruby-1.9.3p0-2.amzn1.x86_64.rpm\n",

      "cat /etc/httpd/conf/httpd.conf2 >> /etc/httpd/conf/httpd.conf\n",

      "# Install S3 Gems\n",
      "gem install /home/ec2-user/common-step-definitions-1.0.0.gem\n",

      "# Install Public Gems\n",
      "gem install bundler --version 1.1.4 --no-rdoc --no-ri\n",
      "gem install aws-sdk --version 1.5.6 --no-rdoc --no-ri\n",
      "gem install cucumber --version 1.2.1 --no-rdoc --no-ri\n",
      "gem install net-ssh --version 2.5.2 --no-rdoc --no-ri\n",
      "gem install capistrano --version 2.12.0 --no-rdoc --no-ri\n",
      "gem install route53 --version 0.2.1 --no-rdoc --no-ri\n",
      "gem install rspec --version 2.10.0 --no-rdoc --no-ri\n",
      "gem install trollop --version 2.0 --no-rdoc --no-ri\n",

      "# Update Jenkins with versioned configuration\n",
      "rm -rf /usr/share/tomcat6/.jenkins\n",
      "git clone git@github.com:stelligent/continuous_delivery_open_platform_jenkins_configuration.git /usr/share/tomcat6/.jenkins\n",

      "# Get S3 bucket publisher from S3\n",
      "wget -P /usr/share/tomcat6/.jenkins/ https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/hudson.plugins.s3.S3BucketPublisher.xml\n",

      "wget -P /tmp/ https://raw.github.com/stelligent/continuous_delivery_open_platform/master/config/aws/cd_security_group.rb\n",
      "ruby /tmp/cd_security_group --securityGroupName ", { "Ref" : "FrontendGroup" }, " --port 5432\n",

      "# Update main Jenkins config\n",
      "sed -i 's@.*@", { "Ref" : "HostKeys" }, "@' /usr/share/tomcat6/.jenkins/hudson.plugins.s3.S3BucketPublisher.xml\n",
      "sed -i 's@.*@", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}, "@' /usr/share/tomcat6/.jenkins/hudson.plugins.s3.S3BucketPublisher.xml\n",

      "# Add AWS Credentials to Tomcat\n",
      "echo \"AWS_ACCESS_KEY=", { "Ref" : "HostKeys" }, "\" >> /etc/sysconfig/tomcat6\n",
      "echo \"AWS_SECRET_ACCESS_KEY=", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}, "\" >> /etc/sysconfig/tomcat6\n",

      "# Add AWS CLI Tools\n",
      "echo \"export AWS_CLOUDFORMATION_HOME=/opt/aws/apitools/cfn\" >> /etc/sysconfig/tomcat6\n",
      "echo \"export AWS_SNS_HOME=/opt/aws/apitools/sns\" >> /etc/sysconfig/tomcat6\n",
      "echo \"export PATH=$PATH:/opt/aws/apitools/sns/bin:/opt/aws/apitools/cfn/bin\" >> /etc/sysconfig/tomcat6\n",

      "# Add Jenkins Environment Variable\n",
      "echo \"export SNS_TOPIC=", { "Ref" : "MySNSTopic" }, "\" >> /etc/sysconfig/tomcat6\n",
      "echo \"export JENKINS_DOMAIN=", { "Fn::Join" : ["", ["http://", { "Ref" : "ApplicationName" }, ".", { "Ref" : "HostedZone" }]] }, "\" >> /etc/sysconfig/tomcat6\n",
      "echo \"export JENKINS_ENVIRONMENT=", { "Ref" : "ApplicationName" }, "\" >> /etc/sysconfig/tomcat6\n",

      "wget -P /tmp/ https://raw.github.com/stelligent/continuous_delivery_open_platform/master/config/aws/showback_domain.rb\n",
      "echo \"export SGID=`ruby /tmp/showback_domain.rb --item properties --key SGID`\" >> /etc/sysconfig/tomcat6\n",

      "chown -R tomcat:tomcat /usr/share/tomcat6/\n",
      "chmod +x /usr/share/tomcat6/scripts/aws/*\n",
      "chmod +x /opt/aws/apitools/cfn/bin/*\n",

      "service tomcat6 restart\n",
      "service httpd restart\n",

      "/opt/aws/bin/cfn-signal", " -e 0", " '", { "Ref" : "WaitHandle" }, "'"
    ]]}}
  }
},

Calling cfn init from UserData


"# Install packages\n",
"/opt/aws/bin/cfn-init -s ", { "Ref" : "AWS::StackName" }, " -r WebServer ",
" --access-key ", { "Ref" : "HostKeys" },
" --secret-key ", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]},
" --region ", { "Ref" : "AWS::Region" }, " || error_exit 'Failed to run cfn-init'\n",

cfn_init is used to retrieve and interpret the resource metadata, install packages, create files and start services. In the Manatee template we use cfn_init for easy access to other AWS resources, such as S3.

"/etc/tomcat6/server.xml" : {
  "source" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/server.xml"]]},
  "mode" : "000554",
  "owner" : "root",
  "group" : "root",
  "authentication" : "S3AccessCreds"
},


"AWS::CloudFormation::Authentication" : {
  "S3AccessCreds" : {
    "type" : "S3",
    "accessKeyId" : { "Ref" : "HostKeys" },
    "secretKey" : {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]},
    "buckets" : [ { "Ref" : "S3Bucket"} ]
  }
}

When possible, we try to use cfn_init rather than UserData bash commands because it stores a detailed log of Cfn events on the instance.
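For example, when something in the config section fails, we can SSH into the instance and inspect the cfn-init log (the standard location for the cfn-bootstrap tools; confirm the exact paths on your AMI):

# cfn-init records each package, source, and file action it performs
sudo tail -n 100 /var/log/cfn-init.log

# Output from the UserData script itself typically ends up in the cloud-init log
sudo tail -n 100 /var/log/cloud-init.log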

AWS::EC2::SecurityGroup

When creating a Jenkins instance, we only want certain ports to be open, and only open to certain users. For this we use Security Groups. Security groups are firewall rules defined at the AWS level. You can use them to set which ports, or ranges of ports, are opened. In addition to defining which ports are to be open, you can define who they should be open to using CIDR blocks.


"FrontendGroup" : {
  "Type" : "AWS::EC2::SecurityGroup",
  "Properties" : {
    "GroupDescription" : "Enable SSH and access to Apache and Tomcat",
    "SecurityGroupIngress" : [
      {"IpProtocol" : "tcp", "FromPort" : "22", "ToPort" : "22", "CidrIp" : "0.0.0.0/0"},
      {"IpProtocol" : "tcp", "FromPort" : "8080", "ToPort" : "8080", "CidrIp" : "0.0.0.0/0"},
      {"IpProtocol" : "tcp", "FromPort" : "80", "ToPort" : "80", "CidrIp" : "0.0.0.0/0"}
    ]
  }
},

In this security group we are opening ports 22, 80 and 8080. Since we are opening 8080, we are able to access Jenkins at the completion of the template. By default, all ports on an instance are closed, so these rules must be specified in order to have access to Jenkins.
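Security group rules can also be added after the stack is up; the pipeline does this later with a Ruby script (cd_security_group.rb) to open port 5432. A rough equivalent with today's AWS CLI, using a placeholder group name and source CIDR, would look like this:

# Allow PostgreSQL access on the group created by the stack, restricted to a single source IP
# FRONTEND_GROUP holds the physical name of the FrontendGroup resource created by CloudFormation
aws ec2 authorize-security-group-ingress \
  --group-name "$FRONTEND_GROUP" \
  --protocol tcp \
  --port 5432 \
  --cidr 203.0.113.10/32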

AWS::EC2::EIP

When an instance is created, it is given a public DNS name similar to: ec2-107-20-139-148.compute-1.amazonaws.com. By using Elastic IPs, you can associate your instance with a static IP address rather than relying on the generated DNS name.


"IPAddress" : {
  "Type" : "AWS::EC2::EIP"
},

"IPAssoc" : {
  "Type" : "AWS::EC2::EIPAssociation",
  "Properties" : {
    "InstanceId" : { "Ref" : "WebServer" },
    "EIP" : { "Ref" : "IPAddress" }
  }
},

In the snippets above, we create a new Elastic IP and then associate it with the EC2 instance created above. We do this so we can reference the Elastic IP when creating the Route53 Domain name.

AWS::CloudWatch::Alarm

"CPUAlarmLow": {
  "Type": "AWS::CloudWatch::Alarm",
  "Properties": {
    "AlarmDescription": "Scale-down if CPU < 70% for 10 minutes",
    "MetricName": "CPUUtilization",
    "Namespace": "AWS/EC2",
    "Statistic": "Average",
    "Period": "300",
    "EvaluationPeriods": "2",
    "Threshold": "70",
    "AlarmActions": [ { "Ref": "MySNSTopic" } ],
    "Dimensions": [{
      "Name": "WebServerName",
      "Value": { "Ref": "WebServer" }
    }],
    "ComparisonOperator": "LessThanThreshold"
  }
},

There are many reasons an instance can become unavailable. CloudWatch is used to monitor instance usage and performance. CloudWatch can be set to notify specified individuals if the instance experiences higher than normal CPU utilization, disk usage, network usage, etc. In the Manatee infrastructure we use CloudWatch to monitor disk utilization and notify team members if it reaches 90 percent.

If the Jenkins instance goes down, our CD pipeline becomes temporarily unavailable. This presents a problem as the development team is temporarily blocked from testing their code. CloudWatch helps notify us if this is an impending problem.

AWS::CloudFormation::WaitConditionHandle, AWS::CloudFormation::WaitCondition

Wait Conditions are used to wait for all of the resources in a template to be completed before signaling template success.

"WaitHandle" : {
  "Type" : "AWS::CloudFormation::WaitConditionHandle"
},

"WaitCondition" : {
  "Type" : "AWS::CloudFormation::WaitCondition",
  "DependsOn" : "WebServer",
  "Properties" : {
    "Handle" : { "Ref" : "WaitHandle" },
    "Timeout" : "990"
  }
}

When creating the instance, if a wait condition is not used, CloudFormation won't wait for the completion of the UserData script. It will signal success as soon as the EC2 instance is allocated, rather than waiting for the UserData script to run and signal completion.

Outputs

Outputs are used to return information about what was created during the CloudFormation stack creation to the user. In order to return values, you define the Output name and then the resource you want to reference:


"Outputs" : {
  "Domain" : {
    "Value" : { "Fn::Join" : ["", ["http://", { "Ref" : "ApplicationName" }, ".", { "Ref" : "HostedZone" }]] },
    "Description" : "URL for newly created Jenkins app"
  },
  "NexusURL" : {
    "Value" : { "Fn::Join" : ["", ["http://", { "Ref" : "IPAddress" }, ":8080/nexus"]] },
    "Description" : "URL for newly created Nexus repository"
  },
  "InstanceIPAddress" : {
    "Value" : { "Ref" : "IPAddress" }
  }
}

For instance, with InstanceIPAddress we are referencing the IPAddress resource, which happens to be the Elastic IP. This returns the Elastic IP address to the CloudFormation console.
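When a stack is created from the command line instead of the console, the same outputs can be retrieved afterwards; for example, with today's AWS CLI:

# List the Domain, NexusURL, and InstanceIPAddress outputs for the stack
aws cloudformation describe-stacks \
  --stack-name jenkins \
  --query 'Stacks[0].Outputs' \
  --output table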

CloudFormation allows us to completely script and version our infrastructure. This enables our infrastructure to be recreated the same way every time by simply running the CloudFormation template. Because of this, your environments can be run in a continuous integration cycle, rebuilding with every change to the script.

In the next part of our series – which is all about Dynamic Configuration – we’ll go through building your infrastructure to only require a minimal amount of hard coded properties if any. In this next article, you’ll see how you can use CloudFormation to build “property file less” infrastructure.


09-18-2012

Continuous Delivery in the Cloud: CD Pipeline (Part 2 of 6)

In part 1 of this series, I introduced the Continuous Delivery (CD) pipeline for the Manatee Tracking application and how we use this pipeline to deliver software from checkin to production. In this article I will take an in-depth look at the CD pipeline. A list of topics for each of the articles is summarized below.

Part 1: Introduction – Introduction to continuous delivery in the cloud and the rest of the articles;
Part 2: CD Pipeline – What you’re reading now;
Part 3: CloudFormation – Scripted virtual resource provisioning;
Part 4: Dynamic Configuration – “Property file less” infrastructure;
Part 5: Deployment Automation – Scripted deployment orchestration;
Part 6: Infrastructure Automation – Scripted environment provisioning (Infrastructure Automation)

The CD pipeline consists of five Jenkins jobs. These jobs are configured to run one after the other. If any one of the jobs fail, the pipeline fails and that release candidate cannot be released to production. The five Jenkins jobs are listed below (further details of these jobs are provided later in the article).

  1. A job that sets the variables used throughout the pipeline (SetupVariables)
  2. Build job (Build)
  3. Production database update job (StoreLatestProductionData)
  4. Target environment creation job (CreateTargetEnvironment)
  5. A deployment job (DeployManateeApplication) which enables a one-click deployment into production.

We extend the standard Jenkins setup with plugins that add features to the core Jenkins configuration. The plugins we use for the Sea to Shore Alliance Continuous Delivery configuration are listed below.

Grails: http://updates.jenkins-ci.org/download/plugins/grails/1.5/grails.hpi
Groovy: http://updates.jenkins-ci.org/download/plugins/groovy/1.12/groovy.hpi
Subversion: http://updates.jenkins-ci.org/download/plugins/subversion/1.40/subversion.hpi
Parameterized Trigger: http://updates.jenkins-ci.org/download/plugins/parameterized-trigger/2.15/parameterized-trigger.hpi
Copy Artifact: http://updates.jenkins-ci.org/download/plugins/copyartifact/1.21/copyartifact.hpi
Build Pipeline: http://updates.jenkins-ci.org/download/plugins/build-pipeline-plugin/1.2.3/build-pipeline-plugin.hpi
Ant: http://updates.jenkins-ci.org/download/plugins/ant/1.1/ant.hpi
S3: http://updates.jenkins-ci.org/download/plugins/s3/0.2.0/s3.hpi

The Parameterized Trigger, Build Pipeline and S3 plugins are used for moving the application through the pipeline jobs. The Ant, Groovy, and Grails plugins are used for running the build for the application code. The Subversion plugin is used for polling and checking out from version control.

Below, I describe each of the jobs that make up the CD pipeline in greater detail.

SetupVariables: Jenkins job used for entering the necessary property values, which are propagated along the rest of the pipeline.

Parameter: STACK_NAME
Type: String
Where: Used in both CreateTargetEnvironment and DeployManateeApplication jobs
Purpose: Defines the CloudFormation Stack name and SimpleDB property domain associated with the CloudFormation stack.

Parameter: HOST
Type: String
Where: Used in both CreateTargetEnvironment and DeployManateeApplication jobs
Purpose: Defines the CNAME of the domain created in the CreateTargetEnvironment job. The DeployManateeApplication job uses it when it dynamically creates configuration files. For instance, in test.oneclickdeployment.com, test would be the HOST

Parameter: PRODUCTION_IP
Type: String
Where: Used in the StoreProductionData job
Purpose: Sets the production IP for the job so that it can SSH into the existing production environment and run a database script that exports the data and uploads it to S3.

Parameter: deployToProduction
Type: Boolean
Where: Used in both CreateTargetEnvironment and DeployManateeApplication jobs
Purpose: Determines whether to use the development or production SSH keypair.

In order for the parameters to propagate through the pipeline, we pass the current build parameters using the Parameterized Trigger plugin.

Build: Compiles the Manatee application’s Grails source code and creates a WAR file.

To do this, we utilize the Jenkins Grails plugin and run Grails targets such as compile and prod war. Next, we archive the Grails migrations for use in the DeployManateeApplication job, and then the job pushes the Manatee WAR up to S3, which is used as an artifact repository.
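A stripped-down shell equivalent of what this job does might look like the following; the WAR name, migrations path, S3 location, and the use of today's AWS CLI in place of the Jenkins S3 plugin are assumptions for illustration:

# Compile and package the Grails application as a production WAR
grails compile
grails prod war

# Archive the migrations and push the WAR to the S3 artifact repository
tar -czf migrations.tar.gz grails-app/migrations
aws s3 cp target/manatee.war "s3://${S3_BUCKET}/artifacts/manatee-${BUILD_NUMBER}.war"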

Lastly, using the Parameterized Trigger plugin, we trigger the StoreProductionData job with the current build parameters.

StoreProductionData: This job performs a pg_dump (PostgreSQL dump) of the production database and then stores it in S3 for the environment creation job to use when building up the environment. Below is a snippet from this job.

ssh -i /usr/share/tomcat6/development.pem -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no ec2-user@${PRODUCTION_IP} ruby /home/ec2-user/database_update.rb

On the target environments created using the CD pipeline, a database script is stored. The script goes into the PostgreSQL database and runs a pg_dump. It then pushes the pg_dump SQL file to S3 to be used when creating the target environment.
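The heart of such a script (the actual database_update.rb is Ruby) is essentially a pg_dump followed by an upload; a minimal bash sketch, with an assumed database name, file name, and use of today's AWS CLI, looks like this:

#!/bin/bash
# Dump the production database and push the snapshot to S3 for CreateTargetEnvironment to consume
pg_dump -U postgres manatee > /tmp/production_dump.sql
aws s3 cp /tmp/production_dump.sql "s3://${S3_BUCKET}/backups/production_dump.sql"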

After the SQL file is stored successfully, the CreateTargetEnvironment job is triggered.

CreateTargetEnvironment: Creates a new target environment using a CloudFormation template to create all the AWS resources and calls Puppet to provision the environment itself from a base operating system to a fully working target environment ready for deployment. Below is a snippet from this job.

# Use the production keypair only when deploying to production
if [ "$deployToProduction" = "true" ]
then
SSH_KEY=production
else
SSH_KEY=development
fi

# Create CloudFormation Stack
ruby ${WORKSPACE}/config/aws/create_stack.rb ${STACK_NAME} ${WORKSPACE}/infrastructure/manatees/production.template ${HOST} ${JENKINSIP} ${SSH_KEY} ${SGID} ${SNS_TOPIC}

# Load SimpleDB Domain with Key/Value Pairs
ruby ${WORKSPACE}/config/aws/load_domain.rb ${STACK_NAME}

# Pull and store variables from SimpleDB
host=`ruby ${WORKSPACE}/config/aws/showback_domain.rb ${STACK_NAME} InstanceIPAddress`

# Run Acceptance Tests
cucumber ${WORKSPACE}/infrastructure/manatees/features/production.feature host=${host} user=ec2-user key=/usr/share/tomcat6/.ssh/id_rsa

# Publish notifications to SNS
sns-publish --topic-arn $SNS_TOPIC --subject "New Environment Ready" --message "Your new environment is ready. IP Address: $host. An example command to ssh into the box would be: ssh -i development.pem ec2-user@$host This instance was created by $JENKINS_DOMAIN" --aws-credential-file /usr/share/tomcat6/aws_access

Once the environment is created, a set of Cucumber tests is run to ensure it's in the correct working state. If any test fails, the entire pipeline fails and the developer is notified that something went wrong. Otherwise, if the tests pass, the DeployManateeApplication job is kicked off and an AWS SNS email notification with information to access the new instance is sent to the developer.

DeployManateeApplication: Runs a Capistrano script which coordinates the deployment steps. A snippet from this job is displayed below.

# Use the development keypair unless we are deploying to production
if [ "$deployToProduction" != "true" ]
then
SSH_KEY=/usr/share/tomcat6/development.pem
else
SSH_KEY=/usr/share/tomcat6/production.pem
fi

#/usr/share/tomcat6/.ssh/id_rsa

cap deploy:setup stack=${STACK_NAME} key=${SSH_KEY}

sed -i "s@manatee0@${HOST}@" ${WORKSPACE}/deployment/features/deployment.feature

host=`ruby ${WORKSPACE}/config/aws/showback_domain.rb ${STACK_NAME} InstanceIPAddress`
cucumber deployment/features/deployment.feature host=${host} user=ec2-user key=${SSH_KEY} artifact=

sns-publish --topic-arn $SNS_TOPIC --subject "Manatee Application Deployed" --message "Your Manatee Application has been deployed successfully. You can view it by going to http://$host/wildtracks This instance was deployed to by $JENKINS_DOMAIN" --aws-credential-file /usr/share/tomcat6/aws_access

This deployment job is the final piece of the delivery pipeline; it pulls together all of the pieces created in the previous jobs to successfully deliver working software.

During the deployment, the Capistrano script SSHes into the target server, deploys the new WAR and updated configuration changes, and restarts all services. Then the Cucumber tests are run to ensure the application is available and running successfully. Assuming the tests pass, an AWS SNS email gets dispatched to the developer with information on how to access their new development application.

We use Jenkins as the orchestrator of the pipeline. Jenkins executes a set of scripts and passes around parameters as it runs each job. Because of the role Jenkins plays, we want to make sure it's treated the same way as the application – meaning we version and test all of our changes to the system. For example, if a developer modifies the create environment job configuration, we want to have the ability to revert back if necessary. Due to this requirement we version the Jenkins configuration: the jobs, plugins and main configuration. To do this, a script is executed each hour using cron.hourly that checks for new jobs or updated configuration and commits them up to version control.
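A minimal version of such an hourly script, assuming the .jenkins directory is the Git working copy cloned earlier and that build history and workspaces are ignored, might look like this (the script name is hypothetical):

#!/bin/bash
# /etc/cron.hourly/jenkins_config_commit.sh (hypothetical name)
cd /usr/share/tomcat6/.jenkins || exit 1

# Stage the main config and job definitions; unmatched globs are silently skipped
git add -A config.xml jobs/*/config.xml *.xml 2>/dev/null

# Commit and push only if something actually changed
if ! git diff --cached --quiet; then
  git commit -m "Automated Jenkins configuration backup $(date -u +%Y-%m-%dT%H:%M:%SZ)"
  git push origin master
fi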

The CD pipeline that we have built for the Manatee application enables any change in the application, infrastructure, database or configuration to move through to production seamlessly using automation. This allows any new features, security fixes, etc. to be fully tested as they get delivered to production at the click of a button.

In the next part of our series – which is all about using CloudFormation – we’ll go through a CloudFormation template used to automate the creation of a Jenkins environment. In this next article, you’ll see how CloudFormation procures AWS resources and provisions our Jenkins CD Pipeline environment.

Continuous Delivery in the Cloud Case Study for the Sea to Shore Alliance – Introduction (part 1 of 6)

We help companies deliver software reliably and repeatedly using Continuous Delivery in the Cloud. With Continuous Delivery (CD), teams can deliver new versions of software to production by flattening the software delivery process and decreasing the cycle time between an idea and usable software through the automation of the entire delivery system: build, deployment, test, and release. CD is enabled through a delivery pipeline. With CD, our customers can choose when and how often to release to production. On top of this, we utilize the cloud so that customers can scale their infrastructure up and down and deliver software to users on demand.

Stelligent offers a solution called Elastic Operations which provides a Continuous Delivery platform along with expert engineering support and monitoring of a delivery pipeline that builds, tests, provisions and deploys software to target environments – as often as our customers choose. We’re in the process of open sourcing the platform utilized by Elastic Operations.

In this six-part blog series, I am going to go over how we built out a Continuous Delivery solution for the Sea to Shore Alliance:

Part 1: Introduction – What you’re reading now;
Part 2: CD Pipeline – Automated pipeline to build, test, deploy, and release software continuously;
Part 3: CloudFormation – Scripted virtual resource provisioning;
Part 4: Dynamic Configuration – “Property file less” infrastructure;
Part 5: Deployment Automation – Scripted deployment orchestration;
Part 6: Infrastructure Automation – Scripted environment provisioning (Infrastructure Automation)

This year, we delivered this Continuous Delivery in the Cloud solution to the Sea to Shore Alliance. The Sea to Shore Alliance is a non-profit organization whose mission is to protect and conserve the world’s fragile coastal ecosystems and its endangered species such as manatees, sea turtles, and right whales. One of their first software systems tracks and monitors manatees. Prior to Stelligent’s involvement, the application was running on a single instance that was manually provisioned and deployed. As a result of the manual processes, there were no automated tests for the infrastructure or deployment. This made it impossible to reproduce environments or deployments the same way every time. Moreover, the knowledge to recreate these environments, builds and deployments was locked in the heads of a few key individuals. The production application for tracking these manatees, developed by Sarvatix, is located here.

In this case study, I describe how we went from an untested manual process in which the development team was manually building software artifacts, creating environments and deploying, to a completely automated delivery pipeline that is triggered with every change.

Figure 1 illustrates the AWS architecture of the infrastructure that we designed for this Continuous Delivery solution.

There are two CloudFormation stacks being used: the Jenkins stack – or Jenkins environment – as shown on the left, and the Manatee stack – or Target environment – as shown on the right.

The Jenkins Stack

  1. Creates the jenkins.example.com Route53 Hosted Zone
  2. Creates an EC2 instance with Tomcat and Jenkins installed and configured on it.
  3. Runs the CD Pipeline

The Manatee stack is slightly different; it utilizes the configuration provided by SimpleDB to create itself. This stack defines the target environment to which the application software is deployed.

The Manatee Stack

  1. Creates the manatee.example.com Route53 Hosted Zone
  2. Creates an EC2 instance with Tomcat, Apache, and PostgreSQL installed on it.
  3. Runs the Manatee application.

The Manatee stack is configured with CPU alarms that send an email notification to the developers/administrators when the instance becomes over-utilized. We’re in the process of adding the ability to scale to additional instances when these alarms are triggered.

Both instances are encapsulated behind a security group so that they can communicate with each other over the internal AWS network.

Fast Facts
Industry: Non-Profit
Profile: Customer tracks and monitors endangered species such as manatees.
Key Business Issues: The customer’s development team needed to have unencumbered access to resources along with automated environment creation and deployment.
Stakeholders: Development team, scientists, and others from the Sea to Shore Alliance
Solution: Continuous Delivery in the Cloud (Elastic Operations)
Key Tools/Technologies: AWS – Amazon Web Services (CloudFormation, EC2, S3, SimpleDB, IAM, CloudWatch, SNS), Jenkins, Capistrano, Puppet, Subversion, Cucumber, Liquibase

The Business Problem
The customer needed an operations team that could be scaled up or down depending on the application need. The customer’s main requirements were to have unencumbered access to resources such as virtual hardware. Specifically, they wanted to have the ability to create a target environment and run an automated deployment to it without going to a separate team and submitting tickets, emails, etc. In addition to being able to create environments, the customer wanted to have more control over the resources being used; they wanted to have the ability to terminate resources if they were unused. To address these requirements we introduced an entirely automated solution which utilizes the AWS cloud for providing resources on-demand, along with other solutions for providing testing, environment provisioning and deployment.

On the Manatee project, we have five key objectives for the delivery infrastructure. The development team should be able to:

  1. Deliver new software or updates to users on demand
  2. Reprovision target environment configuration on demand
  3. Provision environments on demand
  4. Remove configuration bottlenecks
  5. Terminate instances when they are no longer needed

Our Team
Stelligent’s team consisted of an account manager and one polyskilled DevOps Engineer that built, managed, and supported the Continuous Delivery pipeline.

Our Solution
Our solution is a single delivery pipeline that gives our customer (developers, testers, etc.) unencumbered access to resources and single-click automated deployment to production. To enable this, the pipeline needed to include:

  1. The ability for any authorized team member to create a new target environment using a single click
  2. Automated deployment to the target environment
  3. End-to-end testing
  4. The ability to terminate unnecessary environments
  5. Automated deployment into production with a single click

The delivery pipeline improves efficiency and reduces costs by not limiting the development team. The solution includes:

  • On-Demand Provisioning – All hardware is provided via EC2’s virtual instances in the cloud, on demand. As part of the CD pipeline, any authorized team member can use the Jenkins CreateTargetEnvironment job to order target environments for development work.
  • Continuous Delivery Solution so that the team can deliver software to users on demand.
  • Development Infrastructure – Consists of:
    • Tomcat: used for hosting the Manatee Application
    • Apache: Hosts the front-end website and uses virtual hosts for proxying and redirection.
    • PostgreSQL: Database for the Manatee application
    • Groovy: the application is written in Grails which uses Groovy.
  • Instance Management – Any authorized team member is able to monitor virtual instance usage by viewing Jenkins. There is a policy that test instances are automatically terminated every two days (a cleanup sketch follows this list). This promotes ephemeral environments and test automation.
  • Deployment to Production – There’s a boolean value (i.e. a checkbox the user selects) in the delivery pipeline used for deciding whether to deploy to production.
  • System Monitoring and Disaster Recovery – Using the AWS CloudWatch service, AWS provides us with detailed monitoring to notify us of instance errors or anomalies through statistics such as CPU utilization, Network IO, Disk utilization, etc. Using these solutions we’ve implemented an automated disaster recovery solution.
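The two-day cleanup policy mentioned above can be enforced with a small scheduled job. The following is only a rough sketch using today's AWS CLI; the "environment=test" tag filter and GNU date usage are assumptions, not the team's actual termination job:

#!/bin/bash
# Terminate tagged test instances that have been running for more than two days
cutoff=$(date -u -d '2 days ago' +%Y-%m-%dT%H:%M:%S)

aws ec2 describe-instances \
  --filters "Name=instance-state-name,Values=running" "Name=tag:environment,Values=test" \
  --query 'Reservations[].Instances[].[InstanceId,LaunchTime]' \
  --output text |
while read -r instance_id launch_time; do
  # ISO 8601 timestamps compare correctly as strings
  if [[ "$launch_time" < "$cutoff" ]]; then
    aws ec2 terminate-instances --instance-ids "$instance_id"
  fi
done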


A list of the AWS tools we utilized is enumerated below.

Tool: AWS EC2
What is it? Cloud-based virtual hardware instances
Our Use: We use EC2 for all of our virtual hardware needs. All instances, from development to production are run on EC2

Tool: AWS S3
What is it? Cloud-based storage
Our Use: We use S3 as both a binary repository and a place to store successful build artifacts.

Tool:  AWS IAM
What is it? User-based access to AWS resources
Our Use: We create users dynamically and use their AWS access and secret access keys so we don’t have to store credentials as properties

Tool: AWS CloudWatch
What is it? System monitoring
Our Use: Monitors all instances in production. If an instance takes an abnormal amount of strain or shuts down unexpectedly, SNS sends an email to designated parties

Tool: AWS SNS
What is it? Email notifications
Our Use: When an environment is created or a deployment is run, SNS is used to send notifications to affected parties.

Tool: Cucumber
What is it? Acceptance testing
Our Use: Cucumber is used for testing at almost every step of the way. We use Cucumber to test infrastructure, deployments and application code to ensure correct functionality. Cucumber’s plain-English verbiage allows both technical personnel and customers to communicate using an executable test.

Tool: Liquibase
What is it? Automated database change management
Our Use: Liquibase is used for all database changesets. When a change is necessary within the database, it is made in a Liquibase changelog.xml file.

Tool: AWS CloudFormation
What is it? Templating language for orchestrating all AWS resources
Our Use: CloudFormation is used for creating a fully working Jenkins environment and Target environment. For instance, for the Jenkins environment, it creates the EC2 instance with CloudWatch monitoring alarms, an associated IAM user, and an SNS notification topic – everything required for Jenkins to build. This, along with Jenkins, makes up the major pieces of the infrastructure.

Tool: AWS SimpleDB
What is it? Cloud-based NoSQL database
Our Use: SimpleDB is used for storing dynamic property configuration and passing properties through the CD Pipeline. As part of the environment creation process, we store multiple values such as IP addresses that we need when deploying the application to the created environment.

Tool: Jenkins
What is it? Continuous Integration server that we use to implement a CD pipeline via the Build Pipeline plugin.
Our Use: Jenkins runs the CD pipeline which does the building, testing, environment creation and deploying. Since the CD pipeline is also code (i.e. configuration code), we version our Jenkins configuration.

Tool: Capistrano
What is it? Deployment automation
Our Use: Capistrano orchestrates and automates deployments. Capistrano is a Ruby-based deployment DSL that can be used to deploy to multiple platforms including Java, Ruby and PHP. It is called as part of the CD pipeline and deploys to the target environment.

Tool: Puppet
What is it? Infrastructure automation
Our Use: Puppet takes care of the environment provisioning. CloudFormation requests the environment and then calls Puppet to do the dynamic configuration. We configured Puppet to install, configure, and manage the packages, files and services.

Tool: Subversion
What is it? Version control system
Our Use: Subversion is the version control repository where every piece of the Manatee infrastructure is stored. This includes the environment scripts such as the Puppet modules, the CloudFormation templates, Capistrano deployment scripts, etc.

We applied the on-demand usability of the cloud with a proven continuous delivery approach to build an automated one click method for building and deploying software into scripted production environments.

In the blog series, I will describe the technical implementation of how we went about building this infrastructure into a complete solution for continuously delivering software. This series will consist of the following:

Part 2 of 6 – CD Pipeline: I will go through the technical implementation of the CD pipeline using Jenkins. I will also cover Jenkins versioning, pulling and pushing artifacts from S3, and Continuous Integration.

Part 3 of 6 – CloudFormation: I will go through a CloudFormation template we’re using to orchestrate the creation of AWS resources and to build the Jenkins and target infrastructure.

Part 4 of 6 – Dynamic Configuration: Will cover dynamic property configuration using SimpleDB

Part 5 of 6 – Deployment Automation: I will explain Capistrano in detail along how we used Capistrano to deploy build artifacts and run Liquibase database changesets against target environments

Part 6 of 6 – Infrastructure Automation: I will describe the features of Puppet in detail along with how we’re using Puppet to build and configure target environments – for which the software is deployed.

07-30-2012

NetFlix Unleashes Chaos Monkey – The First in its Simian Army

Today, NetFlix announced its first open source release in its Simian Army – the Chaos Monkey. The Chaos Monkey assumes that everything will fail…eventually. The Chaos Monkey runs, by default, on your Amazon Web Services’ (AWS) infrastructure and randomly terminates instances in Auto Scaling Groups. We wrote about some of the benefits of treating instances ephemerally a few months back.

Chaos Monkey

We’ve embraced the concept of “disposable environments” for years with our customers; it’s nice to see a company not only publicly embracing the principle, but providing tools for the community to use. Looking forward to seeing more releases in NetFlix’ Simian Army!

Continuous Delivery in the Cloud Case Study

A Case Study on using 100% Cloud-based Resources with Automated Software Delivery

We help – typically large – organizations create one-click software delivery systems so that they can deliver software in a more rapid, reliable and repeatable manner (AKA Continuous Delivery). The only way this works is when Development works with Operations. As has been written elsewhere in this series, this means changing the hearts and minds of people because most organizations are used to working in ‘siloed’ environments. In this entry, I focus on implementation by describing a real-world case study in which a team of Systems and Software Engineers brought Continuous Delivery Operations to the Cloud.

For years, we’ve helped customers in Continuous Integration and Testing, so more of our work was with Developers and Testers. Several years ago, we hired a Sys Admin/Engineer/DBA who was passionate about automation. As a result of this, we began assembling multiple two-person “DevOps” teams consisting of a Software Engineer and a Systems Engineer, both of whom are big-picture thinkers and not just “Developers” or “Sys Admins”. These days, we put together these targeted teams of Continuous Delivery and Cloud experts with hands-on experience as Software Engineers and Systems Engineers so that organizations can deliver software as quickly and as often as the business requires.

A couple of years ago we already had a few people in the company who were experimenting with Cloud infrastructures, so we thought this would be a great opportunity to provide cloud-based delivery solutions. In this case study, I cover a project we are currently working on for a large organization. It is a new Java-based web services project, so we’ve been able to implement solutions using our recommended software delivery patterns rather than being constrained by legacy tools or decisions. However, as I note, we aren’t without constraints on this project. If I were you, I’d call “BS!” on any “case study” in which everything went flawlessly and assume it was an extremely small or a theoretical project in the author’s mind. This is the real deal. Enough said, on to the case study.

AWS Tools

Fast Facts

Industry: Healthcare, Public Sector
Profile: The customer is making available to all, free of charge, a series of software specifications and open source software modules that together make up an oncology-extended Electronic Health Record capability.
Key Business Issues: The customer wanted all team members to have “unencumbered” access to infrastructure resources without the usual “request and wait” queue-based procedures present in most organizations
Stakeholders: Over 100 people consisting of Developers, Testers, Analysts, Architects, and Project Management.
Solution: Continuous Delivery Operations in the Cloud
Key Tools/Technologies: Amazon Web Services – AWS (Elastic Compute Cloud (EC2), Simple Storage Service (S3), Elastic Block Storage (EBS), etc.), Jenkins, JIRA Studio, Ant, Ivy, Tomcat and PostgreSQL

The Business Problem
The customer was used to dealing with long drawn-out processes with Operations teams that lacked agility. They were accustomed to submitting Word documents via email to an Operations teams, attending multiple meetings and getting their environments setup weeks or months later. We were compelled to develop a solution that reduced or eliminated these problems that are all too common in many large organizations (Note: each problem is identified as a letter and number, for example: P1, and referred to later):


  1. Unable to deliver software to users on demand (P1)
  2. Queued requests for provisioned instances (P2)
  3. Unable to reprovision precise target environment configuration on demand (P3)
  4. Unable to provision instances on demand (P4)
  5. Configuration errors in target environments presenting deployment bottlenecks while Operations and Development teams troubleshoot errors (P5)
  6. Underutilized instances (P6)
  7. No visibility into purpose of instance (P7)
  8. No visibility into the costs of instance (P8)
  9. Users cannot terminate instances (P9)
  10. Increased Systems Operations personnel costs (P10)


Our Team
We put together a four-person team to create a solution for delivering software and managing the internal Systems Operations for this 100+ person project. We also hired a part-time Security expert. The team consists of two Systems Engineers and two Software Engineers focused on Continuous Delivery and the Cloud. One of the Software Engineers is the Solutions Architect/PM for our team.

Our Solution
We began with the end in mind based on the customer’s desire for unencumbered access to resources. To us, “unencumbered” did not mean without controls; it meant providing automated services over queued “request and wait for the Ops guy to fulfill the request” processes. Our approach is that every resource is in the cloud: Software as a Service (SaaS), Platform as a Service (PaaS) or Infrastructure as a Service (IaaS) to reduce operations costs (P10) and increase efficiency. In doing this, effectively all project resources are available on demand in the cloud. We have also automated the software delivery process to Development and Test environments and are working on one-click delivery to production. I’ve identified the problems we’re solving – from above – in parentheses (P1, P8, etc.). The solution includes:

  • On-Demand Provisioning – All hardware is provided via EC2’s virtual instances in the cloud, on demand (P2). We’ve developed a “Provisioner” (PaaS) that provides any authorized team member the capability to click a button and get their project-specific target environment (P3) in the AWS cloud – thus providing unencumbered access to hardware resources (P4). The Provisioner provides all authorized team members the capability to monitor instance usage (P6) and adjust accordingly. Users can terminate their own virtual instances (P9).
  • Continuous Delivery Solution so that the team can deliver software to users on demand (P1):
    • Automated build script using Ant – used to drive most of the other automation tools
    • Dependency Management using Ivy. We will be adding Sonatype Nexus
    • Database Integration/Change using Ant and Liquibase
    • Automated Static Analysis using Sonar (with CheckStyle, FindBugs, JDepend, and Cobertura)
    • Test framework hooks for running JUnit, etc.
    • Reusing custom remote-deployment Ant scripts that use Java Secure Channel and handle Web container configuration. However, we will be starting a process of using a more robust tool such as ControlTier to perform deployment
    • Automated document generation using Grand, SchemaSpy (ERDs) and UMLGraph
    • Continuous Integration server using Hudson
    • Continuous Delivery pipeline system – we are customizing Hudson to emulate a Deployment Pipeline
  • Issue Tracking – We’re using the JIRA Studio SaaS product from Atlassian (P10), which provides issue tracking, version-control repository, online code review and a Wiki. We also manage the relationship with the vendor and perform the user administration including workflow management and reporting.
  • Development Infrastructure - There were numerous tools selected by the customer for Requirements Management and Test Management and Execution, including HP QC, LoadRunner, SoapUI, and Jama Contour. Many of these tools were installed onto the EC2 instances and managed by our team
  • Instance Management - Any authorized team member is able to monitor virtual instance usage by viewing a web-based dashboard (P6, P7, P8) we developed. This helps to determine instances that should no longer be in use or may be incurring unnecessary cost. There is a policy that test instances (e.g. Sprint Testing) are terminated at least every two weeks. This promotes ephemeral environments and test automation.
  • Deployment to Production – Much of the pre-production infrastructure is in place, but we will be adding some additional automation features to make it available to users in production (P1). The deployment sites are unique in that we aren’t hosting a single instance used by all users and it’s likely the software will be installed at each site. One plan is to deploy separate instances to the cloud or to virtual instances that are shipped to the user centers

  • System Monitoring and Disaster Recovery – Using CloudKick to notify us of instance errors or anomalies. EC2 provides us with some monitoring as well. We will be implementing a more robust monitoring solution using Nagios or something similar in the coming months. Through automation and supporting process, we’ve implemented a disaster recovery solution.

Benefits
The benefits are primarily around removing the common bottlenecks from processes so that software can be delivered to users and team members more often. Also, we think our approach of providing on-demand services over queue-based requests increases agility and significantly reduces costs. Here are some of the benefits:

  • Deliver software more often – to users and internally (testers, managers, demos)
  • Deliver software more quickly – since the software delivery process is automated, we identify the SVN tag and click a button to deliver the software to any environment
  • Software delivery is rapid, reliable and repeatable. All resources can be reproduced with a single click – source code, configuration, environment configuration, database and network configuration are all checked in, versioned, and part of a single delivery system.
  • Increased visibility to environments and other resources – All preconfigured virtual hardware instances are available for any project member to provision without needing to submit forms or attend countless meetings

Tools
Here are some of the tools we are using to deliver this solution. Some of the tools were chosen by our team exclusively and some by other stakeholders on the project.

  • AWS EC2 - Cloud-based virtual hardware instances
  • AWS S3 – Cloud-based storage. We use S3 to store temporary software binaries and backups
  • AWS EBS – Elastic Block Storage. We use EBS to attach PostgreSQL data volumes
  • Ant – Build Automation
  • CloudKick – Real-time Cloud instance monitoring
  • ControlTier – Deployment Automation. Not implemented yet.
  • HP LoadRunner – Load Testing
  • HP Quality Center (QC) – Test Management and Orchestration
  • Ivy – Dependency Management
  • Jama Contour - Requirements Management
  • Jenkins – Continuous Integration Server
  • JIRA Studio - Issue Tracking, Code Review, Version-Control, Wiki
  • JUnit – Unit and Component Testing
  • Liquibase – Automated database change management
  • Nagios – or Zenoss. Not implemented yet
  • Nexus – Dependency Management Repository Manager (not implemented yet)
  • PostgreSQL – Database used by the Development team. We’ve written scripts that automate database change management
  • Provisioner (Custom Web-based) – Target Environment Provisioning and Virtual Instance Monitoring
  • Puppet – Systems Configuration Management
  • QTP – Test Automation
  • SoapUI – Web Services Test Automation
  • Sonar – code quality analysis (Includes CheckStyle, PMD, Cobertura, etc.)
  • Tomcat/JBoss – Web containers used by Development. We’ve written scripts to automate the deployment and container configuration

Solutions we’re in the process of Implementing
We’re less than a year into the project and have much more work to do. Here are a few projects we’re in the process of implementing or will be starting soon:

  • System Configuration Management – We’ve started using Puppet, but we are expanding how it’s being used in the future
  • Deployment Automation – The move to a more robust Deployment automation tool such as ControlTier
  • Development Infrastructure Automation – Automating the provisioning and configuration of tools such as HP QC in a cloud environment, etc.

What we would do Differently
Typically, if we were to start a Java-based project and recommend tools around testing, we might choose the following tools for testing, requirements and test management based on the particular need:

  • Selenium with SauceLabs
  • JIRA Studio for Test Management
  • JIRA Studio for Requirements Management
  • JMeter – or other open source tool – for Load Testing

However, like most projects, there are many stakeholders who have their preferred approach and tools they are familiar with, the same way our team does. Overall, we are pleased with how things are going so far and the customer is happy with the infrastructure and approach that is in place at this time. I could probably do another case study on dealing with multiple SaaS vendors, but I will leave that for another post.

Summary
There’s much more I could have written about what we’re doing, but I hope this gives you a decent perspective of how we’ve implemented a DevOps philosophy with Continuous Delivery and the Cloud and how this has led our customer to a more service-based, unencumbered and agile environment.