05-02-2014

List ALL the availability zones!

Have you ever tried to google for a list of AWS Regions or Availability Zones? I do it all the time. I need to figure out which AZs my ELB should go across, or want to try my new CloudFormation template in a different region to make sure I didn’t hardcode something I shouldn’t have. Of course, I usually blank out on how many zones each region has, and I always get my directions screwed up so I can never remember if Ireland is eu-west or eu-east (and while it is embarrassing to admit, I had no idea where São Paulo was before I started working with the AWS platform).

However, every time I google for a list of AZs, I remember that none of the AWS documentation gives a nice, easy list of all the regions and their availability zones. AWS does add new zones and regions pretty regularly, so a list might go out of date after a few months. Still, I end up looking for these every few days, so I wrote them all out, and then put it on the internet, so hopefully next time I google for “list of availability zones” this page pops up. Maybe it’ll help you too?

Virginia — US East 1
• us-east-1a
• us-east-1b
• us-east-1c
• us-east-1d
• us-east-1e

California — US West 1
• us-west-1a
• us-west-1b
• us-west-1c

Oregon — US West 2
• us-west-2a
• us-west-2b
• us-west-2c

Ireland — EU West 1
• eu-west-1a
• eu-west-1b
• eu-west-1c

Singapore — AP Southeast 1
• ap-southeast-1a
• ap-southeast-1b

Tokyo — AP Northeast 1
• ap-northeast-1a
• ap-northeast-1b
• ap-northeast-1c

Sydney — AP Southeast 2
• ap-southeast-2a
• ap-southeast-2b

São Paulo — SA East 1
• sa-east-1a
• sa-east-1b
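
If you’d rather generate this list yourself (and keep it current as AWS adds regions), the AWS CLI can do it. This is a minimal sketch that assumes the CLI is installed and configured with valid credentials; note that describe-availability-zones only returns the zones your particular account can use in each region.

for region in $(aws ec2 describe-regions --query 'Regions[].RegionName' --output text); do
  echo "${region}:"
  aws ec2 describe-availability-zones --region "${region}" \
    --query 'AvailabilityZones[].ZoneName' --output text
done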

03-02-2014

Creating a Secure Deployment Pipeline in Amazon Web Services

Many organizations require a secure infrastructure. I’ve yet to meet a customer that says that security isn’t a concern. But, the decision on “how secure?” should be closely associated with a risk analysis for your organization.

Since Amazon Web Services (AWS) is often referred to as a “public cloud”, people sometimes infer that “public” must mean it’s “out in the public” for all to see. I’ve always seen “public/private clouds” as an unfortunate use of terms. In this context, public means more like “Public Utility”. People often interpret “private clouds” to be inherently more secure. Assuming that “public cloud” = less secure and “private cloud” = more secure couldn’t be further from the truth. Like most things, it’s all about how you architect your infrastructure. While you can define your infrastructure to have open access, AWS provides many tools to create a truly secure infrastructure while restricting access to only authorized users.

I’ve created an initial list of many of the practices we use. We don’t employ all these practices in all situations, as it often depends on our customers’ particular security requirements. But, if someone asked me “How do I create a secure AWS infrastructure using a Deployment Pipeline?”, I’d offer some of these practices in the solution. I’ll be expanding these over the next few weeks, but I want to start with some of our practices.

AWS Security

* After initial AWS account creation and login, configure IAM so that there’s no need to use the AWS root account
* Apply least privilege to all IAM accounts. Be very careful about who gets Administrator access.
* Enable all IAM password rules
* Enable MFA for all users
* Secure all data at rest
* Secure all data in transit
* Put all AWS resources in a Virtual Private Cloud (VPC).
* No EC2 Key Pairs should be shared with others. Same goes for Access Keys.
* Only open required ports to the Internet. For example, with the exception of, say, port 80, no security group should have a CIDR source of 0.0.0.0/0. The bastion host might have port 22 (SSH) open, but you should use a CIDR block to limit access to specific subnets. Using a VPC is part of a solution to eliminate Internet access. No canonical environments should have SSH/RDP access. (A CLI sketch of locking down a security group follows this list.)
* Use IAM to limit access to specific AWS resources and/or remove/limit AWS console access
* Apply a bastion host configuration to reduce your attack profile
* Use IAM Roles so that there’s no need to configure Access Keys on the instances
* Use resource-level permissions in EC2 and RDS
* Use SSE to secure objects in S3 buckets
* Share initial IAM credentials with others through a secure mechanism (e.g. AES-256 encryption)
* Use and monitor AWS CloudTrail logs
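
As a hedged illustration of the port rule above, here’s what locking a security group down might look like with the AWS CLI; the group ID and CIDR block are placeholders, not values from our infrastructure.

# SSH only from a specific internal subnet, not the whole Internet
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx \
  --protocol tcp --port 22 --cidr 10.0.0.0/16

# Port 80 is the exception that stays open to the world
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx \
  --protocol tcp --port 80 --cidr 0.0.0.0/0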

Deployment Pipeline

A deployment pipeline is a staged process in which the complete software system is built and tested with every change. Team members receive feedback as the system completes each stage. With most customers, we usually construct between four and seven deployment pipeline stages, and the pipeline only moves to the next stage if the previous stages were successful. If a stage fails, the whole pipeline instance fails. The first stage (often referred to as the “Commit Stage”) will usually take no more than 10 minutes to complete. Other stages may take longer than this. Most stages require no human intervention as the software system goes through more extensive testing on its way to production. With a deployment pipeline, software systems can be released at any time the business chooses to do so. Here are some of the security-based practices we employ in constructing a deployment pipeline.

* Automate everything: Networking (VPC, Route 53), Compute (EC2), Storage, etc. All AWS automation should be defined in CloudFormation. All environment configuration should be defined using infrastructure automation scripts – such as Chef, Puppet, etc.
* Version Everything: Application Code, Configuration, Infrastructure and Data
* Manage your binary dependencies. Be specific about binary version numbers. Ensure you have control over these binaries.
* Lock down pipeline environments. Do not allow SSH/RDP access to any environment in the deployment pipeline
* For projects that require it, use permissions on the CI server or deployment application to limit who can run deployments in certain environments – such as QA, Pre-Production and Production. When you have a policy in which all changes are applied through automation and environments are locked down, this usually becomes less of a concern. But, it can still be a requirement for some teams.
* Use the Disposable Environments pattern – instances are terminated once every few days. This approach reduces the attack profile (see the pipeline-stage sketch after this list)
* Log everything outside of the EC2 instances (so that the logs can be accessed later). Ensure these log files are encrypted (e.g. stored securely in S3)
* All canonical changes are only applied through automation that is part of the deployment pipeline. This includes application, configuration, infrastructure and data changes. Infrastructure patch management would be a part of the pipeline just like any other software system change.
* No one has access to nor can make direct changes to pipeline environments
* Create high-availability systems using Multi-AZ, Auto Scaling, Elastic Load Balancing and Route 53
* For non-Admin AWS users, only provide access to AWS through a secure Continuous Integration (CI) server or a self-service application
* Use Self-Service Deployments and give developers full SSH/RDP access to their self-service deployment. Only their particular EC2 Key Pair can access the instance(s) associated with the deployment. Self-Service Deployments can be defined in the CI server or a lightweight self-service application.
* Provide capability for any authorized user to perform a self-service deployment with full SSH/RDP access to the environment they created (while eliminating outside access)
* Run two active environments – We’ve yet to do this for customers, but if you want to eliminate all access to the canonical production environment, you might choose to run two active environments at once. Engineers can then access the non-production environment to troubleshoot a problem; because it has the exact same configuration and data, the troubleshooting is accurate.
* Run automated infrastructure tests to test for security vulnerabilities (e.g. cross-site scripting, SQL injections, etc.) with every change committed to the version-control repository as part of the deployment pipeline.
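
To make the Disposable Environments idea concrete, here is a rough sketch of a pipeline stage that stands up a CloudFormation stack, tests it, and tears it down again. The stack name, template and test command are hypothetical; BUILD_NUMBER is the variable Jenkins exposes to its jobs.

#!/bin/bash
set -e
STACK="target-${BUILD_NUMBER}"

aws cloudformation create-stack --stack-name "${STACK}" \
  --template-body file://production.template
aws cloudformation wait stack-create-complete --stack-name "${STACK}"

# run acceptance/security tests against the freshly built environment
cucumber features/infrastructure.feature host="${TARGET_HOST}" || TEST_RESULT=$?

# the environment is disposable: delete it whether the tests passed or not
aws cloudformation delete-stack --stack-name "${STACK}"
exit ${TEST_RESULT:-0}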

FAQ

* What is a canonical environment? It’s your system of record. You want your canonical environment to be solely defined in source code and versioned. Any change to the canonical system that affects everyone should only be applied through automation. While you can use a self-service deployment to get a copy of the canonical system, any direct change you make to that environment is isolated and never made part of the canonical system unless code is committed to the version-control repository.
* How can I troubleshoot if I cannot directly access canonical environments? Using a self-service deployment, you can usually determine the cause of the problem. If it’s a data-specific problem, you might import a copy of the production database. If this isn’t possible for time or security reasons, you might run multiple versions of the application at once.
* Why should we dispose of environments regularly? Two primary reasons. The first is to reduce your attack profile (i.e. if environments are always going up and down, it’s more difficult to home in on specific resources). The second reason is that it ensures that all team members are used to applying all canonical changes through automation and not relying on environments to always be up and running somewhere.
* Why should we lock down environments? To prevent people from making disruptive environment changes that don’t go through the version-control repository.

03-15-2013

Getting to know the Chaos Monkey

Moving your infrastructure to the cloud changes the way you think about a lot of things. With the attitude of abundance that comes with having unlimited instances at your command, you can do all sorts of cool things that would be prohibitive with actual hardware: elastic scaling of infrastructure, transient environments, blue/green deployments, etc. Some things that were just plain bad ideas with real servers have become best practices in the cloud – like just randomly turning off your production servers to see what happens.

Chaos Monkey

One of the major concepts of working in the cloud is the idea of “designing for failure.” It’s mentioned in AWS’s Cloud Best Practices, and myriad different blog entries. The main idea behind designing for failure is accepting that things are going to go wrong, and making sure your infrastructure is set up to handle that. But it’s one thing to say that your infrastructure is resilient; it’s quite another to prove it by running tools whose sole purpose is to tear your infrastructure apart.

There are a bunch of different tools out there that do this (including Stelligent’s Havoc); probably the best known is Netflix’s Chaos Monkey. It’s available for free and is open source. On the downside, it’s not the easiest tool to get going, but hopefully this post can alleviate some of that.

Chaos Monkey is free-to-use and open source, and available on Netflix’s Simian Army GitHub page. Once targeted at an Auto Scaling Group (ASG), Chaos Monkey will randomly delete EC2 instances, challenging your application to recover. Chaos Monkey is initially configured to only operate during business hours, letting you see how resilient your architecture is in controlled conditions, when you’re in the office; as opposed to seeing it happen in the wild, when you’re asleep in bed.

The Chaos Monkey quick start guide shows you how to set up Launch Configs, Auto Scaling Groups, and Simple DB domains using the AWS CLI tools. Depending on your amount of patience and free time, you might be able to make it through those. However, Netflix has another tool, Asgard, which makes setting up all those things a cinch, and [we have a blog post that makes setting up Asgard a cinch], so for the purposes of this explanation, we’re going to assume you’re using Asgard.

As Chaos Monkey will be going in and killing EC2 instances, we highly recommend working with it in a contained environment until you figure out how you’d like to leverage it in your organization. So it’s best to at least set up a new Auto Scaling group, but ideally use an account that you’re not hosting your production instances with, at first.

The first thing you need to do once you have Asgard set up is define an Application for it to use. Select the Apps menu and choose Create New Application. Create a new Standalone Application called MonkeyApp, and enter your name and email address and click Create New Application.

With your new application set up, you’ll need to create an auto-scaling group by going to the Cluster Menu and selecting Auto Scaling Groups, and then hitting the Create New Auto Scaling Group button. Select monkeyapp from the application dropdown, then enter 3 for all the instance counts fields (desired, min, max). The defaults are fine for everything else, so click Create New Autoscaling Group at the bottom of the page.

Once the auto-scaling group is running, you’ll see it spin up EC2 instances to match your ASG sizing. If you were to terminate these instances manually, replacements would spin up in their place within a few minutes. In this way, you can be your own Chaos Monkey, inflicting targeted strikes against your application’s infrastructure.
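
If you want to try that manually before turning the monkey loose, terminating an instance is a one-liner with today’s AWS CLI (the instance id below is a placeholder; back in the day you’d have used the ec2-api-tools or the console instead):

aws ec2 terminate-instances --instance-ids i-1a2b3c4d
# within a few minutes the Auto Scaling group notices and launches a replacement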

Feel free to go give that a shot. Of course, why do anything yourself if you can just make the computer do it for you?

To set up Chaos Monkey, the first thing you’ll need to do is set up an Amazon Simple DB domain for Chaos Monkey to use. In Asgard, it’s a cinch: just go to SDB and hit Create New SimpleDB Domain. Call it SIMIAN_ARMY and hit the Create button.
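
(If you’d rather skip Asgard for this step, the modern AWS CLI can create the same domain with a single command:)

aws sdb create-domain --domain-name SIMIAN_ARMY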

Now comes the finicky part of setting up Chaos Monkey on an EC2 instance. Chaos Monkey has a history of not playing well with OpenJDK, and overall getting it installed is more of an exercise in server administration than applying cloud concepts, so we’ve provided a CloudFormation template which will fast forward you to the point where you can just play around.

Once you have Chaos Monkey installed, you’ll need to make a few changes to the configuration to make it work:

vi src/main/resources/client.properties

Enter your AWS account and secret keys, as well as change the AWS region if necessary.

vi src/main/resources/simianarmy.properties

Uncomment the isMonkeyTime key and set it to true. By default, Chaos Monkey only operates during business hours; this setting overrides that restriction, which is handy because when you’re playing around with Chaos Monkey, it may not be during business hours.

vi src/main/resources/chaos.properties

simianarmy.chaos.leashed=false
simianarmy.chaos.ASG.enabled=true
simianarmy.chaos.ASG.maxTerminationsPerDay=100
simianarmy.chaos.ASG.<monkey-target>.enabled=true
simianarmy.chaos.ASG.<monkey-target>.probability=6.0

(Replace <monkey-target> with the name of your auto-scaling group, likely monkeyapp if you’ve been following the directions outlined above.) This is the fun part of the Chaos Monkey config. It unleashes the Chaos Monkey (otherwise it would just say that it thought about taking down an instance, instead of actually doing it). The probability is the daily probability that it’ll kill an instance: 1.0 means an instance will definitely be killed at some point today, and 6.0 means an instance will be killed on the first run. And let’s bump up the max number of terminations per day so we can see Chaos Monkey going nuts.

It’s also probably a good idea to turn off Janitor and VolumeTaggingMonkey, since they’ll just clutter up the logs with messages saying they’re not doing anything at all.

vi src/main/resources/janitor.properties
vi src/main/resources/volumeTagging.properties

and set simianarmy.janitor.enabled and simianarmy.volumeTagging.enabled to false in the respective files.

Once you’ve configured everything, the following command will kick off the SimianArmy application:

./gradlew jettyRun

After bootstrapping itself, it should identify your auto-scaling group, pick a random instance in it, and terminate it. Your auto-scaling group will respond by spinning up a new instance.

But then what? Did your customers lose all the data on the form they just filled out, or were they sent over to another instance? Did their streaming video cut out entirely, or did quality just degrade momentarily? Did your application respond to the outage seamlessly, or was your customer impacted?

These are the issues that Chaos Monkey will show you are occurring, and you can identify where you haven’t been designing for failure.

(NOTE: When you’re all done playing around with Chaos Monkey, you’ll need to change your monkeyapp Auto Scaling Group instance counts to 0, otherwise AWS will keep those instances up, which could result in higher usage fees than you’re used to seeing. In Asgard, select Cluster > Auto Scaling Groups > monkeyapp > Edit and set all instance counts to zero, and AWS will terminate your test instances. If you’d like to come back later and play around, you can just shut down your Chaos Monkey and Asgard instances and turn them back on when you’re ready; otherwise you can just delete the CloudFormation stacks entirely and that’ll clean up everything for you.)

A quick look at Netflix’s Asgard

Netflix OSS

Asgard is an open source application from Netflix that makes it easier to work with Amazon Web Services. It offers lots of functionality that isn’t accessible via AWS’s normal web interface: for several AWS features, Asgard removes a lot of the cryptic command line tools required. It’s provided free of charge by Netflix, and the source is available on GitHub.

Asgard is a great tool to use when you’re beginning to understand certain concepts in AWS. A lot of the more powerful features available working in the cloud are handled by CLI tools, APIs or CloudFormation. These are great if you’re developing applications to take advantage of the platform, but can make understanding the concepts difficult when you’re exploring them for the first time.

For example, one of the biggest advantages of working in the cloud is you can automatically scale your hardware to match usage. AWS does this with Launch Configurations and Auto Scaling Groups, a powerful feature that is a pain to implement: getting auto-scaling groups set up correctly involves all sorts of CLI tools or calling into APIs. With Asgard (and with our tool, Stelligent Havoc), it’s a couple clicks and a few fields to fill out.

Another painful AWS feature is SimpleDB, which is needed to run another Netflix Open Source Tool, Chaos Monkey (which is the focus of another blog entry). Setting up a Simple DB via the AWS CLI tools requires some tricky scripting and a bit of luck. With Asgard, you just have to punch in the name and hit create.

Asgard has other benefits besides just making complicated AWS functions easier. If you have an organization where multiple developers need access to work in your AWS account, but you want to keep your access and private keys under wraps, you can enter them into Asgard and give your developers access there, empowering them but mitigating the security risks.

Asgard also provides a way to log the changes being made to your AWS environment. When using the CLI tools or API, it’s on you to make sure that all activities are appropriately documented. With Asgard, you get all that for free.

Also, if you’d like to take advantage of the hidden access keys or logging without having to operate through the web interface, Asgard offers a REST API you can use to drive it programmatically.

Asgard is available as a self-contained war — you can download it and have it running on your machine in a few minutes. Also available is a war you can place in your already running servlet container. There are a couple of gotchas you should watch out for, so make sure you read the troubleshooting page.

Or if you are the instant-gratification type, we’ve developed a CloudFormation template you can use which will set up Asgard in AWS for you in ten minutes; all you have to do is enter a few simple parameters. Our template uses Chef scripts to set up Tomcat and then install the Asgard web application. The scripts are open source and available on Stelligent’s GitHub page. All you have to do is enter a username and password for your Asgard installation, and the name of the key pair to use for the EC2 instance. (You just need the key pair name; you can leave out the .pem.) If you’ve never set up a key pair, check out these directions on the AWS Documentation site.

After you get Asgard running, you’ll be prompted for your access key and secret key (which you can find on your AWS Security Credentials page) and then it’ll take a few minutes to start up. If it takes a long time, check the Asgard troubleshooting page.

Once the setup is complete, you’ll be able to start taking advantage of the powerful AWS features, without all the hassle of dealing with the command line.

10-08-2012

Continuous Delivery in the Cloud: Infrastructure Automation (Part 6 of 6)

In part 1 of this series, I introduced the Continuous Delivery (CD) pipeline for the Manatee Tracking application. In part 2, I went over how we use this CD pipeline to deliver software from checkin to production. In part 3, we focused on how CloudFormation is used to script the virtual AWS components that create the Manatee infrastructure. Then in part 4, we focused on a “property file less” environment by dynamically setting and retrieving properties. Part 5 explained how we use Capistrano for scripting our deployment. A list of topics for each of the articles is summarized below:

Part 1: Introduction – Introduction to continuous delivery in the cloud and the rest of the articles;
Part 2: CD Pipeline – In-depth look at the CD Pipeline;
Part 3: CloudFormation – Scripted virtual resource provisioning;
Part 4: Dynamic Configuration – “Property file less” infrastructure;
Part 5: Deployment Automation – Scripted deployment orchestration;
Part 6: Infrastructure Automation – What you’re reading now;

In this part of the series, I am going to show how we use Puppet in combination with CloudFormation to script our target environment infrastructure, preparing it for a Manatee application deployment.

What is Puppet?

Puppet is a Ruby-based infrastructure automation tool. Puppet is primarily used for provisioning environments and managing configuration. Puppet is built to support multiple operating systems, making your infrastructure automation cross-platform.

How does Puppet work?

Puppet uses a library called Facter which collects facts about your system. Facter returns details such as the operating system, architecture, IP address, etc. Puppet uses these facts to make decisions for provisioning your environment. Below is an example of the facts returned by Facter.

# Facter
architecture => i386
...
ipaddress => 172.16.182.129
is_virtual => true
kernel => Linux
kernelmajversion => 2.6
...
operatingsystem => CentOS
operatingsystemrelease => 5.5
physicalprocessorcount => 0
processor0 => Intel(R) Core(TM)2 Duo CPU     P8800  @ 2.66GHz
processorcount => 1
productname => VMware Virtual Platform
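
You can also ask Facter for individual facts straight from the command line, which is handy when you’re writing or debugging a module (assuming Facter is installed and on your PATH):

facter operatingsystem
# => CentOS
facter ipaddress
# => 172.16.182.129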

Puppet uses the operating system fact to decide the service name, as shown below:


case $operatingsystem {
  centos, redhat: {
    $service_name = 'ntpd'
    $conf_file    = 'ntp.conf.el'
  }
}

With this case statement, if the operating system is either centos or redhat, the service name ntpd and the configuration file ntp.conf.el are used.

Puppet is declarative by nature. Inside a Puppet module you define the end state the environment should be in after the Puppet run. Puppet enforces this state during the run. If at any point the environment cannot be brought into the desired state, the Puppet run fails.

Anatomy of a Puppet Module

To script the infrastructure, Puppet uses modules for organizing related code to perform a specific task. A Puppet module has multiple subdirectories that contain resources for performing the intended task. These directories are described below:

manifests/: Contains the manifest class files for defining how to perform the intended task
files/: Contains static files that the node can download during the installation
lib/: Contains plugins
templates/: Contains templates which can be used by the module’s manifests
tests/: Contains tests for the module

Puppet also uses a site-wide manifest, site.pp, to manage multiple modules together, and another manifest, default.pp, to define what to install on each node.

How to run Puppet

Puppet can be run using either a master agent configuration or a solo installation (puppet apply).

Master Agent: With a master agent installation, you configure one main Puppet master node which manages and configures all of your agent nodes (target environments). The master initiates the installation of the agent and manages it throughout its lifecycle. This model enables you to roll out infrastructure changes to your agents in parallel by controlling the master node.

Solo: In a solo Puppet run, it’s up to the user to place the desired Puppet modules on the target environment. Once the modules are on the target environment, the user needs to run puppet apply --modulepath=/path/to/modules/ /path/to/site.pp. Puppet will then provision the server with the provided modules and site.pp without relying on another node.
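
As a minimal sketch of a solo run, assuming the module tree and manifests have already been copied to /etc/puppet on the target instance (the paths are illustrative, not required by Puppet):

sudo puppet apply --modulepath=/etc/puppet/modules /etc/puppet/manifests/site.pp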

Why do we use Puppet?

We use Puppet to script and automate our infrastructure — making our environment provisioning repeatable, fully automated, and less error prone. Furthermore, scripting our environments gives us complete control over our infrastructure and the ability to terminate and recreate environments as often as we choose.

Puppet for Manatees

In the Manatee infrastructure, we use Puppet for provisioning our target environments. I am going to go through our manifests and modules while explaining their use and purpose. In our Manatee infrastructure, we create a new target environment as part of the CD pipeline – discussed in part 2 of the series, CD Pipeline. Below I provide a high-level summary of the environment provisioning process:

1. CloudFormation dynamically creates a params.pp manifest with AWS variables
2. CloudFormation runs puppet apply as part of UserData
3. Puppet runs the modules defined in hosts/default.pp.
4. Cucumber acceptance tests are run to verify the infrastructure was provisioned correctly.

Now that we know at a high level what’s being done during environment provisioning, let’s take a deeper look at the scripts themselves. The actual scripts can be found here: Puppet

First we will start off with the manifests.

The site.pp (shown below) serves two purposes. It loads the other manifests default.pp, params.pp and also sets stages pre, main and post.


import "hosts/*"
import "classes/*"

stage { [pre, post]: }
Stage[pre] -> Stage[main] -> Stage[post]

These stages are used to define the order in which Puppet modules should be run. If a Puppet module is assigned to the pre stage, it will run before Puppet modules assigned to main or post. Moreover, if stages aren’t defined, Puppet will determine the order of execution itself. The default.pp manifest (referenced below) shows how stages are assigned when executing Puppet modules.


node default {
  class { "params": stage => pre }
  class { "java": stage => pre }
  class { "system": stage => pre }
  class { "tomcat6": stage => main }
  class { "postgresql": stage => main }
  class { "subversion": stage => main }
  class { "httpd": stage => main }
  class { "groovy": stage => main }
}

The default.pp manifest also defines which Puppet modules to use for provisioning the target environment.

params.pp (shown below), loaded from site.pp, is dynamically created using CloudFormation. params.pp is used for setting AWS property values that are used later in the Puppet modules.


class params {
  $s3_bucket = ''
  $application_name = ''
  $hosted_zone = ''
  $access_key = ''
  $secret_access_key = ''
  $jenkins_internal_ip = ''
}

Now that we have an overview of the manifests used, let’s take a look at the Puppet modules themselves.

In our java module, which is run in the pre stage, we are running a simple installation using packages. This is easily dealt with in Puppet by using the package resource. This relies on Puppet’s knowledge of the operating system and the package manager. Puppet simply installs the package that is declared.


class java {
  package { "java-1.6.0-openjdk": ensure => "installed" }
}

The next module we’ll discuss is system. System is also run during the pre stage and is used for the setup of all the extra operations that don’t necessarily need their own module. These actions include setting up general packages (gcc, make, etc.), installing ruby gems (AWS sdk, bundler, etc.), and downloading custom scripts used on the target environment.


class system {

  include params

  $access_key = $params::access_key
  $secret_access_key = $params::secret_access_key

  Exec { path => '/usr/bin:/bin:/usr/sbin:/sbin' }

  package { "gcc": ensure => "installed" }
  package { "mod_proxy_html": ensure => "installed" }
  package { "perl": ensure => "installed" }
  package { "libxslt-devel": ensure => "installed" }
  package { "libxml2-devel": ensure => "installed" }
  package { "make": ensure => "installed" }

  package {"bundler":
    ensure => "1.1.4",
    provider => gem
  }

  package {"trollop":
    ensure => "2.0",
    provider => gem
  }

  package {"aws-sdk":
    ensure => "1.5.6 ",
    provider => gem,
    require => [
      Package["gcc"],
      Package["make"]
    ]
  }

  file { "/home/ec2-user/aws.config":
    content => template("system/aws.config.erb"),
    owner => 'ec2-user',
    group => 'ec2-user',
    mode => '500',
  }

  define download_file($site="",$cwd="",$creates=""){
    exec { $name:
      command => "wget ${site}/${name}",
      cwd => $cwd,
      creates => "${cwd}/${name}"
    }
  }

  download_file {"database_update.rb":
    site => "https://s3.amazonaws.com/sea2shore",
    cwd => "/home/ec2-user",
    creates => "/home/ec2-user/database_update.rb",
  }

  download_file {"id_rsa.pub":
    site => "https://s3.amazonaws.com/sea2shore/private",
    cwd => "/tmp",
    creates => "/tmp/id_rsa.pub"
  }

  exec {"authorized_keys":
    command => "cat /tmp/id_rsa.pub >> /home/ec2-user/.ssh/authorized_keys",
    require => Download_file["id_rsa.pub"]
    }
  }

First I want to point out that at the top we are specifying to include params. This enables the system module to access the params.pp file. This way we can use the properties defined in params.pp.


include params

$access_key = $params::access_key
$secret_access_key = $params::secret_access_key

This enables us to define the parameters in one central location and then reference them from other modules.

As we move through the script, we use the package resource much as in previous modules. For each rubygem we use the package resource and explicitly tell Puppet to use the gem provider. You can specify other providers like rpm and yum.
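
Under the hood, the gem provider shells out to RubyGems, so the three gem packages in this module are roughly equivalent to running the following by hand:

gem install bundler -v 1.1.4
gem install trollop -v 2.0
gem install aws-sdk -v 1.5.6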

We use the file resource to create files from templates.


AWS.config(
  :access_key_id => "<%= "#{access_key}" %>",
  :secret_access_key => "<%= "#{secret_access_key}" %>"
)

In the aws.config.erb template (referenced above) we are using the properties defined in params.pp for dynamically creating an aws.config credential file. This file is then used by our database_update.rb script for connecting to S3.

Speaking of the database_update.rb script, we need to get it on the target environment. To do this, we define a download_file resource.


define download_file($site="",$cwd="",$creates=""){
  exec { $name:
    command => "wget ${site}/${name}",
    cwd => $cwd,
    creates => "${cwd}/${name}"
  }
}

This creates a new resource for Puppet to use. Using this we are able to download both the database_update.rb and id_rsa.pub public SSH key.

As a final step for setting up the system, we execute a bash line for copying the id_rsa.pub contents into the authorized_keys file for the ec2-user. This enables clients with the corresponding id_rsa private key to ssh into the target environment as ec2-user.
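
In other words, once Puppet has appended the public key, anyone holding the matching private key can connect with a plain ssh command (the IP address here is a placeholder):

ssh -i ~/.ssh/id_rsa ec2-user@54.0.0.10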

The Manatee infrastructure uses Apache for the web server, Tomcat for the app server, and PostgreSQL for its database. Puppet sets these up as part of the main stage, meaning they run in order after the pre stage modules are run.

In our httpd module, we are performing steps discussed previously: the httpd package is installed and a new configuration file is created from a template.


class httpd {
  include params

  $application_name = $params::application_name
  $hosted_zone = $params::hosted_zone

  package { 'httpd':
    ensure => installed,
  }

  file { "/etc/httpd/conf/httpd.conf":
    content => template("httpd/httpd.conf.erb"),
    require => Package["httpd"],
    owner => 'ec2-user',
    group => 'ec2-user',
    mode => '664',
  }

  service { 'httpd':
    ensure => running,
    enable => true,
    require => [
      Package["httpd"],
      File["/etc/httpd/conf/httpd.conf"]],
    subscribe => Package['httpd'],
  }
}

The new piece of functionality used in our httpd module is service. service allows us to define the state the httpd service should be in at the end of our run. In this case, we are declaring that it should be running.

The Tomcat module again uses package to define what to install and service to declare the end state of the tomcat service.


class tomcat6 {

  Exec { path => '/usr/bin:/bin:/usr/sbin:/sbin' }

  package { "tomcat6":
    ensure => "installed"
  }

  $backup_directories = [
    "/usr/share/tomcat6/.sarvatix/",
    "/usr/share/tomcat6/.sarvatix/manatees/",
    "/usr/share/tomcat6/.sarvatix/manatees/wildtracks/",
    "/usr/share/tomcat6/.sarvatix/manatees/wildtracks/database_backups/",
    "/usr/share/tomcat6/.sarvatix/manatees/wildtracks/database_backups/backup_archive",
  ]

  file { $backup_directories:
    ensure => "directory",
    owner => "tomcat",
    group => "tomcat",
    mode => 777,
    require => Package["tomcat6"],
  }

  service { "tomcat6":
    enable => true,
    require => [
      File[$backup_directories],
      Package["tomcat6"]],
    ensure => running,
  }
}

The tomcat6 module uses the file resource differently than previous modules: it uses file for creating directories. This is defined using ensure => "directory".

In the postgresql module, we are using the package resource for installing PostgreSQL, building files from templates using the file resource, performing bash executions with exec, and declaring the intended state of the PostgreSQL service using the service resource.


class postgresql {

  include params

  $jenkins_internal_ip = $params::jenkins_internal_ip

  Exec { path => '/usr/bin:/bin:/usr/sbin:/sbin' }

  define download_file($site="",$cwd="",$creates=""){
    exec { $name:
      command => "wget ${site}/${name}",
      cwd => $cwd,
      creates => "${cwd}/${name}"
    }
  }

  download_file {"wildtracks.sql":
    site => "https://s3.amazonaws.com/sea2shore",
    cwd => "/tmp",
    creates => "/tmp/wildtracks.sql"
  }

  download_file {"createDbAndOwner.sql":
    site => "https://s3.amazonaws.com/sea2shore",
    cwd => "/tmp",
    creates => "/tmp/createDbAndOwner.sql"
  }

  package { "postgresql8-server":
    ensure => installed,
  }

  exec { "initdb":
    command => "service postgresql initdb",
    require => Package["postgresql8-server"]
  }

  file { "/var/lib/pgsql/data/pg_hba.conf":
    content => template("postgresql/pg_hba.conf.erb"),
    require => Exec["initdb"],
    owner => 'postgres',
    group => 'postgres',
    mode => '600',
  }

  file { "/var/lib/pgsql/data/postgresql.conf":
    content => template("postgresql/postgresql.conf.erb"),
    require => Exec["initdb"],
    owner => 'postgres',
    group => 'postgres',
    mode => '600',
  }

  service { "postgresql":
    enable => true,
    require => [
      Exec["initdb"],
      File["/var/lib/pgsql/data/postgresql.conf"],
      File["/var/lib/pgsql/data/pg_hba.conf"]],
    ensure => running,
  }

  exec { "create-user":
    command => "echo CREATE USER root | psql -U postgres",
    require => Service["postgresql"]
  }

  exec { "create-db-owner":
    require => [
      Download_file["createDbAndOwner.sql"],
      Exec["create-user"],
      Service["postgresql"]],
    command => "psql < /tmp/createDbAndOwner.sql -U postgres"
  }

  exec { "load-database":
    require => [
      Download_file["wildtracks.sql"],
      Exec["create-user"],
      Service["postgresql"],
      Exec["create-db-owner"]],
    command => "psql -U manatee_user -d manatees_wildtrack -f /tmp/wildtracks.sql"
  }
}

In this module we are creating a new user on the PostgreSQL database:


exec { "create-user":
  command => "echo CREATE USER root | psql -U postgres",
  require => Service["postgresql"]
}

In this next section we download the latest Manatee database SQL dump.


download_file {"wildtracks.sql":
  site => "https://s3.amazonaws.com/sea2shore",
  cwd => "/tmp",
  creates => "/tmp/wildtracks.sql"
}

In the section below, we load the database with the SQL file. This builds our target environments with the production database content giving developers an exact replica sandbox to work in.

exec { "load-database":
  require => [
    Download_file["wildtracks.sql"],
    Exec["create-user"],
    Service["postgresql"],
    Exec["create-db-owner"]],
  command => "psql -U manatee_user -d manatees_wildtrack -f /tmp/wildtracks.sql"
  }
}

Lastly in our Puppet run, we install subversion and groovy on the target node. We could have just included these in our system module, but they seemed general purpose enough to create individual modules.

Subversion manifest:

class subversion {
  package { "subversion":
    ensure => "installed"
  }
}

Groovy manifest:

class groovy {
  Exec { path => '/usr/bin:/bin:/usr/sbin:/sbin' }

  define download_file($site="",$cwd="",$creates=""){
    exec { $name:
    command => "wget ${site}/${name}",
    cwd => $cwd,
    creates => "${cwd}/${name}"
    }
  }

  download_file {"groovy-1.8.2.tar.gz":
    site => "https://s3.amazonaws.com/sea2shore/resources/binaries",
    cwd => "/tmp",
    creates => "/tmp/groovy-1.8.2.tar.gz",
  }

  file { "/usr/bin/groovy-1.8.2/":
    ensure => "directory",
    owner => "root",
    group => "root",
    mode => 755,
    require => Download_file["groovy-1.8.2.tar.gz"],
  }

  exec { "extract-groovy":
    command => "tar -C /usr/bin/groovy-1.8.2/ -xvf /tmp/groovy-1.8.2.tar.gz",
    require => File["/usr/bin/groovy-1.8.2/"],
  }
}

The Subversion manifest is relatively straightforward, as we are using only the package resource. The Groovy manifest is slightly different: we download the Groovy tarball, place it on the filesystem, and then extract it.

We’ve gone through how the target environment is provisioned. We do however have one more task: testing. It’s not enough to assume that if Puppet doesn’t error out, everything got installed successfully. For this reason, we use Cucumber to do acceptance testing against our environment. Our tests check that services are running, configuration files are present, and the right packages have been installed.
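
The invocation from our pipeline looks like the one shown in part 4 of this series (Dynamic Configuration): Cucumber is pointed at the new instance and SSHes in to make its assertions.

cucumber features/production.feature host=${host} user=ec2-user key=/usr/share/tomcat6/.ssh/id_rsa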

Puppet allows us to completely script and version our target environments. Consequently, this enables us to treat environments as disposable entities. As a practice, we create a new target environment every time our CD pipeline is run. This way we are always deploying against a known state.

As our blog series is coming to a close, let’s recap what we’ve gone through. In the Manatee infrastructure we use a combination of CloudFormation for scripting AWS resources, Puppet for scripting target environments, Capistrano for deployment automation, SimpleDB and CloudFormation for dynamic properties, and Jenkins for coordinating all the resources into one cohesive unit for moving a Manatee application change from check-in to production in just a single click.

10-04-2012

Continuous Delivery in the Cloud: Deployment Automation (Part 5 of 6)

In part 1 of this series, I introduced the Continuous Delivery (CD) pipeline for the Manatee Tracking application. In part 2 I went over how we use this CD pipeline to deliver software from checkin to production. In part 3, we focused on how CloudFormation is used to script the virtual AWS components that create the Manatee infrastructure. Then in part 4, we focused on a “property file less” environment by dynamically setting and retrieving properties. A list of topics for each of the articles is summarized below:
Part 1: Introduction – Introduction to continuous delivery in the cloud and the rest of the articles;
Part 2: CD Pipeline – In-depth look at the CD Pipeline;
Part 3: CloudFormation – Scripted virtual resource provisioning;
Part 4: Dynamic Configuration – “Property file less” infrastructure;
Part 5: Deployment Automation – What you’re reading now;
Part 6: Infrastructure Automation – Scripted environment provisioning (Infrastructure Automation)

In this part of the series, I am going to show how we use Capistrano to script our deployments to target environments.

What is Capistrano?
Capistrano is an open source Ruby tool used for deploying web applications. It automates deploying to one or more servers. These deployments can include procedures like placing a war on a target server, database changes, starting services, etc.

A Capistrano script has several major parts:

  • Namespaces: Namespaces in Capistrano are used for differentiating tasks from other tasks with the same name. This is important if you create a library out of your Capistrano deployment configuration: you will want to make sure your tasks are unique. For instance, a typical name for a task is setup. You need to make sure that your setup task does not potentially interfere with another user’s custom setup task. By using namespaces, you won’t have this conflict.
  • Tasks: Tasks are used for performing specific operations. An example task would be setup. Inside the setup task you will generally prepare the server for subsequent steps to execute successfully like deleting the current application.
  • Variables: Variables in Capistrano are defined as ruby symbols. These are set initially and then referenced later on in the script.
  • Order of execution: Capistrano allows you to define the order of deployment execution. You do this with Capistrano’s built in feature after. With after you define the order of task execution during your Capistrano deployment.
  • Templates: Templates are files that have injected ruby snippets. These are used for dynamically building configuration files.
  • Roles: Roles define what part each server in your infrastructure plays in the deployment. Typical roles consist of db, web and app. Roles are then referenced inside your tasks to determine which server the task is run against.

Since Capistrano is a Ruby-based tool, you can inject Ruby methods and operations to enhance Capistrano. In our deployment we use Ruby to return property values from SimpleDB – as we discussed in part 4 of this series, Dynamic Configuration. This enables us to dynamically deploy to target servers.

How do you install Capistrano?

1. Capistrano is available as a rubygem. You simply type gem install capistrano on your Linux machine (assuming you have ruby and rubygems installed)
2. Type capify . in your project directory. This will create a Capfile, which is the main file that Capistrano needs, and a config/deploy.rb file (which is your actual Capistrano deployment script).

How do you run a Capistrano script?
You run Capistrano from the command line. From the same directory as your Capfile, type cap namespace:task, where namespace and task are your own namespace and task defined in your deploy.rb script. This will start your Capistrano deployment.
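
For example, with the deploy namespace used later in this post and the command-line variables described below, an invocation might look like the following (Capistrano 2.x syntax; the stack name and key path are placeholders):

cap -S stack=manatee-test -S ssh_key=/path/to/key.pem deploy:setup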

Why do we use Capistrano?
We use Capistrano in order to have a fully scripted, versioned deployment. Every step in our application deployment is scripted and fully automated – which reduces errors when deploying. This gives us complete control over our deployment and the ability to deploy whenever we are ready.

Capistrano for Manatees
In the Manatee deployment, we use Capistrano for deploying our Manatee tracking application to our target environment. I am going to go through each part of the deploy.rb and explain its use and purpose. In a deployment’s lifecycle, the deployment is run as part of the CD pipeline – discussed in part 2 of the series, CD Pipeline. I’ll first go through a high level summary of the deployment and then dive into more detail in the next section.

1. Variables are set, which includes returning several properties from SimpleDB
2. Roles are set: db, web and app are all set to the ip_address variable configured dynamically in Step #1
3. The order of execution is set to run the tasks in order
4. Tasks are executed
5. If all tasks are executed successfully, Capistrano signals deployment success.

Now that we know at a high level what’s being done during the deployment, lets take a deeper look at the inside of the script. The actual script can be found here: deploy.rb

Variables

Command line set
stack – Passed into SimpleDB to return the dynamically set property values
ssh_key – Used by the ssh_options variable to SSH into the target environment

Dynamically set
domain – Used by the application variable
artifact_bucket – Used to build the artifact_url variable
ip_address – Used to define the IP address of the target environment to SSH into
dataSourceUsername – Returns a value that is part of the wildtracks_config.properties file
dataSourcePassword – Returns a value that is part of the wildtracks_config.properties file
dataStorageFtpUsername – Returns a value that is part of the wildtracks_config.properties file
dataStorageFtpPassword – Returns a value that is part of the wildtracks_config.properties file

Hardcoded
user – The user to SSH into the target box as
use_sudo – Define whether to prepend every command with sudo or not
deploy_to – Defines the deployment directory on the target environment
artifact – The artifact to deploy to the target server
artifact_url – The URL for downloading the artifact
ssh_options – Specialized SSH configuration
application –  Used to set the domain that application runs on
liquibase_jar – Location of the liquibase.jar on the deployment server
postgres_jar – Location of the postgres.jar on the deployment server

Roles

Since the app server, web server, and database all coexist in the same environment, we set each of these roles to the same variable, ip_address.

Namespaces

Deploy: We use deploy as our namespace. Since we aren’t distributing this set of deployment tasks, we don’t need to make a unique namespace. In fact we could remove the namespace altogether, but we wanted to show it being used.

namespace :deploy

Execution

We define our execution order at the bottom of the script using after. This coordinates which task should be run during the deployment.

after "deploy:setup", "deploy:wildtracks_config"
after "deploy:wildtracks_config", "deploy:httpd_conf"
after "deploy:httpd_conf", "deploy:deploy"
after "deploy:deploy", "deploy:liquibase"
after "deploy:deploy", "deploy:restart"

Tasks

  • Setup: The setup task is our initial task. It makes sure the ownership of our deployment directory is set to tomcat. It then stops httpd and tomcat to get ready for the deployment.

    task :setup do
      run "sudo chown -R tomcat:tomcat #{deploy_to}"
      run "sudo service httpd stop"
      run "sudo service tomcat6 stop"
    end
  • wildtracks_config: The wildtracks_config task is the second task to run. It dynamically creates the wildtracks-config.properties file using a template and the variables set previously in the script. It then places the wildtracks-config.properties file on the target environment.

    task :wildtracks_config, :roles => :app do

      set :dataSourceUsername do
        item = sdb.domains["stacks"].items["wildtracks-config"]
        item.attributes['dataSourceUsername'].values[0].to_s.chomp
      end
      set :dataSourcePassword do
        item = sdb.domains["stacks"].items["wildtracks-config"]
        item.attributes['dataSourcePassword'].values[0].to_s.chomp
      end
      set :dataStorageFtpUsername do
        item = sdb.domains["stacks"].items["wildtracks-config"]
        item.attributes['dataStorageFtpUsername'].values[0].to_s.chomp
      end
      set :dataStorageFtpPassword do
        item = sdb.domains["stacks"].items["wildtracks-config"]
        item.attributes['dataStorageFtpPassword'].values[0].to_s.chomp
      end

      set :dataSourceUrl, "jdbc:postgresql://localhost:5432/manatees_wildtrack"
      set :dataStorageWorkDir, "/var/tmp/manatees_wildtracks_workdir"
      set :dataStorageFtpUrl, "ftp.wildtracks.org"
      set :databaseBackupScriptFile, "/usr/share/tomcat6/.sarvatix/manatees/wildtracks/database_backups/script/db_backup.sh"

      config_content = from_template("config/templates/wildtracks-config.properties.erb")
      put config_content, "/home/ec2-user/wildtracks-config.properties"

      run "sudo mv /home/ec2-user/wildtracks-config.properties /usr/share/tomcat6/.sarvatix/manatees/wildtracks/wildtracks-config.properties"
      run "sudo chown -R tomcat:tomcat /usr/share/tomcat6/.sarvatix/manatees/wildtracks/wildtracks-config.properties"
      run "sudo chmod 777 /usr/share/tomcat6/.sarvatix/manatees/wildtracks/wildtracks-config.properties"
    end

  • httpd_conf: The httpd_conf task is third on the stack and performs a similar function to the wildtracks_config task, but with the httpd.conf configuration file.

    task :httpd_conf, :roles => :app do

      config_content = from_template("config/templates/httpd.conf.erb")
      put config_content, "/home/ec2-user/httpd.conf"

      run "sudo mv /home/ec2-user/httpd.conf /etc/httpd/conf/httpd.conf"
    end

  • Deploy: The deploy task is where the actual deployment of the application code is done. This task removes the current version of the application and downloads the latest.

    task :deploy do
      run "cd #{deploy_to} && sudo rm -rf wildtracks* && sudo wget #{artifact_url}"
    end
  • Liquibase: The liquibase task sets up and ensures that the database is configured correctly.

    task :liquibase, :roles => :db do

      db_username = fetch(:dataSourceUsername)
      db_password = fetch(:dataSourcePassword)
      private_ip_address = fetch(:private_ip_address)

      set :liquibase_jar, "/usr/share/tomcat6/.grails/1.3.7/projects/Build/plugins/liquibase-1.9.3.6/lib/liquibase-1.9.3.jar"
      set :postgres_jar, "/usr/share/tomcat6/.ivy2/cache/postgresql/postgresql/jars/postgresql-8.4-701.jdbc3.jar"

      system("cp -rf /usr/share/tomcat6/.jenkins/workspace/DeployManateeApplication/grails-app/migrations/* /usr/share/tomcat6/.jenkins/workspace/DeployManateeApplication/")

      system("java -jar #{liquibase_jar}\
                    --classpath=#{postgres_jar}\
                    --changeLogFile=changelog.xml\
                    --username=#{db_username}\
                    --password=#{db_password}\
                    --url=jdbc:postgresql://#{private_ip_address}:5432/manatees_wildtrack\
    update")
    end

  • Restart: Lastly the restart task starts the httpd and tomcat services.

    task :restart, :roles => :app do
      run "sudo service httpd restart"
      run "sudo service tomcat6 restart"
    end

Now that we’ve gone through the deployment, we need to test it. For testing our deployments, we use Cucumber. Cucumber enables us to do acceptance testing on our deployment. We verify that the application is up and available, the correct services are started and the property files are stored in the right locations.

Capistrano allows us to completely script and version our deployments, enabling them to be run at any time. With Capistrano’s automation in conjunction with Cucumber’s acceptance testing, we have a high level of confidence that when our deployments are run, the application will be deployed successfully.

In the next and last part of our series – Infrastructure Automation – we’ll go through scripting environments using an industry-standard infrastructure automation tool, Puppet.

10-03-2012

Continuous Delivery in the Cloud: Dynamic Configuration (Part 4 of 6)

In part 1 of this series, I introduced the Continuous Delivery (CD) pipeline for the Manatee Tracking application. In part 2 I went over how we use this CD pipeline to deliver software from checkin to production. In part 3, we focused on how CloudFormation is used to script the virtual AWS components that create the Manatee infrastructure. A list of topics for each of the articles is summarized below:

Part 1: Introduction – Introduction to continuous delivery in the cloud and the rest of the articles;
Part 2: CD Pipeline – In-depth look at the CD Pipeline;
Part 3: CloudFormation – Scripted virtual resource provisioning;
Part 4: Dynamic Configuration –  What you’re reading now;
Part 5: Deployment Automation – Scripted deployment orchestration;
Part 6: Infrastructure Automation – Scripted environment provisioning (Infrastructure Automation)

In this part of the series, I am going to explain how we dynamically generate our configuration and avoid property files whenever possible. Instead of using property files, we store and retrieve configuration on the fly – as part of the CD pipeline – without predefining these values in a static file (i.e. a properties file) ahead of time. We do this using two methods: AWS SimpleDB and CloudFormation.

SimpleDB is a highly available non-relational data storage service that only stores strings in key/value pairs. CloudFormation, as discussed in Part 3 of the series, is a scripting language for allocating and configuring AWS virtual resources.

Using SimpleDB

Throughout the CD pipeline, we often need to manage state across multiple Jenkins jobs. To do this, we use SimpleDB. As the pipeline executes, values that will be needed by subsequent jobs get stored in SimpleDB as properties. When the properties are needed, we use a simple Ruby script to return the key/value pair from SimpleDB and then use it as part of the job. The values being stored and retrieved range from IP addresses and domain names to AMI (Machine Image) IDs.
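
For illustration, the same store-and-retrieve round trip can be done with today’s AWS CLI (the item name and value here are placeholders; our pipeline actually uses the Ruby SDK scripts shown later in this post):

aws sdb put-attributes --domain-name stacks --item-name my-stack \
  --attributes Name=InstanceIPAddress,Value=10.0.1.25,Replace=true
aws sdb get-attributes --domain-name stacks --item-name my-stack \
  --attribute-names InstanceIPAddress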

So what makes this dynamic? As Jenkins jobs or CloudFormation templates are run, we often end up with properties that need to be used elsewhere. Instead of hard coding all of the values to be used in a property file, we create, store and retrieve them as the pipeline executes.

Below is the CreateTargetEnvironment Jenkins job script that creates a new target environment from a CloudFormation template, production.template.


if [ "$deployToProduction" == "true" ]
then
  SSH_KEY=production
else
  SSH_KEY=development
fi

# Create Cloudformaton Stack
ruby /usr/share/tomcat6/scripts/aws/create_stack.rb ${STACK_NAME} ${WORKSPACE}/production.template ${HOST} ${JENKINSIP} ${SSH_KEY} ${SGID} ${SNS_TOPIC}

# Load SimpleDB Domain with Key/Value Pairs
ruby /usr/share/tomcat6/scripts/aws/load_domain.rb ${STACK_NAME}

# Pull and store variables from SimpleDB
host=`ruby /usr/share/tomcat6/scripts/aws/showback_domain.rb ${STACK_NAME} InstanceIPAddress`

# Run Acceptance Tests
cucumber features/production.feature host=${host} user=ec2-user key=/usr/share/tomcat6/.ssh/id_rsa

Referenced above in the CreateTargetEnvironment code snippet, the load_domain.rb script below iterates over a file and sends key/value pairs to SimpleDB.

require 'rubygems'
require 'aws-sdk'
load File.expand_path('../../config/aws.config', __FILE__)

stackname=ARGV[0]

file = File.open("/tmp/properties", "r")

sdb = AWS::SimpleDB.new

AWS::SimpleDB.consistent_reads do
  domain = sdb.domains["stacks"]
  item = domain.items["#{stackname}"]

  file.each_line do|line|
    key,value = line.split '='
    item.attributes.set(
      "#{key}" => "#{value}")
  end
end

Also referenced above in the CreateTargetEnvironment code snippet, the showback_domain.rb script below connects to SimpleDB and returns a key/value pair.

require 'rubygems'
require 'aws-sdk'  # needed before aws.config calls AWS.config
load File.expand_path('../../config/aws.config', __FILE__)

item_name=ARGV[0]
key=ARGV[1]

sdb = AWS::SimpleDB.new

AWS::SimpleDB.consistent_reads do
  domain = sdb.domains["stacks"]
  item = domain.items["#{item_name}"]

  item.attributes.each_value do |name, value|
    if name == "#{key}"
      puts "#{value}".chomp
    end
  end
end

In the CreateTargetEnvironment code snippet above, we store the outputs of the CloudFormation stack in a temporary file. We then iterate over the file with the load_domain.rb script and store the key/value pairs in SimpleDB.

Following this, we make a call to SimpleDB with the showback_domain.rb script and return the instance IP address (created in the CloudFormation template) and store it in the host variable. host is then used by cucumber to ssh into the target instance and run the acceptance tests.

Using CloudFormation

In our CloudFormation templates we allocate multiple AWS resources. Every time we run a template, new resources are created. For example, in our jenkins.template we create a new IAM user; every run produces a different IAM user with different credentials. We need a way to reference these resources without knowing their values ahead of time. This is where CloudFormation's Ref function comes in: you can reference resources from within other resources throughout the script. Using Ref, you can dynamically refer to values of other resources such as an IP address, domain name, etc.

In the script below, we create an IAM user, reference that user to create AWS access keys, and then store the keys in environment variables.


"CfnUser" : {
  "Type" : "AWS::IAM::User",
  "Properties" : {
    "Path": "/",
    "Policies": [{
      "PolicyName": "root",
      "PolicyDocument": {
        "Statement":[{
          "Effect":"Allow",
          "Action":"*",
          "Resource":"*"
        }
      ]}
    }]
  }
},

"HostKeys" : {
  "Type" : "AWS::IAM::AccessKey",
  "Properties" : {
    "UserName" : { "Ref": "CfnUser" }
  }
},

"# Add AWS Credentials to Tomcat\n",
"echo \"AWS_ACCESS_KEY=", { "Ref" : "HostKeys" }, "\" >> /etc/sysconfig/tomcat6\n",
"echo \"AWS_SECRET_ACCESS_KEY=", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}, "\" >> /etc/sysconfig/tomcat6\n",

We can then use these access keys in other scripts by referencing the $AWS_ACCESS_KEY and $AWS_SECRET_ACCESS_KEY environment variables.
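
For example, a Ruby script run from the pipeline could pick these variables up and hand them to the aws-sdk gem; a minimal sketch (the SimpleDB call at the end is only an illustration):

require 'rubygems'
require 'aws-sdk'

# Read the keys that the CloudFormation UserData appended to /etc/sysconfig/tomcat6
AWS.config(
  :access_key_id     => ENV['AWS_ACCESS_KEY'],
  :secret_access_key => ENV['AWS_SECRET_ACCESS_KEY'])

# Any AWS call made from the pipeline now authenticates as the generated IAM user
puts AWS::SimpleDB.new.domains['stacks'].exists?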

How is this different from typical configuration management?

Typically in many organizations, there's a big property file with hard-coded key/value pairs that gets passed into the pipeline. The pipeline executes using the given parameters and cannot scale or change without a user modifying the property file. Teams are unable to scale or adapt because all of the properties are hard coded: if the property file hard codes the IP of an EC2 instance and that instance goes down for whatever reason, the pipeline doesn't work until someone fixes the property file. There are more effective ways of doing this when using the cloud. The cloud provides on-demand resources that are constantly changing; these resources will have different IP addresses, domain names, etc. associated with them every time.

With dynamic configuration, there are no property files; every property is generated as part of the pipeline.

With this dynamic approach, the pipeline values change with every run. As new cloud resources are allocated, the pipeline adjusts itself automatically, without the need for users to constantly modify property files. This leads to less time spent debugging the cumbersome property-file management issues that plague many organizations.

In the next part of our series – which is all about Deployment Automation – we’ll go through scripting and testing your deployment using industry-standard tools. In this next article, you’ll see how to orchestrate deployment sequences and configuration using Capistrano.

09-25-2012

Continuous Delivery in the Cloud: CloudFormation (Part 3 of 6)

In part 1 of this series, I introduced the Continuous Delivery (CD) pipeline for the Manatee Tracking application. In part 2 I went over how we use this CD pipeline to deliver software from checkin to production. A list of topics for each of the articles is summarized below.

Part 1: Introduction – introduction to continuous delivery in the cloud and the rest of the articles;
Part 2: CD Pipeline – In-depth look at the CD Pipeline
Part 3: CloudFormation – What you’re reading now
Part 4: Dynamic Configuration – “Property file less” infrastructure;
Part 5: Deployment Automation – Scripted deployment orchestration;
Part 6: Infrastructure Automation – Scripted environment provisioning (Infrastructure Automation)

In this part of the series, I am going to explain how we use CloudFormation to script our AWS infrastructure and provision our Jenkins environment.

What is CloudFormation?
CloudFormation is an AWS offering for scripting AWS virtual resource allocation. A CloudFormation template is a JSON script which references various AWS resources that you want to use. When the template runs, it will allocate the AWS resources accordingly.

A CloudFormation template is split up into four sections:

  1. Parameters: Parameters are values that you define in the template. When creating the stack through the AWS console, you will be prompted to enter values for the Parameters. If the value for a parameter generally stays the same, you can set a default value; default values can be overridden when creating the stack. A parameter can be used throughout the template by using the “Ref” function.
  2. Mappings: Mappings are for specifying conditional parameter values in your template. For instance, you might want to use a different AMI depending on the region your instance is running in. Mappings enable you to switch AMIs based on the region the instance is being created in.
  3. Resources: Resources are the most vital part of the CloudFormation template. Inside the resource section, you define and configure your AWS components.
  4. Outputs: After the stack resources are created successfully, you may want to have it return values such as the IP address or the domain of the created instance. You use Outputs for this. Outputs will return the values to the AWS console or command line depending on which medium you use for creating a stack.

CloudFormation parameters and resources can be referenced throughout the template. You do this using intrinsic functions: Ref, Fn::Base64, Fn::FindInMap, Fn::GetAtt, Fn::GetAZs and Fn::Join. These functions enable you to pass properties and resource outputs throughout your template, reducing the need for most hardcoded properties (something I will discuss in part 4 of this series, Dynamic Configuration).

How do you run a CloudFormation template?
You can create a CloudFormation stack using the AWS Console, the CloudFormation CLI tools, or the CloudFormation API.

Why do we use CloudFormation?
We use CloudFormation in order to have a fully scripted, versioned infrastructure. From the application to the virtual resources, everything is created from a script and is checked into version control. This gives us complete control over our AWS infrastructure which can be recreated whenever necessary.

CloudFormation for Manatees
In the Manatee infrastructure, we use CloudFormation for setting up the Jenkins CD environment. I am going to go through each part of the jenkins.template and explain its use and purpose. In the template's lifecycle, the user launches the stack using the jenkins.template and enters the Parameters. The template then starts to work:

1. An IAM user with AWS access keys is created
2. An SNS topic is created
3. A CloudWatch alarm is created, and the SNS topic is used for sending alarm notifications
4. A security group is created
5. A wait condition is created
6. The Jenkins EC2 instance is created with the security group from step #4, which is used for port configuration. The instance also uses the AWSInstanceType2Arch and AWSRegionArch2AMI mappings to decide which AMI and OS type to use
7. The Jenkins EC2 instance runs its UserData script and executes cfn-init
8. The wait condition waits for the Jenkins EC2 instance to finish the UserData script
9. An Elastic IP is allocated and associated with the Jenkins EC2 instance
10. A Route53 domain name is created and associated with the Jenkins Elastic IP
11. If everything creates successfully, the stack signals complete and the outputs are displayed

Now that we know at a high level what is being done, let's take a deeper look at what's going on inside the jenkins.template.

Parameters

  • Email: The email address that SNS notifications will be sent to. When we create or deploy to target environments, we use SNS to notify us of their status.
  • ApplicationName: The name of the A record created by Route53. Inside the template, we dynamically create a domain with an A record for easy access to the instance after creation. Example: in jenkins.integratebutton.com, jenkins is the ApplicationName.
  • HostedZone: The name of the domain used by Route53. Inside the template, we dynamically create a domain with an A record for easy access to the instance after creation. Example: in jenkins.integratebutton.com, integratebutton.com is the HostedZone.
  • KeyName: The EC2 SSH keypair to create the instance with. This is the key you use to SSH into the Jenkins instance after creation.
  • InstanceType: The size of the EC2 instance. Example: t1.micro, c1.medium
  • S3Bucket: We use an S3 bucket to hold the resources for the Jenkins template to use; this parameter specifies the name of that bucket.


Mappings


"Mappings" : {
  "AWSInstanceType2Arch" : {
    "t1.micro" : { "Arch" : "64" },
    "m1.small" : { "Arch" : "32" },
    "m1.large" : { "Arch" : "64" },
    "m1.xlarge" : { "Arch" : "64" },
    "m2.xlarge" : { "Arch" : "64" },
    "m2.2xlarge" : { "Arch" : "64" },
    "m2.4xlarge" : { "Arch" : "64" },
    "c1.medium" : { "Arch" : "64" },
    "c1.xlarge" : { "Arch" : "64" },
    "cc1.4xlarge" : { "Arch" : "64" }
  },
    "AWSRegionArch2AMI" : {
    "us-east-1" : { "32" : "ami-ed65ba84", "64" : "ami-e565ba8c" }
  }
},

These Mappings are used to define which operating system architecture and AWS AMI (Amazon Machine Image) ID to use based on the instance size. The instance size is specified using the InstanceType parameter.

The conditional logic to interact with the Mappings is done inside the EC2 instance.

"ImageId" : { "Fn::FindInMap" : [ "AWSRegionArch2AMI", { "Ref" : "AWS::Region" }, { "Fn::FindInMap" : [ "AWSInstanceType2Arch", { "Ref" : "InstanceType" }, "Arch" ] } ] },


Resources

AWS::IAM::User

"CfnUser" : {
  "Type" : "AWS::IAM::User",
  "Properties" : {
    "Path": "/",
    "Policies": [{
      "PolicyName": "root",
      "PolicyDocument": { "Statement":[{
        "Effect":"Allow",
        "Action":"*",
        "Resource":"*"
        }
      ]}
    }]
  }
},


"Type" : "AWS::IAM::AccessKey",
"Properties" : {
  "UserName" : { "Ref": "CfnUser" }
}

We create the AWS IAM user and then create the AWS access and secret access keys for that user, which are used throughout the rest of the template. The access and secret access keys are the credentials used to authenticate API requests to the AWS account.
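
CloudFormation handles this for us, but for illustration, roughly the same thing could be done outside a template with the aws-sdk gem; a sketch (the user name is made up, and credentials are assumed to be configured already):

require 'rubygems'
require 'aws-sdk'

# Assumes AWS credentials have already been supplied via AWS.config
iam  = AWS::IAM.new
user = iam.users.create('cfn-pipeline-user')   # hypothetical user name

# The secret access key is only available at creation time, so capture it immediately
key = user.access_keys.create
puts "AWS_ACCESS_KEY=#{key.id}"
puts "AWS_SECRET_ACCESS_KEY=#{key.secret}"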

AWS::SNS::Topic

"MySNSTopic" : {
  "Type" : "AWS::SNS::Topic",
  "Properties" : {
    "Subscription" : [ {
      "Endpoint" : { "Ref": "Email" },
      "Protocol" : "email"
    } ]
  }
},

SNS is a highly available solution for sending notifications. In the Manatee infrastructure it is used for sending notifications to the development team.
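
Later in the series the pipeline publishes to this topic with the sns-publish command-line tool; the equivalent call from Ruby with the aws-sdk gem would look roughly like this sketch (it assumes the SNS_TOPIC environment variable exported in the UserData script later in this template, and that AWS.config has already been called):

require 'rubygems'
require 'aws-sdk'

# SNS_TOPIC holds the topic ARN ({ "Ref" : "MySNSTopic" } resolves to the ARN)
topic = AWS::SNS.new.topics[ENV['SNS_TOPIC']]
topic.publish('Your new environment is ready.', :subject => 'New Environment Ready')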

AWS::Route53::RecordSetGroup

"JenkinsDNS" : {
  "Type" : "AWS::Route53::RecordSetGroup",
  "Properties" : {
    "HostedZoneName" : { "Fn::Join" : [ "", [ {"Ref" : "HostedZone"}, "." ]]},
    "RecordSets" : [{
      "Name" : { "Fn::Join" : ["", [ { "Ref" : "ApplicationName" }, ".", { "Ref" : "HostedZone" }, "." ]]},
      "Type" : "A",
      "TTL" : "900",
      "ResourceRecords" : [ { "Ref" : "IPAddress" } ]
    }]
  }
},

Route53 is a highly available DNS service. We use Route53 to create domains dynamically using the given HostedZone and ApplicationName parameters. If the parameters are not overridden, the domain jenkins.integratebutton.com will be created. We then reference the Elastic IP and associate it with the created domain, so that jenkins.integratebutton.com routes to the created instance.

AWS::EC2::Instance

EC2 gives access to on-demand compute resources. In this template, we allocate a new EC2 instance and configure it with a Keypair, Security Group, and Image ID (AMI). Then for provisioning the EC2 instance we use the UserData property. Inside UserData we run a set of bash commands along with cfn_init. The UserData script is run during instance creation.

"WebServer": {
  "Type": "AWS::EC2::Instance",
  "Metadata" : {
    "AWS::CloudFormation::Init" : {
      "config" : {
        "packages" : {
          "yum" : {
            "tomcat6" : [],
            "subversion" : [],
            "git" : [],
            "gcc" : [],
            "libxslt-devel" : [],
            "ruby-devel" : [],
            "httpd" : []
          }
        },

        "sources" : {
          "/opt/aws/apitools/cfn" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/resources/aws_tools/cfn-cli.tar.gz"]]},
          "/opt/aws/apitools/sns" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/resources/aws_tools/sns-cli.tar.gz"]]}
        },

        "files" : {
          "/usr/share/tomcat6/webapps/jenkins.war" : {
            "source" : "http://mirrors.jenkins-ci.org/war/1.480/jenkins.war",
            "mode" : "000700",
            "owner" : "tomcat",
            "group" : "tomcat",
            "authentication" : "S3AccessCreds"
          },

          "/usr/share/tomcat6/webapps/nexus.war" : {
            "source" : "http://www.sonatype.org/downloads/nexus-2.0.3.war",
            "mode" : "000700",
            "owner" : "tomcat",
            "group" : "tomcat",
            "authentication" : "S3AccessCreds"
          },

          "/usr/share/tomcat6/.ssh/id_rsa" : {
            "source" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/private/id_rsa"]]},
            "mode" : "000600",
            "owner" : "tomcat",
            "group" : "tomcat",
            "authentication" : "S3AccessCreds"
          },

          "/home/ec2-user/common-step-definitions-1.0.0.gem" : {
            "source" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/gems/common-step-definitions-1.0.0.gem"]]},
            "mode" : "000700",
            "owner" : "root",
            "group" : "root",
            "authentication" : "S3AccessCreds"
          },

          "/etc/cron.hourly/jenkins_backup.sh" : {
            "source" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/jenkins_backup.sh"]]},
            "mode" : "000500",
            "owner" : "root",
            "group" : "root",
            "authentication" : "S3AccessCreds"
          },

          "/etc/tomcat6/server.xml" : {
            "source" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/server.xml"]]},
            "mode" : "000554",
            "owner" : "root",
            "group" : "root",
            "authentication" : "S3AccessCreds"
          },

          "/usr/share/tomcat6/aws_access" : {
            "content" : { "Fn::Join" : ["", [
              "AWSAccessKeyId=", { "Ref" : "HostKeys" }, "\n",
              "AWSSecretKey=", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}
            ]]},
            "mode" : "000400",
            "owner" : "tomcat",
            "group" : "tomcat",
            "authentication" : "S3AccessCreds"
          },

          "/opt/aws/aws.config" : {
            "content" : { "Fn::Join" : ["", [
              "AWS.config(\n",
              ":access_key_id => \"", { "Ref" : "HostKeys" }, "\",\n",
              ":secret_access_key => \"", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}, "\")\n"
            ]]},
            "mode" : "000500",
            "owner" : "tomcat",
            "group" : "tomcat"
          },

          "/etc/httpd/conf/httpd.conf2" : {
            "content" : { "Fn::Join" : ["", [
              "NameVirtualHost *:80\n",
              "\n",
              "ProxyPass /jenkins http://", { "Fn::Join" : ["", [{ "Ref" : "ApplicationName" }, ".", { "Ref" : "HostedZone" }]] }, ":8080/jenkins\n",
              "ProxyPassReverse /jenkins http://", { "Fn::Join" : ["", [{ "Ref" : "ApplicationName" }, ".", { "Ref" : "HostedZone" }]] }, ":8080/jenkins\n",
              "ProxyRequests Off\n",

              "\n",
              "Order deny,allow\n",
              "Allow from all\n",
              "\n",
              "RewriteEngine On\n",
              "RewriteRule ^/$ http://", { "Fn::Join" : ["", [{ "Ref" : "ApplicationName" }, ".", { "Ref" : "HostedZone" }]] }, ":8080/jenkins$1 [NC,P]\n",
""
            ]]},
            "mode" : "000544",
            "owner" : "root",
            "group" : "root"
          },

          "/root/.ssh/config" : {
            "content" : { "Fn::Join" : ["", [
              "Host github.com\n",
              "StrictHostKeyChecking no\n"
            ]]},
            "mode" : "000600",
            "owner" : "root",
            "group" : "root"
          },

          "/usr/share/tomcat6/.route53" : {
            "content" : { "Fn::Join" : ["", [
              "access_key: ", { "Ref" : "HostKeys" }, "\n",
              "secret_key: ", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}, "\n",
              "api: '2012-02-29'\n",
              "endpoint: https://route53.amazonaws.com/\n",
              "default_ttl: '3600'"
            ]]},
            "mode" : "000700",
            "owner" : "tomcat",
            "group" : "tomcat"
          }
        }
      }
    },
    "AWS::CloudFormation::Authentication" : {
      "S3AccessCreds" : {
        "type" : "S3",
        "accessKeyId" : { "Ref" : "HostKeys" },
        "secretKey" : {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]},
        "buckets" : [ { "Ref" : "S3Bucket"} ]
      }
    }
  },
  "Properties": {
    "ImageId" : { "Fn::FindInMap" : [ "AWSRegionArch2AMI", { "Ref" : "AWS::Region" }, { "Fn::FindInMap" : [ "AWSInstanceType2Arch", { "Ref" : "InstanceType" }, "Arch" ] } ] },
    "InstanceType" : { "Ref" : "InstanceType" },
    "SecurityGroups" : [ {"Ref" : "FrontendGroup"} ],
    "KeyName" : { "Ref" : "KeyName" },
    "Tags": [ { "Key": "Name", "Value": "Jenkins" } ],
    "UserData" : { "Fn::Base64" : { "Fn::Join" : ["", [
      "#!/bin/bash -v\n",
      "yum -y install java-1.6.0-openjdk*\n",
      "yum update -y aws-cfn-bootstrap\n",

      "# Install packages\n",
      "/opt/aws/bin/cfn-init -s ", { "Ref" : "AWS::StackName" }, " -r WebServer ",
      " --access-key ", { "Ref" : "HostKeys" },
      " --secret-key ", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]},
      " --region ", { "Ref" : "AWS::Region" }, " || error_exit 'Failed to run cfn-init'\n",

      "# Copy Github credentials to root ssh directory\n",
      "cp /usr/share/tomcat6/.ssh/* /root/.ssh/\n",

      "# Installing Ruby 1.9.3 from RPM\n",
      "wget -P /home/ec2-user/ https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/resources/rpm/ruby-1.9.3p0-2.amzn1.x86_64.rpm\n",
      "rpm -Uvh /home/ec2-user/ruby-1.9.3p0-2.amzn1.x86_64.rpm\n",

      "cat /etc/httpd/conf/httpd.conf2 >> /etc/httpd/conf/httpd.conf\n",

      "# Install S3 Gems\n",
      "gem install /home/ec2-user/common-step-definitions-1.0.0.gem\n",

      "# Install Public Gems\n",
      "gem install bundler --version 1.1.4 --no-rdoc --no-ri\n",
      "gem install aws-sdk --version 1.5.6 --no-rdoc --no-ri\n",
      "gem install cucumber --version 1.2.1 --no-rdoc --no-ri\n",
      "gem install net-ssh --version 2.5.2 --no-rdoc --no-ri\n",
      "gem install capistrano --version 2.12.0 --no-rdoc --no-ri\n",
      "gem install route53 --version 0.2.1 --no-rdoc --no-ri\n",
      "gem install rspec --version 2.10.0 --no-rdoc --no-ri\n",
      "gem install trollop --version 2.0 --no-rdoc --no-ri\n",

      "# Update Jenkins with versioned configuration\n",
      "rm -rf /usr/share/tomcat6/.jenkins\n",
      "git clone git@github.com:stelligent/continuous_delivery_open_platform_jenkins_configuration.git /usr/share/tomcat6/.jenkins\n",

      "# Get S3 bucket publisher from S3\n",
      "wget -P /usr/share/tomcat6/.jenkins/ https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/hudson.plugins.s3.S3BucketPublisher.xml\n",

      "wget -P /tmp/ https://raw.github.com/stelligent/continuous_delivery_open_platform/master/config/aws/cd_security_group.rb\n",
      "ruby /tmp/cd_security_group --securityGroupName ", { "Ref" : "FrontendGroup" }, " --port 5432\n",

      "# Update main Jenkins config\n",
      "sed -i 's@.*@", { "Ref" : "HostKeys" }, "@' /usr/share/tomcat6/.jenkins/hudson.plugins.s3.S3BucketPublisher.xml\n",
      "sed -i 's@.*@", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}, "@' /usr/share/tomcat6/.jenkins/hudson.plugins.s3.S3BucketPublisher.xml\n",

      "# Add AWS Credentials to Tomcat\n",
      "echo \"AWS_ACCESS_KEY=", { "Ref" : "HostKeys" }, "\" >> /etc/sysconfig/tomcat6\n",
      "echo \"AWS_SECRET_ACCESS_KEY=", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}, "\" >> /etc/sysconfig/tomcat6\n",

      "# Add AWS CLI Tools\n",
      "echo \"export AWS_CLOUDFORMATION_HOME=/opt/aws/apitools/cfn\" >> /etc/sysconfig/tomcat6\n",
      "echo \"export AWS_SNS_HOME=/opt/aws/apitools/sns\" >> /etc/sysconfig/tomcat6\n",
      "echo \"export PATH=$PATH:/opt/aws/apitools/sns/bin:/opt/aws/apitools/cfn/bin\" >> /etc/sysconfig/tomcat6\n",

      "# Add Jenkins Environment Variable\n",
      "echo \"export SNS_TOPIC=", { "Ref" : "MySNSTopic" }, "\" >> /etc/sysconfig/tomcat6\n",
      "echo \"export JENKINS_DOMAIN=", { "Fn::Join" : ["", ["http://", { "Ref" : "ApplicationName" }, ".", { "Ref" : "HostedZone" }]] }, "\" >> /etc/sysconfig/tomcat6\n",
      "echo \"export JENKINS_ENVIRONMENT=", { "Ref" : "ApplicationName" }, "\" >> /etc/sysconfig/tomcat6\n",

      "wget -P /tmp/ https://raw.github.com/stelligent/continuous_delivery_open_platform/master/config/aws/showback_domain.rb\n",
      "echo \"export SGID=`ruby /tmp/showback_domain.rb --item properties --key SGID`\" >> /etc/sysconfig/tomcat6\n",

      "chown -R tomcat:tomcat /usr/share/tomcat6/\n",
      "chmod +x /usr/share/tomcat6/scripts/aws/*\n",
      "chmod +x /opt/aws/apitools/cfn/bin/*\n",

      "service tomcat6 restart\n",
      "service httpd restart\n",

      "/opt/aws/bin/cfn-signal", " -e 0", " '", { "Ref" : "WaitHandle" }, "'"
    ]]}}
  }
},

Calling cfn-init from UserData


"# Install packages\n",
"/opt/aws/bin/cfn-init -s ", { "Ref" : "AWS::StackName" }, " -r WebServer ",
" --access-key ", { "Ref" : "HostKeys" },
" --secret-key ", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]},
" --region ", { "Ref" : "AWS::Region" }, " || error_exit 'Failed to run cfn-init'\n",

cfn-init is used to retrieve and interpret the resource metadata, install packages, create files and start services. In the Manatee template we use cfn-init for easy access to other AWS resources, such as S3.

"/etc/tomcat6/server.xml" : {
  "source" : { "Fn::Join" : ["", ["https://s3.amazonaws.com/", { "Ref" : "S3Bucket" }, "/server.xml"]]},
  "mode" : "000554",
  "owner" : "root",
  "group" : "root",
  "authentication" : "S3AccessCreds"
},


"AWS::CloudFormation::Authentication" : {
  "S3AccessCreds" : {
    "type" : "S3",
    "accessKeyId" : { "Ref" : "HostKeys" },
    "secretKey" : {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]},
    "buckets" : [ { "Ref" : "S3Bucket"} ]
  }
}

When possible, we try to use cfn-init rather than UserData bash commands because it stores a detailed log of CloudFormation events on the instance.

AWS::EC2::SecurityGroup

When creating a Jenkins instance, we only want certain ports to be open, and only open to certain users. For this we use security groups. Security groups are firewall rules defined at the AWS level. You can use them to specify which ports, or ranges of ports, should be open. In addition to defining which ports are open, you can define who they should be open to using CIDR blocks.


"FrontendGroup" : {
  "Type" : "AWS::EC2::SecurityGroup",
  "Properties" : {
    "GroupDescription" : "Enable SSH and access to Apache and Tomcat",
    "SecurityGroupIngress" : [
      {"IpProtocol" : "tcp", "FromPort" : "22", "ToPort" : "22", "CidrIp" : "0.0.0.0/0"},
      {"IpProtocol" : "tcp", "FromPort" : "8080", "ToPort" : "8080", "CidrIp" : "0.0.0.0/0"},
      {"IpProtocol" : "tcp", "FromPort" : "80", "ToPort" : "80", "CidrIp" : "0.0.0.0/0"}
    ]
  }
},

In this security group we are opening ports 22, 80 and 8080. Since we are opening 8080, we are able to access Jenkins at the completion of the template. By default, all ports on an instance are closed, so these rules must be specified explicitly in order to have access to Jenkins.
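
The pipeline also opens ports dynamically; the UserData script shown earlier calls a cd_security_group.rb script to open port 5432 on this group. That script isn't listed here, but a minimal sketch of adding an ingress rule with the aws-sdk gem might look like this (argument parsing and credential setup are omitted, and the real script may restrict the source rather than opening the port to the world):

require 'rubygems'
require 'aws-sdk'

group_name = ARGV[0]
port       = ARGV[1].to_i

# Look the group up by name and open the extra port
ec2   = AWS::EC2.new
group = ec2.security_groups.filter('group-name', group_name).first
group.authorize_ingress(:tcp, port, '0.0.0.0/0')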

AWS::EC2::EIP

When an instance is created, it is given a public DNS name similar to ec2-107-20-139-148.compute-1.amazonaws.com. By using Elastic IPs, you can associate your instance with a static IP address rather than relying on the generated DNS name.


"IPAddress" : {
  "Type" : "AWS::EC2::EIP"
},

"IPAssoc" : {
  "Type" : "AWS::EC2::EIPAssociation",
  "Properties" : {
    "InstanceId" : { "Ref" : "WebServer" },
    "EIP" : { "Ref" : "IPAddress" }
  }
},

In the snippets above, we create a new Elastic IP and then associate it with the EC2 instance created above. We do this so we can reference the Elastic IP when creating the Route53 Domain name.
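
Outside of CloudFormation, the same allocation and association could be done with the aws-sdk gem; a sketch (the instance ID is a placeholder):

require 'rubygems'
require 'aws-sdk'

ec2      = AWS::EC2.new
instance = ec2.instances['i-12345678']   # placeholder instance ID

# Allocate a new Elastic IP and attach it to the instance
eip = ec2.elastic_ips.create
instance.associate_elastic_ip(eip)
puts eip.ip_address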

AWS::CloudWatch::Alarm

"CPUAlarmLow": {
  "Type": "AWS::CloudWatch::Alarm",
  "Properties": {
    "AlarmDescription": "Scale-down if CPU < 70% for 10 minutes",
    "MetricName": "CPUUtilization",
    "Namespace": "AWS/EC2",
    "Statistic": "Average",
    "Period": "300",
    "EvaluationPeriods": "2",
    "Threshold": "70",
    "AlarmActions": [ { "Ref": "SNSTopic" } ],
    "Dimensions": [{
      "Name": "WebServerName",
      "Value": { "Ref": "WebServer" }
    }],
    "ComparisonOperator": "LessThanThreshold"
  }
},

There are many reasons an instance can become unavailable. CloudWatch is used to monitor instance usage and performance. CloudWatch can be set to notify specified individuals if the instance experiences higher than normal CPU utilization, disk usage, network usage, etc. In the Manatee infrastructure we use CloudWatch to monitor disk utilization and notify team members if it reaches 90 percent.

If the Jenkins instance goes down, our CD pipeline becomes temporarily unavailable. This presents a problem, as the development team is blocked from testing their code. CloudWatch helps notify us when this is an impending problem.

AWS::CloudFormation::WaitConditionHandle, AWS::CloudFormation::WaitCondition

Wait Conditions are used to wait for all of the resources in a template to be completed before signaling template success.

"WaitHandle" : {
  "Type" : "AWS::CloudFormation::WaitConditionHandle"
},

"WaitCondition" : {
  "Type" : "AWS::CloudFormation::WaitCondition",
  "DependsOn" : "WebServer",
  "Properties" : {
    "Handle" : { "Ref" : "WaitHandle" },
    "Timeout" : "990"
  }
}

If a wait condition is not used when creating the instance, CloudFormation won't wait for the completion of the UserData script; it will signal success as soon as the EC2 instance is allocated, rather than waiting for the UserData script to run and report back.

Outputs

Outputs are used to return information about the resources created during the CloudFormation stack creation to the user. In order to return values, you define the output name and then the resource you want to reference:


"Outputs" : {
  "Domain" : {
    "Value" : { "Fn::Join" : ["", ["http://", { "Ref" : "ApplicationName" }, ".", { "Ref" : "HostedZone" }]] },
    "Description" : "URL for newly created Jenkins app"
  },
  "NexusURL" : {
    "Value" : { "Fn::Join" : ["", ["http://", { "Ref" : "IPAddress" }, ":8080/nexus"]] },
    "Description" : "URL for newly created Nexus repository"
  },
  "InstanceIPAddress" : {
    "Value" : { "Ref" : "IPAddress" }
  }
}

For instance, with InstanceIPAddress we are referencing the IPAddress resource, which happens to be the Elastic IP. This returns the Elastic IP address to the CloudFormation console.

CloudFormation allows us to completely script and version our infrastructure. This enables our infrastructure to be recreated the same way every time by just running the CloudFormation template. Because of this, your environments can be run in a Continuous integration cycle, rebuilding with every change in the script.

In the next part of our series – which is all about Dynamic Configuration – we’ll go through building your infrastructure to only require a minimal amount of hard coded properties if any. In this next article, you’ll see how you can use CloudFormation to build “property file less” infrastructure.


09-18-2012

Continuous Delivery in the Cloud: CD Pipeline (Part 2 of 6)

In part 1 of this series, I introduced the Continuous Delivery (CD) pipeline for the Manatee Tracking application and how we use this pipeline to deliver software from checkin to production. In this article I will take an in-depth look at the CD pipeline. A list of topics for each of the articles is summarized below.

Part 1: Introduction – Introduction to continuous delivery in the cloud and the rest of the articles;
Part 2: CD Pipeline – What you’re reading now;
Part 3: CloudFormation – Scripted virtual resource provisioning;
Part 4: Dynamic Configuration – “Property file less” infrastructure;
Part 5: Deployment Automation – Scripted deployment orchestration;
Part 6: Infrastructure Automation – Scripted environment provisioning (Infrastructure Automation)

The CD pipeline consists of five Jenkins jobs. These jobs are configured to run one after the other. If any one of the jobs fails, the pipeline fails and that release candidate cannot be released to production. The five Jenkins jobs are listed below (further details of these jobs are provided later in the article).

  1. A job that sets the variables used throughout the pipeline (SetupVariables)
  2. Build job (Build)
  3. Production database update job (StoreLatestProductionData)
  4. Target environment creation job (CreateTargetEnvironment)
  5. A deployment job (DeployManateeApplication) which enables a one-click deployment into production.

We used Jenkins plugins to add additional features to the core Jenkins configuration; you can extend the standard Jenkins setup by using plugins. A list of the plugins we use for the Sea to Shore Alliance Continuous Delivery configuration is given below.

Grails: http://updates.jenkins-ci.org/download/plugins/grails/1.5/grails.hpi
Groovy: http://updates.jenkins-ci.org/download/plugins/groovy/1.12/groovy.hpi
Subversion: http://updates.jenkins-ci.org/download/plugins/subversion/1.40/subversion.hpi
Parameterized Trigger: http://updates.jenkins-ci.org/download/plugins/parameterized-trigger/2.15/parameterized-trigger.hpi
Copy Artifact: http://updates.jenkins-ci.org/download/plugins/copyartifact/1.21/copyartifact.hpi
Build Pipeline: http://updates.jenkins-ci.org/download/plugins/build-pipeline-plugin/1.2.3/build-pipeline-plugin.hpi
Ant: http://updates.jenkins-ci.org/download/plugins/ant/1.1/ant.hpi
S3: http://updates.jenkins-ci.org/download/plugins/s3/0.2.0/s3.hpi

The Parameterized Trigger, Build Pipeline and S3 plugins are used for moving the application through the pipeline jobs. The Ant, Groovy, and Grails plugins are used for running the build for the application code. The Subversion plugin is used for polling and checking out from version control.

Below, I describe each of the jobs that make up the CD pipeline in greater detail.

SetupVariables: Jenkins job used for entering the necessary property values, which are propagated through the rest of the pipeline.

Parameter: STACK_NAME
Type: String
Where: Used in both CreateTargetEnvironment and DeployManateeApplication jobs
Purpose: Defines the CloudFormation Stack name and SimpleDB property domain associated with the CloudFormation stack.

Parameter: HOST
Type: String
Where: Used in both CreateTargetEnvironment and DeployManateeApplication jobs
Purpose: Defines the CNAME of the domain created in the CreateTargetEnvironment job. The DeployManateeApplication job uses it when it dynamically creates configuration files. For instance, in test.oneclickdeployment.com, test would be the HOST.

Parameter: PRODUCTION_IP
Type: String
Where: Used in the StoreProductionData job
Purpose: Sets the production IP for the job so that it can SSH into the existing production environment and run a database script that exports the data and uploads it to S3.

Parameter: deployToProduction
Type: Boolean
Where: Used in both CreateTargetEnvironment and DeployManateeApplication jobs
Purpose: Determines whether to use the development or production SSH keypair.

In order for the parameters to propagate through the pipeline, we pass the current build parameters using the Parameterized Trigger plugin.

Build: Compiles the Manatee application’s Grails source code and creates a WAR file.

To do this, we utilize the Jenkins Grails plugin and run Grails targets such as compile and prod war. Next, we archive the Grails migrations for use in the DeployManateeApplication job, and then the job pushes the Manatee WAR up to S3, which is used as an artifact repository.

Lastly, using the Parameterized Trigger plugin, we trigger the StoreProductionData job with the current build parameters.

StoreProductionData: This job performs a pg_dump (a PostgreSQL dump) of the production database and then stores it in S3 for the environment creation job to use when building up the environment. Below is a snippet from this job.

ssh -i /usr/share/tomcat6/development.pem -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no ec2-user@${PRODUCTION_IP} ruby /home/ec2-user/database_update.rb

On the target environments created using the CD pipeline, a database script is stored. The script goes into the PostgreSQL database and runs a pg_dump. It then pushes the pg_dump SQL file to S3 to be used when creating the target environment.
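
The database_update.rb script itself isn't listed in this article. A minimal sketch of what it might do, assuming the aws-sdk gem and made-up database and bucket names:

require 'rubygems'
require 'aws-sdk'

# Credentials are assumed to come from environment variables written at provisioning time
AWS.config(
  :access_key_id     => ENV['AWS_ACCESS_KEY'],
  :secret_access_key => ENV['AWS_SECRET_ACCESS_KEY'])

# Export the production database (database name is an assumption)
dump_file = '/tmp/production_dump.sql'
system("pg_dump manatee_production > #{dump_file}") or raise 'pg_dump failed'

# Upload the dump to S3 so CreateTargetEnvironment can seed the new environment
# (bucket and key names are assumptions)
AWS::S3.new.buckets['manatee-backups'].objects['production_dump.sql'].write(:file => dump_file)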

After the SQL file is stored successfully, the CreateTargetEnvironment job is triggered.

CreateTargetEnvironment: Creates a new target environment using a CloudFormation template to create all the AWS resources, and calls Puppet to provision the environment itself from a base operating system to a fully working target environment ready for deployment. Below is a snippet from this job.

if [ "$deployToProduction" == "true" ]
then
SSH_KEY=development
else
SSH_KEY=production
fi

# Create CloudFormation Stack
ruby ${WORKSPACE}/config/aws/create_stack.rb ${STACK_NAME} ${WORKSPACE}/infrastructure/manatees/production.template ${HOST} ${JENKINSIP} ${SSH_KEY} ${SGID} ${SNS_TOPIC}

# Load SimpleDB Domain with Key/Value Pairs
ruby ${WORKSPACE}/config/aws/load_domain.rb ${STACK_NAME}

# Pull and store variables from SimpleDB
host=`ruby ${WORKSPACE}/config/aws/showback_domain.rb ${STACK_NAME} InstanceIPAddress`

# Run Acceptance Tests
cucumber ${WORKSPACE}/infrastructure/manatees/features/production.feature host=${host} user=ec2-user key=/usr/share/tomcat6/.ssh/id_rsa

# Publish notifications to SNS
sns-publish --topic-arn $SNS_TOPIC --subject "New Environment Ready" --message "Your new environment is ready. IP Address: $host. An example command to ssh into the box would be: ssh -i development.pem ec2-user@$host This instance was created by $JENKINS_DOMAIN" --aws-credential-file /usr/share/tomcat6/aws_access

Once the environment is created, a set of Cucumber tests is run to ensure it's in the correct working state. If any test fails, the entire pipeline fails and the developer is notified that something went wrong. Otherwise, if the tests pass, the DeployManateeApplication job is kicked off and an AWS SNS email notification with information for accessing the new instance is sent to the developer.
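
The acceptance tests themselves aren't shown in this article. Because the job passes host, user and key on the cucumber command line, the step definitions can read them from the environment; a hypothetical step behind production.feature might look like this sketch:

require 'net/ssh'

# Hypothetical step; the real production.feature steps are not shown in the article
Then /^Tomcat should be running on the new instance$/ do
  Net::SSH.start(ENV['host'], ENV['user'], :keys => [ENV['key']]) do |ssh|
    output = ssh.exec!('service tomcat6 status')
    raise 'Tomcat is not running' if output.nil? || output =~ /stopped/
  end
end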

DeployManateeApplication: Runs a Capistrano script which coordinates the steps of the deployment. A snippet from this job is displayed below.

if [ "$deployToProduction" != "true" ]
then
SSH_KEY=/usr/share/tomcat6/development.pem
else
SSH_KEY=/usr/share/tomcat6/production.pem
fi

#/usr/share/tomcat6/.ssh/id_rsa

cap deploy:setup stack=${STACK_NAME} key=${SSH_KEY}

sed -i "s@manatee0@${HOST}@" ${WORKSPACE}/deployment/features/deployment.feature

host=`ruby ${WORKSPACE}/config/aws/showback_domain.rb ${STACK_NAME} InstanceIPAddress`
cucumber deployment/features/deployment.feature host=${host} user=ec2-user key=${SSH_KEY} artifact=

sns-publish --topic-arn $SNS_TOPIC --subject "Manatee Application Deployed" --message "Your Manatee Application has been deployed successfully. You can view it by going to http://$host/wildtracks This instance was deployed to by $JENKINS_DOMAIN" --aws-credential-file /usr/share/tomcat6/aws_access

This deployment job is the final piece of the delivery pipeline; it pulls together all of the pieces created in the previous jobs to deliver working software.

During the deployment, the Capistrano script SSHes into the target server, deploys the new WAR and updated configuration changes, and restarts all services. Then the Cucumber tests are run to ensure the application is available and running successfully. Assuming the tests pass, an AWS SNS email gets dispatched to the developer with information on how to access their new development application.

We use Jenkins as the orchestrator of the pipeline. Jenkins executes a set of scripts and passes around parameters as it runs each job. Because of the role Jenkins plays, we want to make sure it's treated the same way as the application, meaning we version and test all of our changes to the system. For example, if a developer modifies the create environment job configuration, we want the ability to revert back if necessary. Due to this requirement we version the Jenkins configuration: the jobs, plugins and main configuration. To do this, a script is executed each hour using cron.hourly that checks for new jobs or updated configuration and commits them to version control.

The CD pipeline that we have built for the Manatee application enables any change in the application, infrastructure, database or configuration to move through to production seamlessly using automation. This allows any new features, security fixes, etc. to be fully tested as they get delivered to production at the click of a button.

In the next part of our series – which is all about using CloudFormation – we’ll go through a CloudFormation template used to automate the creation of a Jenkins environment. In this next article, you’ll see how CloudFormation procures AWS resources and provisions our Jenkins CD Pipeline environment.

Continuous Delivery in the Cloud Case Study for the Sea to Shore Alliance – Introduction (part 1 of 6)

We help companies deliver software reliably and repeatedly using Continuous Delivery in the Cloud. With Continuous Delivery (CD), teams can deliver new versions of software to production by flattening the software delivery process and decreasing the cycle time between an idea and usable software through the automation of the entire delivery system: build, deployment, test, and release. CD is enabled through a delivery pipeline. With CD, our customers can choose when and how often to release to production. On top of this, we utilize the cloud so that customers can scale their infrastructure up and down and deliver software to users on demand.

Stelligent offers a solution called Elastic Operations which provides a Continuous Delivery platform along with expert engineering support and monitoring of a delivery pipeline that builds, tests, provisions and deploys software to target environments – as often as our customers choose. We’re in the process of open sourcing the platform utilized by Elastic Operations.

In this six-part blog series, I am going to go over how we built out a Continuous Delivery solution for the Sea to Shore Alliance:

Part 1: Introduction – What you’re reading now;
Part 2: CD Pipeline – Automated pipeline to build, test, deploy, and release software continuously;
Part 3: CloudFormation – Scripted virtual resource provisioning;
Part 4: Dynamic Configuration – “Property file less” infrastructure;
Part 5: Deployment Automation – Scripted deployment orchestration;
Part 6: Infrastructure Automation – Scripted environment provisioning (Infrastructure Automation)

This year, we delivered this Continuous Delivery in the Cloud solution to the Sea to Shore Alliance. The Sea to Shore Alliance is a non-profit organization whose mission is to protect and conserve the world’s fragile coastal ecosystems and its endangered species such as manatees, sea turtles, and right whales. One of their first software systems tracks and monitors manatees. Prior to Stelligent’s involvement, the application was running on a single instance that was manually provisioned and deployed. As a result of the manual processes, there were no automated tests for the infrastructure or deployment. This made it impossible to reproduce environments or deployments the same way every time. Moreover, the knowledge to recreate these environments, builds and deployments was locked in the heads of a few key individuals. The production application for tracking these manatees, developed by Sarvatix, is located here.

In this case study, I describe how we went from an untested manual process in which the development team was manually building software artifacts, creating environments and deploying, to a completely automated delivery pipeline that is triggered with every change.

Figure 1 illustrates the AWS architecture of the infrastructure that we designed for this Continuous Delivery solution.

There are two CloudFormation stacks being used, the Jenkins stack – or Jenkins environment – as shown on the left and the Manatee stack – or Target environment – as shown on the right.

The Jenkins Stack

  * Creates the jenkins.example.com Route53 Hosted Zone
  * Creates an EC2 instance with Tomcat and Jenkins installed and configured on it.
  * Runs the CD Pipeline

The Manatee stack is slightly different; it utilizes the configuration provided by SimpleDB to create itself. This stack defines the target environment to which the application software is deployed.

The Manatee Stack

  * Creates the manatee.example.com Route53 Hosted Zone
  * Creates an EC2 instance with Tomcat, Apache, and PostgreSQL installed on it.
  * Runs the Manatee application.

The Manatee stack is configured with CPU alarms that send an email notification to the developers/administrators when it becomes over-utilized. We’re in the process of scaling to additional instances when these types of alarms are triggered.

Both instances are encapsulated behind a security group so that they can talk between each other using the internal AWS network.

Fast Facts
Industry: Non-Profit
Profile: Customer tracks and monitors endangered species such as manatees.
Key Business Issues: The customer’s development team needed to have unencumbered access to resources along with automated environment creation and deployment.
Stakeholders: Development team and scientists and others from the Sea to Shore Alliance
Solution: Continuous Delivery in the Cloud (Elastic Operations)
Key Tools/Technologies: AWS – Amazon Web Services (CloudFormation, EC2, S3, SimpleDB, IAM, CloudWatch, SNS), Jenkins, Capistrano, Puppet, Subversion, Cucumber, Liquibase

The Business Problem
The customer needed an operations team that could be scaled up or down depending on the application need. The customer’s main requirements were to have unencumbered access to resources such as virtual hardware. Specifically, they wanted to have the ability to create a target environment and run an automated deployment to it without going to a separate team and submitting tickets, emails, etc. In addition to being able to create environments, the customer wanted to have more control over the resources being used; they wanted to have the ability to terminate resources if they were unused. To address these requirements we introduced an entirely automated solution which utilizes the AWS cloud for providing resources on-demand, along with other solutions for providing testing, environment provisioning and deployment.

On the Manatee project, we have five key objectives for the delivery infrastructure. The development team should be able to:

  * Deliver new software or updates to users on demand
  * Reprovision target environment configuration on demand
  * Provision environments on demand
  * Remove configuration bottlenecks
  * Terminate instances when they are no longer needed

Our Team
Stelligent’s team consisted of an account manager and one polyskilled DevOps Engineer that built, managed, and supported the Continuous Delivery pipeline.

Our Solution
Our solution is a single delivery pipeline that gives our customer (developers, testers, etc.) unencumbered access to resources and a single-click automated deployment to production. To enable this, the pipeline needed to include:

  * The ability for any authorized team member to create a new target environment using a single click
  * Automated deployment to the target environment
  * End-to-end testing
  * The ability to terminate unnecessary environments
  * Automated deployment into production with a single click

The delivery pipeline improves efficiency and reduces costs by not limiting the development team. The solution includes:

  • On-Demand Provisioning – All hardware is provided via EC2’s virtual instances in the cloud, on demand. As part of the CD pipeline, any authorized team member can use the Jenkins CreateTargetEnvironment job to order target environments for development work.
  • Continuous Delivery Solution so that the team can deliver software to users on demand:
  • Development Infrastructure – Consists of:
    • Tomcat: used for hosting the Manatee application
    • Apache: hosts the front-end website and uses virtual hosts for proxying and redirection
    • PostgreSQL: the database for the Manatee application
    • Groovy: the application is written in Grails, which uses Groovy
  • Instance Management – Any authorized team member is able to monitor virtual instance usage by viewing Jenkins. There is a policy that test instances are automatically terminated every two days. This promotes ephemeral environments and test automation.
  • Deployment to Production – There’s a boolean value (i.e. a checkbox the user selects) in the delivery pipeline used for deciding whether to deploy to production.
  • System Monitoring and Disaster Recovery – Using the AWS CloudWatch service, AWS provides us with detailed monitoring to notify us of instance errors or anomalies through statistics such as CPU utilization, Network IO, Disk utilization, etc. Using these solutions we’ve implemented an automated disaster recovery solution.


A list of the tools we utilized is enumerated below.

Tool: AWS EC2
What is it? Cloud-based virtual hardware instances
Our Use: We use EC2 for all of our virtual hardware needs. All instances, from development to production are run on EC2

Tool: AWS S3
What is it? Cloud-based storage
Our Use: We use S3 as both a binary repository and a place to store successful build artifacts.

Tool: AWS IAM
What is it? User-based access to AWS resources
Our Use: We create users dynamically and use their AWS access and secret access keys so we don't have to store credentials as properties.

Tool: AWS CloudWatch
What is it? System monitoring
Our Use: Monitors all instances in production. If an instance takes an abnormal amount of strain or shuts down unexpectedly, SNS sends an email to designated parties

Tool: AWS SNS
What is it? Email notifications
Our Use: When an environment is created or a deployment is run, SNS is used to send notifications to affected parties.

Tool: Cucumber
What is it? Acceptance testing
Our Use: Cucumber is used for testing at almost every step of the way. We use Cucumber to test infrastructure, deployments and application code to ensure correct functionality. Cucumber's English-like syntax allows both technical personnel and customers to communicate using an executable test.

Tool: Liquibase
What is it? Automated database change management
Our Use: Liquibase is used for all database changesets. When a change is necessary within the database, it is made to a Liquibase changelog.xml.

Tool: AWS CloudFormation
What is it? Templating language for orchestrating all AWS resources
Our Use: CloudFormation is used for creating a fully working Jenkins environment and Target environment. For instance, for the Jenkins environment it creates the EC2 instance with CloudWatch monitoring alarms, the associated IAM user, the SNS notification topic, and everything else required for Jenkins to build. This, along with Jenkins, is one of the major pieces of the infrastructure.

Tool: AWS SimpleDB
What is it? Cloud-based NoSQL database
Our Use: SimpleDB is used for storing dynamic property configuration and passing properties through the CD Pipeline. As part of the environment creation process, we store multiple values such as IP addresses that we need when deploying the application to the created environment.

Tool: Jenkins
What is it? A Continuous Integration server; we’re using it to implement a CD pipeline using the Build Pipeline plugin.
Our Use: Jenkins runs the CD pipeline which does the building, testing, environment creation and deploying. Since the CD pipeline is also code (i.e. configuration code), we version our Jenkins configuration.

Tool: Capistrano
What is it? Deployment automation
Our Use: Capistrano orchestrates and automates deployments. Capistrano is a Ruby-based deployment DSL that can be used to deploy to multiple platforms including Java, Ruby and PHP. It is called as part of the CD pipeline and deploys to the target environment.

Tool: Puppet
What is it? Infrastructure automation
Our Use: Puppet takes care of the environment provisioning. CloudFormation requests the environment and then calls Puppet to do the dynamic configuration. We configured Puppet to install, configure, and manage the packages, files and services.

Tool: Subversion
What is it? Version control system
Our Use: Subversion is the version control repository where every piece of the Manatee infrastructure is stored. This includes the environment scripts such as the Puppet modules, the CloudFormation templates, Capistrano deployment scripts, etc.

We applied the on-demand usability of the cloud with a proven continuous delivery approach to build an automated one click method for building and deploying software into scripted production environments.

In the blog series, I will describe the technical implementation of how we went about building this infrastructure into a complete solution for continuously delivering software. This series will consist of the following:

Part 2 of 6 – CD Pipeline: I will go through the technical implementation of the CD pipeline using Jenkins. I will also cover Jenkins versioning, pulling and pushing artifacts from S3, and Continuous Integration.

Part 3 of 6 – CloudFormation: I will go through a CloudFormation template we’re using to orchestrate the creation of AWS resources and to build the Jenkins and target infrastructure.

Part 4 of 6 – Dynamic Configuration: Will cover dynamic property configuration using SimpleDB

Part 5 of 6 – Deployment Automation: I will explain Capistrano in detail along with how we used Capistrano to deploy build artifacts and run Liquibase database changesets against target environments

Part 6 of 6 – Infrastructure Automation: I will describe the features of Puppet in detail along with how we’re using Puppet to build and configure target environments – for which the software is deployed.