Building an Autodiscovering Apache Zookeeper Cluster in AWS using Packer, Ansible and Terraform

Following my pattern of building AMIs for applications, I create my Apache Zookeeper cluster with Packer for the AMI and Terraform for the infrastructure. The cluster is auto-discovering: each node finds the other nodes that belong to the cluster without any manual configuration.

### Building Zookeeper AMIs with Packer

The Packer template looks as follows:

{
  "variables": {
    "ami_id": "",
    "private_subnet_id": "",
    "security_group_id": "",
    "packer_build_number": "",
  },
  "description": "Zookeeper Image",
  "builders": [
    {
      "ami_name": "zookeeper-{{user `packer_build_number`}}",
      "availability_zone": "eu-west-1a",
      "iam_instance_profile": "app-server",
      "instance_type": "t2.small",
      "region": "eu-west-1",
      "run_tags": {
        "role": "packer"
      },
      "security_group_ids": [
        "{{user `security_group_id`}}"
      ],
      "source_ami": "{{user `ami_id`}}",
      "ssh_timeout": "10m",
      "ssh_username": "ubuntu",
      "subnet_id": "{{user `private_subnet_id`}}",
      "tags": {
        "Name": "zookeeper-packer-image"
      },
      "type": "amazon-ebs"
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "inline": [ "sleep 10" ]
    },
    {
      "type": "shell",
      "script": "install_dependencies.sh",
      "execute_command": "echo '' | {{ .Vars }} sudo -E -S sh '{{ .Path }}'"
    },
    {
      "type": "ansible-local",
      "playbook_file": "zookeeper.yml",
      "extra_arguments": [
        "--module-path=./modules"
      ],
      "playbook_dir": "../../"
    }
  ]
}
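
Building the image is then a single packer build, passing in the variables the template expects. A minimal sketch with illustrative IDs (the template filename is an assumption):

packer build \
  -var 'ami_id=ami-12345678' \
  -var 'private_subnet_id=subnet-12345678' \
  -var 'security_group_id=sg-12345678' \
  -var 'packer_build_number=42' \
  zookeeper.json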

The install_dependencies.sh script is as described previously.
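
I won't repeat it here, but for this image it boils down to giving the ansible-local provisioner an Ansible installation to run with, plus the Java JDK that ZooKeeper needs. A rough sketch only, not the actual script (package names assume Ubuntu):

#!/bin/sh
# Sketch: install the JDK that ZooKeeper needs and the Ansible that
# the ansible-local provisioner expects to find on the instance.
sudo apt-get update
sudo apt-get install -y openjdk-7-jdk ansible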

The Ansible playbook for Zookeeper looks as follows:

- hosts: all
  sudo: yes

  pre_tasks:
    - ec2_tags:
    - ec2_facts:

  roles:
    - base
    - zookeeper
    - exhibitor

The base role installs the common pieces of my system (e.g. Logstash, the Sensu client, the Prometheus node_exporter), and then ZooKeeper itself is installed. As a last step, I install Exhibitor, a co-process for monitoring, backup/recovery, cleanup and visualization of ZooKeeper.

The zookeeper Ansible role looks as follows:

- name: Download ZooKeeper
  get_url: url=http://www.mirrorservice.org/sites/ftp.apache.org/zookeeper/current/zookeeper-{{ zookeeper_version }}.tar.gz dest=/tmp/zookeeper-{{ zookeeper_version }}.tar.gz mode=0440

- name: Unpack Zookeeper
  command: tar xzf /tmp/zookeeper-{{ zookeeper_version }}.tar.gz -C /opt/ creates=/opt/zookeeper-{{ zookeeper_version }}

- name: Link to Zookeeper Directory
  file: src=/opt/zookeeper-{{ zookeeper_version }}
        dest=/opt/zookeeper
        state=link
        force=yes

- name: Create zookeeper group
  group: name=zookeeper system=true state=present

- name: Create zookeeper user
  user: name=zookeeper groups=zookeeper system=true home=/opt/zookeeper

- name: Create Zookeeper Config Dir
  file: path={{zookeeper_conf_dir}} owner=zookeeper group=zookeeper recurse=yes state=directory mode=0755

- name: Create Zookeeper Transactions Dir
  file: path=/opt/zookeeper/transactions owner=zookeeper group=zookeeper recurse=yes state=directory mode=0755

- name: Create Zookeeper Log Dir
  file: path={{zookeeper_log_dir}} owner=zookeeper group=zookeeper recurse=yes state=directory mode=0755

- name: Create Zookeeper DataStore Dir
  file: path={{zookeeper_datastore_dir}} owner=zookeeper group=zookeeper recurse=yes state=directory mode=0755

- name: Setup log4j
  template: dest="{{zookeeper_conf_dir}}/log4j.properties" owner=root group=root mode=644 src=log4j.properties.j2

The role itself is very simple: the cluster is managed by Exhibitor, so very few settings are passed to ZooKeeper at this point. One thing to note, though: this requires a Java JDK installation to work.

The Exhibitor role looks as follows:

- name: Install Maven
  apt: pkg=maven state=latest update_cache=yes

- name: Create Exhibitor Install Dir
  file: path={{ exhibitor_install_dir }} state=directory mode=0755

- name: Create Exhibitor Build Dir
  file: path={{ exhibitor_install_dir }}/{{ exhibitor_version }} state=directory mode=0755

- name: Create Exhibitor POM
  template: src=pom.xml.j2
            dest={{ exhibitor_install_dir }}/{{ exhibitor_version }}/pom.xml

- name: Build Exhibitor jar
  command: '/usr/bin/mvn clean package -f {{ exhibitor_install_dir }}/{{ exhibitor_version }}/pom.xml creates={{ exhibitor_install_dir }}/{{ exhibitor_version }}/target/exhibitor-{{ exhibitor_version }}.jar'

- name: Copy Exhibitor jar
  command: 'cp {{ exhibitor_install_dir }}/{{ exhibitor_version }}/target/exhibitor-{{ exhibitor_version }}.jar {{exhibitor_install_dir}}/exhibitor-standalone-{{ exhibitor_version }}.jar creates={{exhibitor_install_dir}}/exhibitor-standalone-{{ exhibitor_version }}.jar'

- name: Exhibitor Properties Config
  template: src=exhibitor.properties.j2
            dest={{ exhibitor_install_dir }}/exhibitor.properties

- name: exhibitor upstart config
  template: src=upstart.j2 dest=/etc/init/exhibitor.conf mode=644 owner=root

- name: Start Exhibitor
  service: name=exhibitor state=started

This role has a lot more configuration to set, as it essentially manages ZooKeeper. The template files for the configuration are all available to download.

The variables for the entire playbook look as follows:

zookeeper_hosts: "{{ansible_fqdn}}:2181"
zookeeper_version: 3.4.6
zookeeper_conf_dir: /etc/zookeeper/conf
zookeeper_log_dir: /var/log/zookeeper
zookeeper_datastore_dir: /var/lib/zookeeper
zk_s3_bucket_name: "mys3bucket"
monasca_log_level: WARN
exhibitor_version: 1.5.5
exhibitor_install_dir: /opt/exhibitor

The main thing to note here is that the Exhibitor process is started with the following command:

exec java -jar {{ exhibitor_install_dir }}/exhibitor-standalone-{{exhibitor_version}}.jar --port 8181 --defaultconfig /opt/exhibitor/exhibitor.properties --configtype s3 --s3config {{ zk_s3_bucket_name }}:{{ ansible_ec2_placement_region }} --s3backup true --hostname {{ ec2_private_ip_address }} > /var/log/exhibitor.log 2>&1

This means that each node checks itself into a shared configuration file in S3, every other ZooKeeper reads the same file, and the required cluster can form. You can read more about Exhibitor shared configuration on its GitHub wiki.
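
One subtlety: Exhibitor's --s3config argument is documented as <bucket>:<key>, so with the values above the shared properties land in an S3 object whose key is the region string. If you want to inspect the file, something along these lines should work (bucket name and region taken from the variables and the Packer builder above):

aws s3 cp s3://mys3bucket/eu-west-1 -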

When I launch the instances now, the ZooKeeper cluster forms itself.
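
A quick way to verify this from any node is ZooKeeper's standard four-letter-word commands:

echo ruok | nc localhost 2181               # answers "imok" if the server is up
echo stat | nc localhost 2181 | grep Mode   # shows leader or follower once the ensemble has formed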

### Deploying Zookeeper Infrastructure with Terraform

The infrastructure of the Zookeeper cluster is pretty simple:

resource "aws_security_group" "zookeeper" {
  name = "digit-zookeeper-sg"
  description = "Zookeeper Security Group"
  vpc_id = "${aws_vpc.default.id}"

  ingress {
    from_port = 0
    to_port   = 0
    protocol  = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port = "0"
    to_port = "0"
    protocol = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags {
    Name = "ZooKeeper Node"
  }
}

resource "aws_launch_configuration" "zookeeper_launch_config" {
  image_id = "${var.zookeeper_ami_id}"
  instance_type = "${var.zookeeper_instance_type}"
  iam_instance_profile = "zookeeper-server"
  key_name = "${aws_key_pair.terraform.key_name}"
  security_groups = ["${aws_security_group.zookeeper.id}","${aws_security_group.node.id}"]
  enable_monitoring = false

  lifecycle {
    create_before_destroy = true
  }

  root_block_device {
    volume_size = "${var.digit_zookeeper_volume_size}"
  }
}

resource "aws_autoscaling_group" "zookeeper_autoscale_group" {
  name = "zookeeper-autoscale-group"
  availability_zones = ["${aws_subnet.primary-private.availability_zone}","${aws_subnet.secondary-private.availability_zone}","${aws_subnet.tertiary-private.availability_zone}"]
  vpc_zone_identifier = ["${aws_subnet.primary-private.id}","${aws_subnet.secondary-private.id}","${aws_subnet.tertiary-private.id}"]
  launch_configuration = "${aws_launch_configuration.zookeeper_launch_config.id}"
  min_size = 0
  max_size = 100
  desired_capacity = 3

  tag {
    key = "Name"
    value = "zookeeper"
    propagate_at_launch = true
  }

  tag {
    key = "role"
    value = "zookeeper"
    propagate_at_launch = true
  }
}
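
Rolling this out is the usual Terraform workflow; the AMI ID from the Packer build is passed in as a variable (value illustrative, and assuming the remaining variables have defaults):

terraform plan -var 'zookeeper_ami_id=ami-12345678'
terraform apply -var 'zookeeper_ami_id=ami-12345678'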

When Terraform is applied here, a three-node ZooKeeper cluster is created. You can then go to the Exhibitor UI (port 8181 on any node) and see the cluster configuration.
