HAProxy on AWS

Deployment of HAProxy LoadBalancer on AWS using Ansible Roles

Saptarsiroy
Feb 23, 2021 · 6 min read


Well, this practical is going to be fairly high-level and conceptual, so let’s first look at the concepts and Ansible features we will need later in the course of the work.

Load Balancing:

In computing, load balancing refers to the process of distributing a set of tasks over a set of resources, with the aim of making their overall processing more efficient. A load balancing algorithm is “static” when it does not take into account the state of the system for the distribution of tasks. Unlike static load distribution algorithms, dynamic algorithms take into account the current load of each of the computing units (also called nodes) in the system.
HAProxy (High Availability Proxy) is a TCP/HTTP load balancer and proxy server that allows a webserver to spread incoming requests across multiple endpoints. This is useful in cases where too many concurrent connections over-saturate the capability of a single server.

Webservers:

A web server is server software, or a system of one or more computers dedicated to running this software, that can satisfy client HTTP requests on the public World Wide Web or also on private LANs and WANs. On a web server, the HTTP server is responsible for processing and answering incoming requests. Upon receiving a request, an HTTP server first checks if the requested URL matches an existing file. If so, the web server sends the file content back to the browser. If not, an application server builds the necessary file.

Dynamic Inventory:

A static inventory file is a plain text file containing a list of managed hosts or remote nodes whose numbers and IP addresses remain fairly constant. On the other hand, a dynamic host file keeps changing as you add new hosts or decommission old ones. The dynamic inventory script can do anything to get the data (call an external API, pull information from a database or file, etc.), and Ansible will use it as an inventory source.

Ansible roles:

An Ansible role is a set of tasks that configures a host to serve a certain purpose, such as configuring a service. Roles are defined using YAML files with a predefined directory structure; a role contains the directories defaults, vars, tasks, files, templates, meta, and handlers. Ansible Galaxy is a repository of Ansible roles that can be dropped directly into our playbooks to streamline our automation projects.
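
For example, initializing a role with ansible-galaxy role init (as we will do later) lays out roughly this skeleton; only the tasks directory is strictly needed for this practical:

webserver/
  defaults/main.yml     # default variables (lowest precedence)
  vars/main.yml         # role variables
  tasks/main.yml        # the task list the role executes
  files/                # static files copied to managed nodes
  templates/            # Jinja2 templates
  handlers/main.yml     # handlers notified by tasks
  meta/main.yml         # role metadata and dependencies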

With the theory covered, we are good to go with the practical implementation of the problem statement below:

Provision EC2 instances through ansible.

Retrieve the IP Address of instances using the dynamic inventory concept.

Configure the web servers through the ansible role.

Configure the load balancer through the ansible role.

Note-1: Prerequisite for the practical — basic working knowledge of Ansible, with Ansible installed on the system.

Note-2: For simplicity, the whole practical is performed in a single workspace folder (in this case, the workspace is /root/ansible-task-3).

STEP-1:
Set up the configuration file for Ansible.

ansible.cfg
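
The exact contents depend on the environment; a minimal sketch, assuming the inventory file sits in the workspace, the "ansible" key pair is saved as ansible.pem, and the AMI's default user is ec2-user, could be:

# ansible.cfg: a minimal sketch; the paths, key file name and remote user
# are assumptions for this environment
[defaults]
inventory         = /root/ansible-task-3/inventory
remote_user       = ec2-user
private_key_file  = /root/ansible-task-3/ansible.pem
host_key_checking = False

[privilege_escalation]
become          = true
become_method   = sudo
become_user     = root
become_ask_pass = false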

STEP-2:
Configure the dynamic inventory for Ansible.
As already discussed, the job of the dynamic inventory here is to fetch the IPs of the launched instances and append them to the Ansible inventory. For this we need a Jinja2 template containing generalized code for fetching the instance IPs; Jinja2 templates use the .j2 extension.

Jinja file for dynamic inventory
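
A sketch of such a template, assuming it is rendered from the play that registers the ec2 results as web and lb (the group names match the hosts: targets used later in the playbook):

{# inventory.j2: a sketch; "web" and "lb" are the variables registered #}
{# by the ec2 tasks, which expose the machines under tagged_instances  #}
[webserver]
{% for instance in web.tagged_instances %}
{{ instance.public_ip }}
{% endfor %}

[loadbalancer]
{% for instance in lb.tagged_instances %}
{{ instance.public_ip }}
{% endfor %}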

STEP-3:
Set up the configuration file for HAProxy load balancing.
For this, we first install the haproxy package on the local (controller) machine, so that we get a copy of its default configuration file to turn into a template:

yum install haproxy -y

The configuration file for haproxy, haproxy.cfg, can be found under /etc/haproxy. Copy it into the workspace (so the playbook's template task can find it) and make the following changes:

  • Rename the file as haproxy.cfg.j2 using the command
mv haproxy.cfg haproxy.cfg.j2
  • Edit the file so the backend server entries are generated from the launched webserver instances (a sketch of the relevant changes to the haproxy.cfg file follows).
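
The essential edit replaces the hard-coded server lines with a Jinja2 loop so that every launched webserver becomes a backend. A sketch of the relevant section, assuming the template is rendered where the ec2 result is registered as web and that the frontend listens on port 8080:

# Relevant section of haproxy.cfg.j2; the frontend port (8080) and the
# loop over the registered "web" variable are assumptions
frontend main
    bind *:8080
    default_backend app

backend app
    balance roundrobin
{% for instance in web.tagged_instances %}
    server web{{ loop.index }} {{ instance.public_ip }}:80 check
{% endfor %}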

Let’s dive into Ansible now!

STEP-4:
Generate keys and AWS credentials
Since the instances are launched on AWS, Ansible needs to reach out to AWS to request resources from it. So we need the credentials of the AWS user account to be used, along with a key pair to log in to the instances as and when required.
Now, AWS credentials are private to an AWS user and should never be revealed to the outside world. We cannot hardcode them in our playbook; instead, we use a password-protected Ansible Vault to store the secret key and access key for AWS.

ansible-vault create keys.yml
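
Inside the vault we keep the two variables the playbook refers to later; the values below are placeholders:

# keys.yml (contents before encryption); the values shown are placeholders
access_key: AKIAXXXXXXXXXXXXXXXX
secret_key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx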

We can store the vault password in a file for future use.

echo "<your_vault_password>" >> passwd

STEP-5:
Create ansible roles

Role-1: Webserver
This role sets up an httpd web server with minimal web content on the instances tagged “webserver”.

ansible-galaxy role init webserver

Now, the following goes into webserver/tasks/main.yml.

main.yml for webserver role
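
A minimal sketch of these tasks, assuming a RedHat-family AMI and a simple page that identifies the serving host:

# webserver/tasks/main.yml: a minimal sketch; the page content is an
# assumption, anything that identifies the serving host will do
- name: Install httpd
  package:
    name: httpd
    state: present

- name: Deploy a minimal web page identifying the host
  copy:
    content: "Served from {{ ansible_hostname }}\n"
    dest: /var/www/html/index.html

- name: Start and enable httpd
  service:
    name: httpd
    state: started
    enabled: yes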

Role-2: Loadbalancer
This role sets up HAProxy load balancing on the instance tagged “loadbalancer”, using the webserver instances as backends.

ansible-galaxy role init loadbalancer

The following goes into loadbalancer/tasks/main.yml.

main.yml for loadbalancer role
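
A minimal sketch, assuming the haproxy.cfg produced by the playbook's template task lives in the workspace (/root/ansible-task-3):

# loadbalancer/tasks/main.yml: a minimal sketch; it ships the haproxy.cfg
# rendered by the playbook from the controller onto the instance
- name: Install haproxy
  package:
    name: haproxy
    state: present

- name: Copy the generated haproxy configuration
  copy:
    src: /root/ansible-task-3/haproxy.cfg
    dest: /etc/haproxy/haproxy.cfg

- name: Restart and enable haproxy
  service:
    name: haproxy
    state: restarted
    enabled: yes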

STEP-6:
Create the Ansible playbook.
Now we are good to go with writing the full Ansible playbook that sets up the entire architecture with a single command run!

# aws-webserver.yml
- name: Launch webserver instance on aws
  hosts: localhost
  gather_facts: false
  vars_files:
    - keys.yml
  tasks:
    - name: Launch webservers instance
      ec2:
        assign_public_ip: yes
        vpc_subnet_id: subnet-2d4d4845
        image: ami-08f63db601b82ff5f
        aws_secret_key: "{{ secret_key }}"
        aws_access_key: "{{ access_key }}"
        exact_count: 3
        count_tag: {"Name": "webserver"}
        instance_type: t2.micro
        instance_tags: {"Name": "webserver"}
        key_name: ansible
        group_id: sg-03ef9242dfd8baf26
        region: ap-south-1
        wait: yes
      register: web

    - name: Launch Loadbalancer instance
      ec2:
        assign_public_ip: yes
        vpc_subnet_id: subnet-2d4d4845
        image: ami-08f63db601b82ff5f
        aws_secret_key: "{{ secret_key }}"
        aws_access_key: "{{ access_key }}"
        exact_count: 1
        count_tag: {"Name": "loadbalancer"}
        instance_type: t2.micro
        instance_tags: {"Name": "loadbalancer"}
        key_name: ansible
        group_id: sg-03ef9242dfd8baf26
        region: ap-south-1
        wait: yes
      register: lb

    - name: Get dynamic inventory
      template:
        src: inventory.j2
        dest: inventory

    - name: Write instance ip to loadbalancer
      template:
        src: haproxy.cfg.j2
        dest: haproxy.cfg

    - name: Refresh inventory
      meta: refresh_inventory

- name: Install webserver
  hosts: webserver
  gather_facts: yes
  roles:
    - role: webserver

- name: Install loadbalancer
  hosts: loadbalancer
  gather_facts: no
  roles:
    - role: loadbalancer

Command to run the playbook:

ansible-playbook aws-webserver.yml --vault-password-file passwd

Successful running of the playbook generates output somewhat like the following:

Running aws-webserver.yml

On the AWS console, we can see 4 EC2 instances launched, of which 3 are tagged “webserver” and 1 is tagged “loadbalancer”.

EC2 instances

Now, browsing to the public DNS of the loadbalancer instance in a new tab and refreshing, we can see that our webservers have been successfully deployed and are being load balanced!

First, second, and third webserver IPs served in turn

With this, we have successfully tackled the problem statement stated above.

That’s all for now, thank you all :)
