This article should be considered deprecated since it speaks about the old (standalone) Swarm. To get more up-to-date information about the new Swarm mode, please read the Docker Swarm Introduction (Tour Around Docker 1.12 Series) article or consider getting The DevOps 2.1 Toolkit: Docker Swarm book.
This series is split into the following articles.
- A Taste of What Is To Come
- Manually Deploying Services
- Blue-Green Deployment, Automation and Self-Healing Procedure
- Scaling Individual Services
Previous articles put a lot of focus on Continuous Delivery and Containers with Docker. In Continuous Integration, Delivery or Deployment with Jenkins, Docker and Ansible I explained how to continuously build, test and deploy microservices packaged into containers and do that across multiple servers, without downtime and with the ability to roll back. We used Ansible, Docker, Jenkins and a few other tools to accomplish that goal.
Now it's time to extend what we did in previous articles and scale services across any number of servers. We'll treat all servers as one server farm and deploy containers not to predefined locations but to those that have the least number of containers running. Instead of thinking about each server as an individual place where we deploy, we'll treat all of them as one unit.
We'll continue using some of the same tools we used before.
- Vagrant with VirtualBox will provide an easy way to create and configure lightweight, reproducible, and portable virtual machines that will act as our servers.
- Docker will provide an easy way to build, ship, and run distributed applications packaged in containers.
- Ansible will be used to set up servers and deploy applications.
- We'll use Jenkins to detect changes to our code repositories and trigger jobs that will test, build and deploy applications.
- Finally, nginx will provide a proxy to the different servers and ports our microservices will run on.
On top of those, we'll use a few new ones.
- Docker Compose is a handy tool that will let us run containers.
- Docker Swarm will turn a pool of servers into a single, virtual host.
- Finally, we'll use Consul for service discovery and configuration.
Some users reported difficulties running Ansible inside a Vagrant VM on Windows. It seems that the problem exists only on older versions of Vagrant. Please make sure that the latest Vagrant is installed.
With prerequisites out of our way, we're ready to start building our server farm.
We'll create four virtual machines. One (swarm-master) will be used to orchestrate deployments. Its primary function is to act as the Docker Swarm master node. Instead of deciding in advance where to deploy something, we'll tell Docker Swarm what to deploy and it will deploy it to the node that has the fewest containers running. There are other strategies that we could employ but, as a demonstration, this default one should suffice. Besides Swarm, we'll also set up Ansible, Jenkins, Consul and Docker Compose on that same node. Three additional virtual machines will be created and named swarm-node-01, swarm-node-02 and swarm-node-03. Unlike swarm-master, those nodes will have only Consul and Swarm agents. Their purpose is to host our services (packed as Docker containers). Later on, if we need more hardware, we would just add one more node and let Swarm take care of balancing deployments.
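As a rough illustration of the "fewest containers" idea, here is a minimal Python sketch. It is not Swarm's actual scheduler, which also takes other factors into account; the function name, the container counts and the alphabetical tie-break are all assumptions made for this example.

```python
def pick_node(running_containers):
    """Return the node with the fewest running containers.

    Ties are broken alphabetically so the result is deterministic;
    that tie-break is an assumption of this sketch, not Swarm's rule.
    """
    if not running_containers:
        raise ValueError("no nodes available")
    return min(sorted(running_containers), key=running_containers.get)


# Hypothetical container counts for the three worker nodes.
counts = {"swarm-node-01": 4, "swarm-node-02": 2, "swarm-node-03": 3}
target = pick_node(counts)
counts[target] += 1  # the new container now counts against that node
print(target)
```

Calling pick_node repeatedly while updating the counts spreads new containers evenly across the nodes, which is the behavior this article relies on.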
We'll start by bringing up the Vagrant VMs (if Ansible later complains about executable permissions, see the note at the end of this article). Keep in mind that four virtual machines will be created and that each of them requires 1GB of RAM. On an 8GB 64-bit computer, you should have no problem running those VMs. If you don't have that much memory to spare, edit the Vagrantfile and change v.memory = 1024 to a smaller value.
All the code is located in the vfarcic/docker-swarm GitHub repository.
We can set up all the servers by running the infra.yml Ansible playbook.
The first time you run Ansible against a server, it will ask you whether you want to continue connecting. Answer with yes.
A lot of things will be downloaded (the Jenkins container being the biggest) and installed with this command, so be prepared to wait for a while. I won't go into details regarding Ansible. You can find plenty of articles about it both on the official site and in other posts on this blog. The important detail is that, once the execution of the Ansible playbook is done, swarm-master will have Jenkins, Consul, Docker Compose and Docker Swarm installed. The other three nodes received instructions to install only Consul and Swarm agents. For more information please consult the Continuous Integration, Delivery or Deployment with Jenkins, Docker and Ansible and other articles in Continuous Integration Delivery and Deployment.
Throughout this article, we will never enter any of the swarm-node servers. Everything will be done from the single location (swarm-master).
Now let us go through a few of the tools we haven't used before in this blog: Consul and Docker Swarm.
Consul is a tool aimed at easy service discovery and configuration of distributed and highly available data centers. It also features failure detection and key/value storage that are easy to set up.
Let us take a look at Consul that was installed on all machines.
For example, we can see all members of our cluster with the following command.
The output should be something similar to the following.
With Consul running everywhere we have the ability to store information about applications we deploy and have it propagated to all servers. That way, applications store data locally and do not have to worry about location of a central server. At the same time, when an application needs information about others, it can also request it locally. Being able to propagate information across all servers is an essential requirement for all distributed systems.
Another way to retrieve the same information is through Consul's REST API. We can run the following command.
This produces the following JSON output formatted with jq.
Later on, when we deploy the first application, we'll see Consul in more detail. Please note that even though we'll use Consul by running commands from the shell (at least until we get to the health section), it has a UI that can be accessed by opening http://10.100.199.200:8500.
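For readers who prefer scripts to the shell, the REST API can also be read from Python. The sketch below assumes the /v1/catalog/nodes endpoint, which may not be the one queried above, and it reuses the address of the Consul UI. The sample payload is hand-written for illustration and trimmed to two fields; it is not captured output.

```python
import json
import urllib.request

CONSUL_URL = "http://10.100.199.200:8500"  # address of the Consul UI mentioned above


def parse_nodes(payload):
    """Map node names to addresses from a /v1/catalog/nodes JSON response."""
    return {entry["Node"]: entry["Address"] for entry in json.loads(payload)}


def fetch_nodes(base_url=CONSUL_URL):
    """Query a live Consul agent; only works while the VMs are running."""
    with urllib.request.urlopen(base_url + "/v1/catalog/nodes") as response:
        return parse_nodes(response.read().decode("utf-8"))


# Hand-written sample payload (hypothetical values, trimmed to two fields).
sample = (
    '[{"Node": "swarm-master", "Address": "10.100.199.200"},'
    ' {"Node": "swarm-node-01", "Address": "10.100.199.201"}]'
)
print(parse_nodes(sample))
```

With the VMs up, fetch_nodes() should return the same kind of dictionary for the real cluster.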
Docker Swarm allows us to leverage the standard Docker API to run containers in a cluster. The easiest way to use it is to set the DOCKER_HOST environment variable. Let's run the docker info command.
The output should be similar to the following.
We get immediate information regarding the number of deployed containers (at the moment 9), the strategy Swarm uses to distribute them (spread; runs on the server with the fewest running containers), the number of nodes (servers) and additional details for each of them. At the moment, each server has one Swarm Agent and two Consul Registrators deployed (nine in total). All those deployments were done as part of the infra.yml playbook that we ran earlier.
Let us deploy the first service. We'll use the Ansible playbook defined in books-service.yml.
Running this, or any other playbook from this article, is slow because we're pulling images to all nodes, not only the one we'll deploy to. The reason is that, in case a node fails, we want to have everything ready so that re-deployment to a different node is as fast as possible. The good news is that the next time you run it, it will be much faster since images are already downloaded and Docker will only pull the differences.
The playbook that we just ran follows the same logic as the one we already discussed in the blue/green deployment article. The major difference is that this time there are a few things that are unknown to us before the playbook is actually run. We don't know the IP of the node the service will be deployed to. Since the idea behind this setup is not only to distribute applications between multiple nodes but also to scale them effortlessly, the port is also unknown. If we defined it in advance, there would be a real danger that multiple services would use the same port and clash.
Right now we'll go through what this playbook does and, later on in the next article, we'll explore how it was done.
The books service consists of two containers. One is the application itself and the other contains MongoDB that the application needs. Let's see where they were deployed.
The result will differ from case to case and will look similar to the following. The docker ps command outputs more information than presented below. Columns that are not relevant for this article were removed.
We can see that the application container was deployed to swarm-node-03 and is listening on port 32768. The database, on the other hand, went to a separate node, swarm-node-01, and listens on port 32768. The purpose of the books service is to store and retrieve books from the Mongo database.
Let's check whether those two containers communicate with each other. When we request data from the application container (booksservice_blue_1), it will retrieve it from the database (booksservice_db_1). To test it, we'll ask the service to insert a few books and then to retrieve all stored records.
The result of the last request is the following.
All three books that we asked the service to put into its database were stored. You might have noticed that we did not perform requests to the IP/port where the application is running. Instead of running curl against 10.100.199.203:32768 (this is where the service is currently running) we performed requests to 10.100.199.200 on the standard HTTP port 80. That's where our nginx server is deployed and, through the "magic" of Consul, Registrator and Templating, nginx was updated to point to the correct IP and port. Details of how this happened are explained in the next article. For now, it is important to know that data about our application is stored in Consul and freely accessible to every service that might need it. In this case, that service is nginx, which acts as a reverse proxy and load balancer at the same time.
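To make the templating step a bit more concrete, here is a minimal Python sketch that renders an nginx upstream block from a list of service addresses. It only illustrates the idea and is not the template used by the playbooks; the function name and the layout of the generated configuration are assumptions.

```python
def render_upstream(service, instances):
    """Render an nginx upstream block for the given (ip, port) pairs."""
    if not instances:
        raise ValueError("service has no registered instances")
    servers = "\n".join(f"    server {ip}:{port};" for ip, port in instances)
    return f"upstream {service} {{\n{servers}\n}}\n"


# Address taken from the example above; in the real setup it would come from Consul.
config = render_upstream("books-service", [("10.100.199.203", 32768)])
print(config)
```

Whenever the registered address changes, regenerating this block and reloading nginx is enough to redirect traffic, which is why clients never need to know the real IP and port.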
To prove this, let's run the following.
Since we'll practice blue/green deployment, the name of the service alternates between books-service-blue and books-service-green. This is the first time we deployed it, so the name is blue. The next deployment will be green, then blue again, and so on.
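The alternation can be sketched in a few lines of Python. The helper names are hypothetical and the playbooks may implement this differently; the sketch only shows the naming logic described above.

```python
def next_color(current):
    """Return the color the next release should use."""
    if current is None:
        return "blue"  # first deployment
    if current not in ("blue", "green"):
        raise ValueError(f"unknown color: {current!r}")
    return "green" if current == "blue" else "blue"


def service_name(base, color):
    """Build the colored service name, for example books-service-blue."""
    return f"{base}-{color}"


# Simulate three consecutive deployments of the same service.
color = None
names = []
for _ in range(3):
    color = next_color(color)
    names.append(service_name("books-service", color))
print(names)
```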
We also have the information stored as books-service (the generic name, without blue or green) with the IP and port that should be accessible to the public.
Unlike previous outputs that can be different from case to case (IPs and ports are changing from deployment to deployment), this output should always be the same.
No matter where we deploy our services, they are always accessible from a single location 10.100.199.200 (at least until we start adding multiple load balancers) and are always accessible from the default HTTP port 80. nginx will make sure that requests are sent to the correct service on the correct IP and port.
We can deploy another service using the same principle. This time it will be a front-end for our books-service.
You can see the result by opening http://10.100.199.200 in your browser. It's an AngularJS UI that uses the service we deployed earlier to retrieve all the available books. As with the books-service, you can run the following to see where the container was deployed.
The output of both commands should be similar to the following.
Now let us imagine that someone changed the code of the books-service and that we want to deploy a new release. The procedure is exactly the same as before.
To verify that everything went as expected we can query Consul.
The output should be similar to the following.
While blue release was on IP 10.100.199.203, this time the container was deployed to 10.100.199.202. Docker Swarm checked which server had the least number of containers running and decided that the best place to run it is swarm-node-02.
You might have guessed that, at the beginning, it's easy to know whether we deployed blue or green. However, we'll lose track very quickly as the number of deployments and services grows. We can solve this by querying Consul keys.
Values in Consul are stored in base64 encoding. To see only the value, run the following.
The output of the command is
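The base64 handling mentioned above can also be done in Python. Consul's key/value endpoint returns a JSON list whose Value fields are base64-encoded; the sketch below decodes them. The key name in the sample is hypothetical, since the real key used by the playbook is not shown here.

```python
import base64
import json


def decode_kv(payload):
    """Return {key: decoded value} from a Consul /v1/kv/... JSON response."""
    decoded = {}
    for entry in json.loads(payload):
        raw = entry["Value"]
        # Consul reports keys without a value as null.
        decoded[entry["Key"]] = None if raw is None else base64.b64decode(raw).decode("utf-8")
    return decoded


# Hypothetical key and value, encoded the same way Consul would return them.
sample = json.dumps([
    {"Key": "services/books-service/color",
     "Value": base64.b64encode(b"green").decode("ascii")},
])
print(decode_kv(sample))
```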
The only thing missing for fully implemented Continuous Deployment is to have something that will detect changes to our source code repository and then build, test and deploy containers. With Docker it's very easy to have all builds, testing and deployments follow the same standard. For this article, I created only jobs that do the actual deployment. We'll use them later in the next article when we explore ways to recuperate from failure. Until then, you can take a look at the running Jenkins instance by opening http://10.100.199.200:8080/.
To Be Continued
In the next article we'll explore additional features of Consul and how we can use them to recover from failures. Whenever a container stops working, Consul will detect it and send a request to Jenkins which, in turn, will redeploy the failed container. While the Jenkins jobs created for this article only deploy services, you could easily extend them to, let's say, send an email when the request comes from Consul indicating a failure.
We'll go in depth into how all this was accomplished and show manual commands with Docker Compose, Consul-Template, Registrator, etc. Understanding them is a prerequisite for the explanation of the Ansible playbooks that we saw (and ran) earlier.
Finally, we'll explore how we could scale multiple instances of the same application.
You got a taste of the what; now it's time to understand the how.
The story continues in the Manually Deploying Services article.
The DevOps 2.0 Toolkit
If you liked this article, you might be interested in The DevOps 2.0 Toolkit: Automating the Continuous Deployment Pipeline with Containerized Microservices book.
This book is about different techniques that help us architect software in a better and more efficient way with microservices packed as immutable containers, tested and deployed continuously to servers that are automatically provisioned with configuration management tools. It's about fast, reliable and continuous deployments with zero-downtime and ability to roll-back. It's about scaling to any number of servers, design of self-healing systems capable of recuperation from both hardware and software failures and about centralized logging and monitoring of the cluster.
In other words, this book envelops the whole microservices development and deployment lifecycle using some of the latest and greatest practices and tools. We'll use Docker, Kubernetes, Ansible, Ubuntu, Docker Swarm and Docker Compose, Consul, etcd, Registrator, confd, Jenkins, and so on. We'll go through many practices and, even more, tools.
If you run into issues with Ansible complaining about executable permissions, try changing
config.vm.synced_folder ".", "/vagrant"
to
config.vm.synced_folder ".", "/vagrant", mount_options: ["dmode=700,fmode=600"]