Docker Clustering Tools Compared: Kubernetes vs Docker Swarm

Kubernetes and Docker Swarm are probably the two most commonly used tools for deploying containers inside a cluster. Both were created as helper tools that manage a cluster of containers and treat all servers as a single unit. However, they differ greatly in their approach.

Kubernetes

Kubernetes is based on Google’s many years of experience working with Linux containers. It is, in a way, a replica of what Google has been doing for a long time but, this time, adapted to Docker. That approach is great in many ways, the most important being that they used their experience from the start. If you started using Kubernetes around Docker version 1.0 (or earlier), the experience was great. It solved many of the problems that Docker itself had. We could mount persistent volumes that allowed us to move containers without losing data, it used flannel to create networking between containers, it had an integrated load balancer, it used etcd for service discovery, and so on. However, Kubernetes comes at a cost. It uses a different CLI, a different API and different YAML definitions. In other words, you cannot use the Docker CLI, nor can you use Docker Compose to define containers. Everything needs to be done from scratch exclusively for Kubernetes. It’s as if the tool was not written for Docker (which is partly true). Kubernetes brought clustering to a new level but at the expense of usability and with a steep learning curve.

Docker Swarm

Docker Swarm took a different approach. It is native clustering for Docker. The best part is that it exposes the standard Docker API, meaning that any tool you used to communicate with Docker (Docker CLI, Docker Compose, Dokku, Krane, and so on) can work equally well with Docker Swarm. That in itself is both an advantage and a disadvantage. Being able to use familiar tools of your own choosing is great but, for the same reason, we are bound by the limitations of the Docker API. If the API doesn’t support something, there is no way around it through the Swarm API and some clever tricks need to be performed.

We’ll explore those two tools in more detail based on their setup and the features they provide for running containers in a cluster.

Setting Up

Setting up Docker Swarm is easy, straightforward and flexible. All we have to do is install one of the service discovery tools and run the swarm container on all nodes. Since the distribution itself is packed as a Docker container, it works in the same way no matter the operating system. We run the swarm container, expose a port and inform it about the address of the service discovery. It could hardly be easier than that. We can even start using it without any service discovery tool, see whether we like it and when our usage of it becomes more serious, add etcd, Consul or some of the other supported tools.
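To make that a bit more concrete, here is a small Python sketch that only assembles the two docker run commands described above (one for the manager, one for each node) and prints them. It does not run anything. The swarm image with its manage and join subcommands and the --advertise flag are, to the best of my knowledge, those of standalone Swarm; the IP addresses, the host port and the consul://.../swarm path are made-up placeholders.

```python
import shlex


def manager_command(discovery_url, host_port=2375):
    """Build the command that starts a standalone Swarm manager container."""
    return [
        "docker", "run", "-d",
        "-p", f"{host_port}:2375",   # expose the manager's API port on the host
        "swarm", "manage",
        discovery_url,               # where the manager looks up the nodes
    ]


def node_command(node_addr, discovery_url):
    """Build the command that registers one Docker host with the cluster."""
    return [
        "docker", "run", "-d",
        "swarm", "join",
        f"--advertise={node_addr}",  # address under which this node is reachable
        discovery_url,
    ]


if __name__ == "__main__":
    # Placeholder values, assumed for illustration only.
    discovery = "consul://10.0.0.5:8500/swarm"
    print(shlex.join(manager_command(discovery, host_port=4000)))
    for ip in ("10.0.0.11", "10.0.0.12"):
        print(shlex.join(node_command(f"{ip}:2375", discovery)))
```

The same two functions would work with a different discovery backend by changing only the URL, which is the "batteries included but removable" idea in practice.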

Kubernetes setup is considerably more complicated and obfuscated. Installation instructions differ from OS to OS and from provider to provider. Each OS or hosting provider comes with its own set of instructions, each with a separate maintenance team and a separate set of problems. As an example, if you choose to try it out with Vagrant, you are stuck with Fedora. That does not mean that you cannot run it with Vagrant and, let’s say, Ubuntu or CoreOS. You can, but you need to start searching for instructions outside the official Kubernetes Getting Started page. Whatever your needs are, it’s likely that the community has a solution, but you still need to spend some time searching for it and hoping that it works on the first attempt. The bigger problem is that the installation relies on a bash script. That would not be a big deal in itself if we did not live in an era where configuration management is a must. We might not want to run a script but rather make Kubernetes part of our Puppet, Chef or Ansible definitions. Again, this can be overcome as well. You can find Ansible playbooks for running Kubernetes or you can write your own. None of those issues is a big problem but, when compared with Swarm, they are a bit painful. With Docker we were not supposed to need installation instructions (aside from a few docker run arguments). We were supposed to run containers. Swarm fulfils that promise and Kubernetes doesn’t.

While some might not care about which discovery tool is used, I love the simplicity behind Swarm and the logic of “batteries included but removable”. Everything works out of the box but we still have the option to substitute one component with another. Unlike Swarm, Kubernetes is an opinionated tool. You need to live with the choices it made for you. If you want to use Kubernetes, you have to use etcd. I’m not trying to say that etcd is bad (quite the contrary) but if you prefer, for example, to use Consul, you’re in a very complicated situation and would need to use one for Kubernetes and the other for the rest of your service discovery needs. Another thing I dislike about Kubernetes is its need to know things in advance, before the setup. You need to tell it the addresses of all your nodes, which role each of them has, how many minions there are in the cluster and so on. With Swarm, we just bring up a node and tell it to join the network. Nothing needs to be set in advance since the information about the cluster is propagated through gossip.

Setup might not be the most important difference between those tools. No matter which tool you choose, sooner or later everything will be up and running and you’ll forget any trouble you might have had during the process. You might say that we should not choose one tool over the other only because one is easier to set up. Fair enough. Let’s move on and speak about the differences in how you define containers that should be run with those tools.

Running Containers

How do you define all the arguments needed for running Docker containers with Swarm? You don’t! Actually, you do, but not in any way different from how you were defining them before Swarm. If you are used to running containers through the Docker CLI, you can continue using it with (almost) the same commands. If you prefer to use Docker Compose to run containers, you can continue using it to run them inside the Swarm cluster. Whichever way you’re used to running your containers, chances are that you can continue doing the same with Swarm but on a much larger scale.

Kubernetes requires you to learn its CLI and configurations. You cannot use docker-compose.yml definitions you created earlier. You’ll have to create Kubernetes equivalents. You cannot use Docker CLI commands you learned before. You’ll have to learn Kubernetes CLI and, likely, make sure that the whole organization learns it as well.

No matter which tool you choose for deployments to your cluster, chances are you are already familiar with Docker. You are probably already used to Docker Compose as a way to define arguments for the containers you’ll run. If you played with it for more than a few hours, you are using it as a substitute for the Docker CLI. You run containers with it, tail their logs, scale them, and so on. On the other hand, you might be a hard-core Docker user who does not like Docker Compose and prefers running everything through the Docker CLI, or you might have your own bash scripts that run containers for you. No matter what you choose, it should work with Docker Swarm.

If you adopt Kubernetes, be prepared to have multiple definitions of the same thing. You will need Docker Compose to run your containers outside Kubernetes. Developers will continue needing to run containers on their laptops, your staging environments might or might not be a big cluster, and so on. In other words, once you adopt Docker, Docker Compose or the Docker CLI are unavoidable. You have to use them one way or another. Once you start using Kubernetes you will discover that all your Docker Compose definitions (or whatever else you might be using) need to be translated to the Kubernetes way of describing things and, from there on, you will have to maintain both. With Kubernetes everything will have to be duplicated, resulting in a higher cost of maintenance. And it’s not only about duplicated configurations. Commands you’ll run outside of the cluster will be different from those inside the cluster. All those Docker commands you learned and love will have to get their Kubernetes equivalents inside the cluster.
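To illustrate the kind of translation involved, here is a deliberately tiny Python sketch that turns one Compose-style service definition into a minimal Kubernetes Pod manifest. Both are plain dictionaries standing in for the YAML files. It is an illustration of the duplication, not a conversion tool: the function name, the sample service and the app label are my own inventions, it assumes that environment is written as a mapping, and it ignores volumes, links and everything else a real definition would contain.

```python
def compose_service_to_pod(name, service):
    """Translate one Compose-style service (a dict) into a minimal Pod manifest (a dict)."""
    container = {"name": name, "image": service["image"]}

    ports = []
    for mapping in service.get("ports", []):
        # Compose writes "HOST:CONTAINER" (optionally with "/tcp" or "/udp");
        # the Pod only declares the container side of the mapping.
        container_port = str(mapping).split(":")[-1].split("/")[0]
        ports.append({"containerPort": int(container_port)})
    if ports:
        container["ports"] = ports

    env = service.get("environment", {})
    if env:
        container["env"] = [{"name": k, "value": str(v)} for k, v in env.items()]

    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {"containers": [container]},
    }


if __name__ == "__main__":
    # Hypothetical service, as it might appear in a docker-compose.yml.
    web = {"image": "nginx", "ports": ["8080:80"], "environment": {"MODE": "prod"}}
    print(compose_service_to_pod("web", web))
```

Even in this toy case the same facts (image, port, environment) end up written down twice, in two different shapes, and both copies have to be kept in sync.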

The guys behind Kubernetes are not trying to make your life miserable by forcing you to do things “their way”. The reason for such big differences lies in the different approaches Swarm and Kubernetes take to tackle the same problem. The Swarm team decided to match their API with the one from Docker. As a result, we have (almost) full compatibility. Almost everything we can do with Docker we can do with Swarm as well, only on a much larger scale. There’s nothing new to do, no configurations to be duplicated and nothing new to learn. No matter whether you use the Docker CLI directly or go through Swarm, the API is (more or less) the same. The negative side of that story is that if there is something you’d like Swarm to do and that something is not part of the Docker API, you’re in for a disappointment. Let us simplify this a bit. If you’re looking for a tool for deploying containers in a cluster that will use the Docker API, Swarm is the solution. On the other hand, if you want a tool that will overcome Docker limitations, you should go with Kubernetes. It is power (Kubernetes) against simplicity (Swarm). Or, at least, that’s how it was until recently. But I’m getting ahead of myself.

The only question unanswered is what those limitations are. Two of the major ones were networking and persistent volumes. Until Docker Swarm release 1.0 we could not link containers running on different servers. Actually, we still cannot link them but now we have multi-host networking to help us connect containers running on different servers. It is a very powerful feature. Kubernetes used flannel to accomplish networking and now, since the Docker release 1.9, that feature is available as part of Docker CLI.

Another problem was persistent volumes. Docker introduced them in release 1.9. Until recently, if you persisted a volume, the container was tied to the server where that volume resided. It could not be moved around without, again, resorting to some nasty tricks like copying the volume directory from one server to another. That in itself is a slow operation that defeats the goals of tools like Swarm. Besides, even if you have time to copy a volume from one server to the other, you do not know where to copy it since clustering tools tend to treat your whole datacenter as a single entity. Your containers will be deployed to the location most suitable for them (least number of containers running, most CPUs or memory available, and so on). Now we have persistent volumes supported by Docker natively.
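The "most suitable location" part is what makes manual volume copying impractical, so it is worth a small illustration. The Python sketch below is a toy version of that kind of placement decision: among the nodes with enough free memory, pick the one running the fewest containers. It is not Swarm's or Kubernetes' actual scheduler; the Node class, its fields and the sample numbers are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    containers: int       # containers currently running on the node
    free_memory_mb: int   # memory still available on the node


def pick_node(nodes, required_memory_mb):
    """Return the node a new container would be placed on."""
    candidates = [n for n in nodes if n.free_memory_mb >= required_memory_mb]
    if not candidates:
        raise RuntimeError("no node has enough free memory")
    # Fewest containers wins; ties go to the node with the most free memory.
    return min(candidates, key=lambda n: (n.containers, -n.free_memory_mb))


if __name__ == "__main__":
    cluster = [
        Node("node-1", containers=5, free_memory_mb=4096),
        Node("node-2", containers=2, free_memory_mb=1024),
        Node("node-3", containers=2, free_memory_mb=2048),
    ]
    print(pick_node(cluster, required_memory_mb=512).name)
```

Since the answer depends on the state of the whole cluster at the moment of deployment, you cannot know ahead of time which server a container (and therefore its data) will land on.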

Both networking and persistent volumes were among the features supported by Kubernetes for quite some time and the reason why many chose it over Swarm. That advantage disappeared with Docker release 1.9.

The Choice

When trying to make a choice between Docker Swarm and Kubernetes, think in the following terms. Do you want to depend on Docker itself solving problems related to clustering? If you do, choose Swarm. If something is not supported by Docker, it is unlikely to be supported by Swarm since it relies on the Docker API. On the other hand, if you want a tool that works around Docker limitations, Kubernetes might be the right one for you. Kubernetes was not built around Docker but is based on Google’s experience with containers. It is opinionated and tries to do things in its own way.

The real question is whether Kubernetes’ way of doing things, which is quite different from how we use Docker, is outweighed by the advantages it gives. Or should we place our bets on Docker itself and hope that it will solve those problems? Before you answer those questions, take a look at release 1.9. We got persistent volumes and software networking. We also got the unless-stopped restart policy that will handle unwanted failures. That makes three fewer differences between Kubernetes and Swarm. Actually, these days there are very few advantages Kubernetes has over Swarm. On the other hand, Swarm uses the Docker API, meaning that you get to keep all your commands and Docker Compose configurations. Personally, I’m placing my bets on the Docker engine getting improvements and Docker Swarm running on top of it. The difference between the two is very small. Both are production ready but Swarm is easier to set up, easier to use and we get to keep everything we built before moving to the cluster; there is no duplication between cluster and non-cluster configurations.

My recommendation is to go with Docker Swarm. Kubernetes is too opinionated, hard to set up, too different from Docker CLI/API and at the same time it doesn’t have real advantages over Swarm since the Docker release 1.9. That doesn’t mean that there are no features available in Kubernetes that are not supported by Swarm. There are feature differences in both directions. However, those differences are, in my opinion, not major ones and the gap is getting smaller with each Docker release. Actually, for many use cases there is no gap at all while Docker Swarm is easier to set up, learn and use.

The DevOps 2.0 Toolkit

If you liked this article, you might be interested in The DevOps 2.0 Toolkit: Automating the Continuous Deployment Pipeline with Containerized Microservices book.

This book is about different techniques that help us architect software in a better and more efficient way with microservices packed as immutable containers, tested and deployed continuously to servers that are automatically provisioned with configuration management tools. It’s about fast, reliable and continuous deployments with zero-downtime and ability to roll-back. It’s about scaling to any number of servers, design of self-healing systems capable of recuperation from both hardware and software failures and about centralized logging and monitoring of the cluster.

In other words, this book envelops the whole microservices development and deployment lifecycle using some of the latest and greatest practices and tools. We’ll use Docker, Kubernetes, Ansible, Ubuntu, Docker Swarm and Docker Compose, Consul, etcd, Registrator, confd, Jenkins, and so on. We’ll go through many practices and, even more, tools.

The book is available from LeanPub and Amazon (Amazon.com and other worldwide sites).

42 thoughts on “Docker Clustering Tools Compared: Kubernetes vs Docker Swarm”

    1. Ward K Harold

      We’ve used Kubernetes (k8s) in production since the 1.0 launch. Swarm wasn’t production ready so it wasn’t really a choice. We also appreciated that we were getting a decade of Google’s experience building and running clusters for free.

      Once we were running in production we realized that the particular container technology is actually a very small part of a cluster solution. The cluster services are the bread and butter. We could replace Docker with Rocket in our cluster and aside from a slightly different workflow for building images no one would notice. If we were tied to the Docker API that wouldn’t be possible.

      As for setup, while we got comfortable doing k8s installs both in the cloud and on local boxes, we ultimately decided to go with Google Container Engine, effectively outsourcing DevOps to Google. That also was not an option with Swarm, though it may be in the future.

  1. Daniel Middleton (@monokal)

    Is it not the case that Swarm is more of a “fire and forget” deployment mechanism as opposed to Kubernetes which has replication controllers and a service abstraction which can transparently maintain a “desired state” as the underlying hosts fail? Admittedly I could be missing something, but I see Swarm more as a deployment tool, and Kubernetes more of a larger-scale orchestration platform for high availability.

    1. Viktor Farcic Post author

      It is true that Kubernetes can maintain the desired state and Swarm cannot. However, with, for example, Consul watches, it is fairly easy to do the same. Ping a service and, if it doesn’t respond, re-deploy using the same process that was run initially. I think that the problem is bigger than simply maintaining the state. Sometimes random things happen and a server stops working. That is a good case where re-deployment done by Kubernetes is useful (even though it’s fairly easy to do the same with a Swarm/Consul combination). In many other cases, things failed for a reason that needs to be fixed in a way that is not a simple re-deployment. Someone needs to be notified with all the information and investigate why things failed. For example, it can be a memory leak that needs to be addressed by changing the code and not by simply re-deploying continuously.

      I might have been too harsh on Kubernetes. It’s a great tool. However, I think that its incompatibility with the Docker API is the main problem. On the other hand, Docker is implementing features currently available in Kubernetes. The best examples are networking and persistent volumes. Those were indeed among the main arguments in favor of Kubernetes.

      Actually, if something more powerful than Swarm is needed, I think that Mesos is the right choice especially since it supports non-Docker deployments as well (no matter how much I like Docker, there are cases, like Hadoop, when it is not appropriate). I think that Kubernetes is somewhere in between. Not as friendly as Swarm nor as powerful as Mesos. On top of that, Swarm will probably be available on top of Mesos before it gets to be integrated with Kubernetes.

      1. harryz

        Yes, that’s what kubernetes is growing into: eventually it will become a project which is as powerful as Mesos and can support both big data workloads and web servers, while swarm stays a simple and docker friendly tool.

        One thing about Mesos is that it was never designed for docker/containers, it’s just a cluster abstraction layer like Swarm. People use mesos just because it is the only one that supports deploying apps while managing hadoop at the same time. But you cannot promise that in the future.

        Actually, we need to compare Compose+Swarm vs Kubernetes instead of swarm only, where, as you said, k8s is the current winner but docker is catching up. I think the only power of k8s over swarm is the Pod (gang scheduling and container cooperation) and performance at large scale (that’s where google always played well). But we should also notice that in the future, multiple container runtimes may become a common case (even together with VM container runtimes, see http://hyper.sh), while k8s is the only framework that supports that.

        So, if I am a developer, I will choose Compose+Swarm, it’s easy to use; but if I am a CTO, I will choose k8s, or copy its architecture and code at least. I and my staff will never care about Docker API compatibility. We care about container runtime compatibility and hope to customize the container, which is, again, only on k8s’ roadmap (for example, use runc+containerd instead of Docker).

      2. Maverick Crank GRey (@MaverickCGRey)

        If I understand you correctly, Kubernetes is more complex than Swarm. That is true.
        May I add a “health monitoring”, a “service discovery” with “canary releases” to Swarm with my custom homemade scripts? Yes, I may.

        Can these scripted “glue code” be less complex and more maintainable than already implemented Kubernetes logic? No, it cannot.

          1. Torsten Bronger

            It would be simpler, however, it is still code you have to write, extend, and maintain. One should have really good reasons for re-inventing the wheel. Note that since recent releases, the installation procedure of k8s is not a good reason anymore – it has become simple enough.

          2. harry

            One fact is that k8s’ code/test quality is much better than Swarm’s (I promise that, and you can send patches to both projects to figure that out), and it’s a result of careful maintenance by Google’s top engineers (omega, borg team). That’s another reason why I doubt you can implement those features much more simply with the same quality in Swarm.

  7. Torsten Bronger

    When making a choice, it is also interesting to look at the momentum of the projects. Kubernetes currently has > 1000 commits per month, made by 100 people. Docker swarm is 150 commits by 12 people, and for docker compose, the numbers are approximately the same.

    1. Viktor Farcic Post author

      I do not think you can compare it like that. The number of commits is similar to, for example, the number of tests of an application. While having no commits (or no tests) is a very bad sign, how many you have should not be the main metric. You should also include the scope of those commits. You can develop a feature and commit it once, or commit every hour and end up with 20 commits. The size of a feature also matters.

      As for the number of contributors, that also depends on their dedication. You might have ten developers dedicated to a project, or you might have a hundred partially working on it.

      I’m not trying to say which project has bigger momentum because I do not truly know that. I’m only trying to point out that looking at a few numbers only is a very dangerous thing (not only in this case, but in general).

      I am in constant contact with people working for Docker and can confirm that they are very dedicated to their projects and that Swarm is a crucial product for their strategy. Again, that does not mean much since I do not have such an insight into Kubernetes.

      My personal opinion is that Swarm will prevail because of its tight integration with Docker and almost non-existent learning curve. That would not be such a big advantage if there were other competitors for container supremacy. At this moment, there is only Rocket, which is not even close to having such an ecosystem as Docker does. Please keep in mind that this is only my “gut feeling”, not based on any substantial information.

  8. Derek Robert Adair (@derleek)

    Great read.

    I’m in the final stages of planning my distributed application and I’ve decided to take a closer look at my clustering solution. Initially I was hooked with tutum and how they wrapped a web UI around swarm. However, I’m not happy with their new pricing model in docker UCP, so I’ve decided to explore managing my own cluster. I prefer the CLI anyways.

    I was pleasantly surprised by how easy it was to set up a cluster with swarm, however, the simplicity and power of google’s container engine is calling me. I feel like this is a really important distinction that wasn’t elaborated on in this article; Taking your cluster from dev to prod with k8s is basically automatic as I understand it.

    I digress though; I find myself wondering how difficult it would be to shift gears from swarm to kubernetes. Do you see any obvious drawbacks to using swarm to get things up and running in a distributed fashion and converting to k8s? Or if I like the bennies of k8s should I just bite the bullet and build it on that from the start?

    1. Torsten Bronger

      Definitely the latter. There are reasons for not using k8s in the first place. However, if you are looking for a roadmap towards k8s in prod, it’s k8s along the whole way. Otherwise, you will experience unnecessary friction loss. Note that the deployment of k8s itself has been simplified greatly in recent releases. That was the last serious obstacle, in my opinion, for small setups.

    2. Ward K Harold

      There are also reasons for not using Swarm in the first place, but Torsten is correct; if you plan on using k8s at all – which I strongly recommend – you’re only making your life difficult by dabbling with Swarm. If you just want to play it’s trivial, and not too expensive, to spin up small clusters in the Google Container Engine.

      1. Viktor Farcic Post author

        I fully agree with Ward. If you already decided to use k8s, don’t waste your time with Swarm. Even though the latter is very easy to set up, their APIs are completely different and using one will not get you any closer to the other.

    1. Viktor Farcic Post author

      A lot of things changed since I wrote this article. Docker Compose is not used by Swarm any more. Swarm standalone is effectively dead in favour of Swarm Mode that is part of Docker Engine, k8s installation is much simpler, and so on. Things in this space are changing very fast…

  16. trajano

    I was thinking if you had a single API to more or less do what you need on a single server which is to build up a stack of your services using different platform technologies such as application servers, web servers, ElasticSearch, Hazelcast clusters, mariadb galeras.

    Where most of the infrastructure work is described through a single docker-compose.yml file and such yml file can be loaded into a clustered swarm for production with just a simple docker swarm deploy…

    Wouldn’t that just outweigh everything else?

    1. Viktor Farcic Post author

      I wouldn’t go as far as to say that it outweighs everything else but it is definitely a very useful feature. Being able to run the same stack everywhere (from a dev laptop to production) is very useful and a huge bonus in favor of Swarm.

  17. shiv

    Awesome article. Please consider to redo the topic with current status of the projects.
    Can a POD(like) setup be done using swarm+compose?

    1. Viktor Farcic Post author

      I wrote a lot on that topic since this one was published. This one is so old that the Swarm (standalone) used in it does not even exist anymore. Kubernetes, on the other hand, changed a lot. I’m planning a series of articles that will compare both feature by feature. I think the first article will be published in a week or two.
