Kubernetes is probably the biggest project we know. It is vast, and yet many think that after a few weeks or months of reading and practice they know all there is to know about it. It’s much bigger than that, and it is growing faster than most of us can follow. How far did you get in Kubernetes adoption?
From my experience, there are four main phases in Kubernetes adoption.
In the first phase, we create a cluster and learn intricacies of Kube API and different types of resources (e.g., Pods, Ingress, Deployments, StatefulSets, and so on). Once we are comfortable with the way Kubernetes works, we start deploying and managing our applications. By the end of this phase, we can shout “look at me, I have things running in my production Kubernetes cluster, and nothing blew up!” I explained most of this phase in The DevOps 2.3 Toolkit: Kubernetes.
In this article, we’ll discuss a way to forward logs from containers created as Docker Swarm services inside our clusters. We’ll use the ELK stack. They’ll be forwarded from containers to LogStash and, from there, to ElasticSearch. Once in the database, they will be available through Kibana.
I have so much chaos in my life, it’s become normal. You become used to it. You have just to relax, calm down, take a deep breath and try to see how you can make things work rather than complain about how they’re wrong.
— Tom Welling
Monitoring many services on a single server poses some difficulties. Monitoring many services on many servers requires a whole new way of thinking and a new set of tools. As you start embracing microservices, containers, and clusters, the number of deployed containers will begin increasing rapidly. The same holds true for servers that form the cluster. We cannot, anymore, log into a node and look at logs. There are too many logs to look at. On top of that, they are distributed among many servers. While yesterday we had two instances of a service deployed on a single server, tomorrow we might have eight instances deployed to six servers. The same holds true for monitoring. Old tools, like Nagios, are not designed to handle constant changes in running servers and services. We already used Consul that provides a different, not to say new, approach to managing near real-time monitoring and reaction when thresholds are reached. However, that is not enough. Real-time information is valuable to detect that something is wrong, but it does not give us information why the failure happened. We can know that a service is not responding, but we cannot know why.
With Docker there was not supposed to be a need to store logs in files. We should output information to stdout/stderr and the rest will be taken care by Docker itself. When we need to inspect logs all we are supposed to do is run
docker logs [CONTAINER_NAME].
With Docker and ever more popular usage of micro services, number of deployed containers is increasing rapidly. Monitoring logs for each container separately quickly becomes a nightmare. Monitoring few or even ten containers individually is not hard. When that number starts moving towards tens or hundreds, individual logging is unpractical at best. If we add distributed services the situation gets even worst. Not only that we have many containers but they are distributed across many servers.
The solution is to use some kind of centralized logging. Our favourite combination is ELK stack (ElasticSearch, LogStash and Kibana). However, centralized logging with Docker on large-scale was not a trivial thing to do (until version 1.6 was released). We had a couple of solutions but none of them seemed good enough.