Centralized System and Docker Logging with ELK Stack

With Docker there was not supposed to be a need to store logs in files. We should output information to stdout/stderr and the rest will be taken care by Docker itself. When we need to inspect logs all we are supposed to do is run docker logs [CONTAINER_NAME].

With Docker and ever more popular usage of micro services, number of deployed containers is increasing rapidly. Monitoring logs for each container separately quickly becomes a nightmare. Monitoring few or even ten containers individually is not hard. When that number starts moving towards tens or hundreds, individual logging is unpractical at best. If we add distributed services the situation gets even worst. Not only that we have many containers but they are distributed across many servers.

The solution is to use some kind of centralized logging. Our favourite combination is ELK stack (ElasticSearch, LogStash and Kibana). However, centralized logging with Docker on large-scale was not a trivial thing to do (until version 1.6 was released). We had a couple of solutions but none of them seemed good enough.

We could expose container directory with logs as a volume. From there on we could tell LogStash to monitor that directory and send log entries to ElasticSearch. As an alternative we could use LogStash Forwarder and save a bit on server resources. However, the real problem was that there was not supposed to be a need to store logs in files. With Docker we should output logs only to stdout and the rest was supposed to be taken care of. Besides, exposing volumes is one of my least favourite things to do with Docker. Without any exposed volumes containers are much easier to reason with and move around different servers. This is especially true when a set of servers is treated as a data center and containers are deployed with orchestration tools like Docker Swarm or Kubernetes.

There were other options but they were all hacks, difficult to set up or resource hungry solutions.

Docker Logging Driver

In version 1.6 Docker introduced logging driver feature. While it passed mostly unnoticed, it is a very cool capability and a huge step forward in creating a comprehensive approach to logging in Docker environments.

In addition to the default json-file driver that allows us to see logs with docker logs command, we now have a choice to use syslog as an alternative. If set, it will route container output (stdout and stderr) to syslog. As a third option, it is also possible to completely suppress the writing of container output to file. That might save HD usage when that is of importance but in most cases it is something hardly anyone will need.

In this post we’ll concentrate on syslog and ways to centralize all logs to a single ELK instance.

Docker Centralized Logging

We’ll setup ELK stack, use Docker syslog log driver and, finally, send all log entries to a central location with rsyslog. Both syslog and rsyslog are pre-installed on almost all Linux distributions.

We’ll go through manual steps first. At the end of the article there will be instructions how to set up everything automatically with Ansible. If you are impatient, just jump to the end.

Vagrant

Examples in this article are based on Ubuntu and assume that you have Docker up and running. If that’s the case, please skip this chapter. For those that do not have an easy access to Ubuntu, we prepared a Vagrant VM with all that is required. If you don’t have it already, install VirtualBox and Vagrant. Once everything is set, run following commands:

git clone https://github.com/vfarcic/docker-logging-elk.git
cd docker-logging-elk
vagrant up monitoring
vagrant ssh monitoring

After those commands are executed you should be inside the Ubuntu shell with Docker installed. Now we’re ready to set up ELK.

ElasticSearch

We’ll store all our logs in ElasticSearch database. Since it uses JSON format for storing data, it will be easy to send any type of log structure. More importantly, ElasticSearch is optimized for real-time search and analytics allowing us to navigate through our logs effortlessly.

Let us run the official ElasticSearch container with port 9200 exposed. We’ll use volume to persist DB data to the /data/elasticsearch directory.

sudo mkdir -p /data/elasticsearch
sudo docker run -d --name elasticsearch -p 9200:9200 -v /data/elasticsearch:/usr/share/elasticsearch/data elasticsearch

LogStash

LogStash is a perfect solution to centralize data processing of any type. Its plugins allow us a lot of freedom to process any input, filter data and produce one or more outputs. In this case, input will be syslog. We’ll filter and mutate Docker entries so that we can distinguish them from the rest of syslog. Finally, we’ll output results to ElasticSearch that we already set up.

Container will use official LogStash image. It will expose port 25826, share host volume /data/logstash/config that will provided required configuration and, finally, will link to the ElasticSearch.

sudo docker run -d --name logstash --expose 25826 -p 25826:25826 -p 25826:25826/udp -v $PWD/conf:/conf --link elasticsearch:db logstash logstash -f /conf/syslog.conf

This container was run with -f flag that allows us to specify a configuration we’d like to use. The one we’re using (syslog.conf) looks like following.

input {
  syslog {
    type => syslog
    port => 25826
  }
}

filter {
  if "docker/" in [program] {
    mutate {
      add_field => {
        "container_id" => "%{program}"
      }
    }
    mutate {
      gsub => [
        "container_id", "docker/", ""
      ]
    }
    mutate {
      update => [
        "program", "docker"
      ]
    }
  }
}

output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    host => db
  }
}

Those not used to LogStash configuration format might be a bit confused so let us walk you through it. Each configuration can have, among others, input, filter and output sections.

In the input section we’re telling LogStash to listen for syslog events on port 25826.

Filter section is a bit more complex and, in a nutshell, it looks for docker as a program (Docker logging driver is registering itself as a program in format docker/[CONTAINER_ID]). When it finds such program, it mutates its value into a new field container_id and updates program to be simply docker.

Finally, we’re outputting results into stdout (using rubydebug codec) and, more importantly, ElasticSearch.

With ElasticSearch up and running and LogStash listening on syslog events, we are ready to set up rsyslog.

rsyslog

rsyslog is a very fast system for processing logs. We’ll use it to deliver our syslog entries to LogStash. It is already pre-installed in Ubuntu so all we have to do is configure it to work with LogStash.

sudo cp conf/10-logstash.conf /etc/rsyslog.d/.
sudo service rsyslog restart

We copied a new rsyslog configuration and restarted the service. Configuration file is very simple.

*.* @@10.100.199.202:25826

It tells rsyslog to send all syslog entries (*.*) to a remote location (10.100.199.202:25826) using TCP protocol (@@). In this case, 10.100.199.202 is the IP of Vagrant VM we created. If you’re using your own server, please change it to the correct IP. Also, the IP is not a remote location since we are using only one server in this example. In “real” world scenarios you would set rsyslog on each server to point to the central location where ELK is deployed.

Now that we are capturing all syslog events and sending them to ElasticSearch, it is time to visualize our data.

Kibana

Kibana is an analytics and visualization platform designed to work with real-time data. It integrates seamlessly with ElasticSearch.

Unlike ElasticSearch and LogStash, there is no official Kibana image so we created one for you. It can be found in Docker Hub under vfarcic/kibana.

We’ll expose port 5601 and link it with the ElasticSearch container.

sudo docker run -d --name kibana -p 5601:5601 --link elasticsearch:db vfarcic/kibana

Kibana allows us to easily setup dashboards to serve the needs specific to the project or organization. If you are a first time user, you might be better of with an example Dashboard created to showcase the usage of system and Docker logs coming from syslog.

Following command with import sample Kibana settings. It will use a container with ElasticDump tool that provides import and export tools for ElasticSearch. The container is custom-made and can be found in Docker Hub under vfarcic/elastic-dump.

sudo docker run --rm -v $PWD/conf:/data vfarcic/elastic-dump --input=/data/es-kibana.json --output=http://10.100.199.202:9200/.kibana --type=data

Once this container is run, Kibana configuration from the file es-kibana.json was imported into the ElasticSearch database running on 10.100.199.202:9200.

Now we can take a look at Kibana dashboard that we just setup. Open http://localhost:5601 in your favourite browser.

The initial screen shows all unfiltered logs. You can add columns from the left hand side of the screen, create new searches, etc. Among Kibana settings that we imported there is a saved search called error. It filters logs so that only those that contain word error are displayed. We could have done it in many other ways but this one seemed easiest and most straightforward. Open it by clicking the Load Saved Search icon in the top-right corner of the screen. At the moment it shows very few or no errors so let us generate a few fake ones.

logger -s -p 1 "This is fake error..."
logger -s -p 1 "This is another fake error..."
logger -s -p 1 "This is one more fake error..."

Refresh the Kibana screen and you should see those three error messages.

Besides looking at logs in a list format, we can create our own dashboards. For example, click on the Dashboard top menu, then Load Saved Dashboard and select error (another one that we imported previously).

kibana

Running Container with syslog

Up until this moment we saw only system logs so let us run a new Docker container and set it up to use syslog log driver.

sudo docker run -d --name bdd -p 9000:9000 --log-driver syslog vfarcic/bdd

If we go back to Kibana, click on Discover and open Saved Search called docker, there should be, at least, three entries.

That’s it. Now we can scale this to as many containers and servers as we need. All there is to do is set up rsyslog on each server and run containers with --log-driver syslog. Keep in mind that standard docker logs command will not work any more. That’s the expected behaviour. Logs should be in one place, easy to find, filter, visualize, etc.

We’re done with this VM so let us exit and stop it.

exit
vagrant halt

Ansible

The repository that we cloned at the beginning of this article contains two more Vagrant virtual machines. One (elk) contains ElasticSearch, LogStash and Kibana. The other (docker-node) is a separate machine with one docker container up and running. While manual setup is good as a learning exercise, orchestration and deployments should be automated. One way to do that is with Ansible so let’s repeat the same process fully automated.

vagrant up elk
vagrant up docker-node

Open Kibana in http://localhost:5601. It is exactly the same setup as the one we did manually but done automatically with Ansible.

Full source code can be found in the vfarcic/docker-logging-elk GitHub repository. Feel free to take and modify Ansible setup and adapt it to your needs.

If you had trouble following this article or have any additional question, please post a comment or contact me on email (info is in the About section).

The DevOps 2.0 Toolkit

The DevOps 2.0 ToolkitIf you liked this article, you might be interested in The DevOps 2.0 Toolkit: Automating the Continuous Deployment Pipeline with Containerized Microservices book.

This book is about different techniques that help us architect software in a better and more efficient way with microservices packed as immutable containers, tested and deployed continuously to servers that are automatically provisioned with configuration management tools. It’s about fast, reliable and continuous deployments with zero-downtime and ability to roll-back. It’s about scaling to any number of servers, design of self-healing systems capable of recuperation from both hardware and software failures and about centralized logging and monitoring of the cluster.

In other words, this book envelops the whole microservices development and deployment lifecycle using some of the latest and greatest practices and tools. We’ll use Docker, Kubernetes, Ansible, Ubuntu, Docker Swarm and Docker Compose, Consul, etcd, Registrator, confd, Jenkins, and so on. We’ll go through many practices and, even more, tools.

Advertisements

22 thoughts on “Centralized System and Docker Logging with ELK Stack

  1. Dave

    I’m loving your articles on Docker…..you’re blog is permanently bookmarked on my browser favorites bar 😉 Hope to see more soon!

    Kind regards,

    Dave

    Reply
    1. Viktor Farcic Post author

      Hi,

      It looks great. However, if there are multiple servers you would need to have LogStash on each of them and, probably, make it send data to ES. The problem with LogStash is that it is very resource demanding and I prefer only one LogStash instance and all servers sending logs through rsyslog, collectd or LogStash Forwarder (previously Lumberjack).

      Reply
  2. Ernest

    Huge fan of ELK solution. Love when simple things provides an amazing result! Another example of it is Swagger (In order to document services in a soa/ microservices paragdim).

    Keep pushing!

    Reply
  3. Marc Bogaerts

    I followed this example with this exception that I used the official kibana docker image. All works great, I can see the log messages come into ES. However kibana does not display them. I cannot see what the problem is. Any toughts?

    Reply
    1. Viktor Farcic Post author

      I assume that in Kibana you queried all records from ES and none was available. If that’s the case, the only explanation I have is that Kibana is not connected to ES or not to the same instance. I’m currently on vacations and without laptop. If you don’t solve it earlier, we can debug this together through HangOuts once I’m back (mid August). If that is not too late, please contact me on email or HangOuts (you can find contact details in the “About” section of the blog.

      Reply
      1. Marc Bogaerts

        On the contrary: I queried all records from ES and got a lot. Even in kibana, I can see lots of messages. However I do not see anything when I apply the search in kibana as explained above. I think for a start I will investigate a bit how ES and kibana actually work as I am a complete novice wrt these tools.
        I suspect my problems might be version related since these are slightly newer than the ones you listed above.
        I appreciate you replied so quickly and your offer for help. I don’t want to disturb your vacation either. I will search some more in the documentation and keep you informed of any progress.
        Thanks a lot!

        Reply
  4. eskp

    Great write up, Viktor.
    I am too using the official kibana image and I cannot access Kibana dashboard at all at localhost:5601 From “docker logs -f kibana” I can see it complains about not seeing elasticsearch:
    “Unable to connect to elasticsearch at http://localhost:9200. Retrying in 2.5 seconds.”

    Reply
    1. Viktor Farcic Post author

      At the time I wrote this article, there was no official Kibana container hence I created my own (the one used in the text and Ansible playbooks). I’m not sure what would need to change when using the official Kibana image. Unfortunatelly, I have a few very tight deadlines and won’t be able to change the code and rewrite the article for at least next couple of weeks. If you find the solution in the mean time, please let me know.

      Reply
      1. eskp

        I was linking kibana and elasticsearch wrong. I was using your “db” alias when it was expecting “elasticsearch” to correctly set env var.

        Reply
  5. Pingback: Streaming docker logs to an ELK stack | Selling Solutions

  6. Akhan Ismailov

    Here in syslog.conf:


    output {
    stdout {
    codec => rubydebug
    }
    elasticsearch {
    host => db
    }
    }

    use hosts instead of host.

    Reply
  7. David Doherty

    I am trying to follow this example to learn about central logging and am having no luck in getting it running. There are several mentions about updated software, so I believe that is what my problem is. Do you have an updated article on this topic? I will keep searching as we are trying to get a centralized logging process working for our containers.

    Reply
  8. JP

    FYI: After trying to run the elasticsearch container for the first time the elasticsearch container stopped with an error and a docker service restart solved the problem. This is the error:

    Error response from daemon: driver failed programming external connectivity on endpoint elasticsearch (…): (iptables failed: iptables –wait -t nat -A DOCKER -p tcp -d 0/0 –dport 9200 -j DNAT –to-destation 172.17.0.2:9200 ! -i docker0: iptables: No chain/target/match by that name.

    Reply
  9. Viktor Farcic Post author

    To be honest, this is the first time I see this one. There are too many combinations…

    The only thing I can say is that it is most likely related to Docker networking (Overlay).

    I’m sorry for not being of much help.

    Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s