A self-sufficient system is a system capable of healing and adaptation. Healing means that the cluster will always be in the designed state. For example, if a replica of a service goes down, the system needs to bring it back up again. Adaptation, on the other hand, is about modifying the desired state so that the system can deal with changed conditions. A simple example would be increased traffic. When it happens, services need to be scaled up. When healing and adaptation are automated, we get self-healing and self-adaptation. Together, they form a self-sufficient system that can operate without human intervention.
What does a self-sufficient system look like? What are its principal parts? Who are the actors?
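The difference between healing and adaptation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ServiceState`, `heal`, `adapt`, and the per-replica capacity figure are hypothetical names and numbers, not part of any real orchestrator.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    """Hypothetical snapshot of one service in the cluster."""
    desired_replicas: int
    running_replicas: int


def heal(state: ServiceState) -> int:
    """Healing: how many replicas must be started to get back to the desired state."""
    return max(state.desired_replicas - state.running_replicas, 0)


def adapt(requests_per_second: int, capacity_per_replica: int) -> int:
    """Adaptation: a new desired replica count for the observed traffic.

    capacity_per_replica is an assumed, fixed number of requests per second
    that a single replica can serve.
    """
    needed = -(-requests_per_second // capacity_per_replica)  # ceiling division
    return max(needed, 1)


state = ServiceState(desired_replicas=3, running_replicas=2)
to_start = heal(state)  # a replica went down, so one must be started

# Traffic grew, so the desired state itself changes.
state.desired_replicas = adapt(requests_per_second=450, capacity_per_replica=100)
```

Healing leaves the desired state alone and only closes the gap to it, while adaptation changes the desired state and lets healing do the rest.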
Any system that intends to be fully automated and self-sufficient must be capable of self-healing and self-adaptation. At a minimum, it needs to be able to monitor itself and perform certain actions at both the service and infrastructure levels.
Two axes can represent the set of actions a system might execute. One axis separates actions by whether they apply to infrastructure or to services. The other separates them by the type of activity, with self-healing on one end and self-adaptation on the other.
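As a rough illustration, the two axes can be modeled as a pair of enumerations, with every action placed at one point on the resulting grid. The action names below are hypothetical examples chosen for this sketch and are not taken from the article.

```python
from enum import Enum


class Level(Enum):
    SERVICE = "service"
    INFRASTRUCTURE = "infrastructure"


class Activity(Enum):
    SELF_HEALING = "self-healing"
    SELF_ADAPTATION = "self-adaptation"


# Hypothetical actions, each placed on both axes.
ACTIONS = {
    "restart a failed replica": (Level.SERVICE, Activity.SELF_HEALING),
    "scale a service up or down": (Level.SERVICE, Activity.SELF_ADAPTATION),
    "recreate a failed node": (Level.INFRASTRUCTURE, Activity.SELF_HEALING),
    "add or remove nodes": (Level.INFRASTRUCTURE, Activity.SELF_ADAPTATION),
}


def actions_for(level: Level, activity: Activity) -> list[str]:
    """Return the names of all actions that sit at the given point on both axes."""
    return sorted(
        name for name, position in ACTIONS.items() if position == (level, activity)
    )
```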
In the Forwarding Logs From All Containers Running Anywhere Inside A Docker Swarm Cluster article, we managed to add centralized logging to our cluster. Logs from any container running inside any of the nodes are shipped to a central location. They are stored in Elasticsearch and available through Kibana. However, the fact that we have easy access to all the logs does not mean that we have all the information we would need to debug a problem or prevent it from happening in the first place. We need to complement our logs with the rest of the information about the system. We need much more than what logs alone can provide.
At the beginning of 2016, I published The DevOps 2.0 Toolkit. It took me a long time to finish it. Much longer than I imagined.
I started by writing blog posts on TechnologyConversations.com. They became popular and I received a lot of feedback. Through them, I clarified the idea behind the book. The goal was to provide a guide for those who want to implement DevOps practices and tools. At the same time, I did not want to write material applicable to any situation. I wanted to concentrate only on people who truly want to implement the latest and greatest practices. I hoped to make it go beyond the “traditional” DevOps. I wished to show that the DevOps movement has matured and evolved over the years and that we needed a new name. A reset from the way DevOps is implemented in some organizations. Hence the name, The DevOps 2.0 Toolkit.
This article continues where Docker Swarm Introduction left off. I will assume that you have at least a basic knowledge of how Swarm in Docker v1.12+ works. If you don’t, please read the previous article first.
The fact that we can deploy any number of services inside a Swarm cluster does not mean that they are accessible to our users. We already saw that the new Swarm networking made it easy for services to communicate with each other.
Let’s explore how we can utilize it to expose them to the public. We’ll try to integrate a proxy with the Swarm network and further explore the benefits that version 1.12 brought.
A lot has changed since I published that article. Swarm as a standalone container is deprecated in favor of Swarm Mode bundled inside Docker Engine 1.12+. On the other hand, Docker Flow: Proxy has advanced and become more feature rich. I suggest you check out the project README instead of this article.
The goal of the Docker Flow: Proxy project is to provide a simple way to reconfigure the proxy every time a new service is deployed or when a service is scaled. It does not try to “reinvent the wheel”, but to leverage the existing leaders and combine them through an easy-to-use integration. It uses HAProxy as the proxy and Consul as the service registry. On top of those two, it adds custom logic that allows on-demand reconfiguration of the proxy.
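The general idea of regenerating a proxy configuration from a service registry can be sketched as follows. This is not the project's actual logic: the plain dictionary stands in for data that would come from a registry such as Consul, and `render_haproxy_config`, the `books` service, and its addresses are hypothetical. Only the emitted HAProxy directives (`frontend`, `bind`, `acl`, `use_backend`, `backend`, `server`) are standard configuration keywords.

```python
def render_haproxy_config(services: dict) -> str:
    """Render a minimal HAProxy configuration from registry-like data.

    services maps a service name to {"path": str, "instances": [(host, port), ...]}.
    """
    lines = ["frontend services", "    bind *:80", "    mode http"]
    # Route each request to a backend based on the start of its URL path.
    for name in sorted(services):
        lines.append(f"    acl url_{name} path_beg {services[name]['path']}")
        lines.append(f"    use_backend {name}-be if url_{name}")
    # One backend per service, with one server line per running instance.
    for name in sorted(services):
        lines.append("")
        lines.append(f"backend {name}-be")
        lines.append("    mode http")
        instances = services[name]["instances"]
        for index, (host, port) in enumerate(instances, start=1):
            lines.append(f"    server {name}_{index} {host}:{port}")
    return "\n".join(lines) + "\n"


# Hypothetical registry content: one service scaled to two instances.
registry = {
    "books": {
        "path": "/api/books",
        "instances": [("10.0.0.11", 8080), ("10.0.0.12", 8080)],
    }
}
config = render_haproxy_config(registry)
```

Whenever a service is deployed or scaled, the registry data changes, the configuration is rendered again, and the proxy is reloaded with the new file.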
“Organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations” - M. Conway
Many will tell you that they have a scalable system. After all, scaling is easy. Buy a server, install WebLogic (or whichever other monster application server you’re using) and deploy your applications. Then wait for a few weeks until you discover that everything is so “fast” that you can click a button, have some coffee, and, by the time you get back to your desk, the result will be waiting for you. What do you do? You scale. You buy a few more servers, install your monster application servers and deploy your monster applications on top of them. Which part of the system was the bottleneck? Nobody knows. Why did you duplicate everything? Because you must. And then some more time passes, and you continue scaling until you run out of money and, simultaneously, the people working for you go crazy. Today we do not approach scaling like that. Today we understand that scaling is about many other things. It’s about elasticity. It’s about being able to quickly and easily scale and de-scale depending on variations in your traffic and the growth of your business, without going bankrupt in the process. It’s about the need of almost every company to scale its business without thinking that the IT department is a liability. It’s about getting rid of those monsters.