The article that follows is an extract from the last chapter of The DevOps 2.2 Toolkit: Self-Sufficient Docker Clusters book. It provides a good summary into the processes and tools we explored in the quest to build a self-sufficient cluster that can (mostly) operate without humans.
We split the tasks that a self-sufficient system should perform into those related to services and those oriented towards infrastructure. Even though some of the tools are used in both groups, the division between the two allowed us to keep a clean separation between infrastructure and services running on top of it. Continue reading →
If you liked this article, you might be interested in The DevOps 2.2 Toolkit: Self-Sufficient Docker Clusters book. The book goes beyond Docker and schedulers and tries to explore ways for building self-adaptive and self-healing Docker clusters. If you are a Docker user and want to explore advanced techniques for creating clusters and managing services, this book might be just what you’re looking for.
Please get a copy from Amazon, LeanPub, or look for it through your favorite book seller.
Give the book a try and let me know what you think.
A self-sufficient system is a system capable of healing and adaptation. Healing means that the cluster will always be in the designed state. As an example, if a replica of a service goes down, the system needs to bring it back up again. Adaptation, on the other hand, is about modifications of the desired state so that the system can deal with changed conditions. A simple example would be increased traffic. When it happens, services need to be scaled up. When healing and adaptation are automated, we get self-healing and self-adaptation. Together, they both a self-sufficient system that can operate without human intervention.
How does a self-sufficient system look? What are its principal parts? Who are the actors? Continue reading →
Any system that intends to be fully automated and self-sufficient must be capable of self-healing and self-adaptation. As a minimum, it needs to be able to monitor itself and perform certain actions both on service and infrastructure levels.
Two axes can represent the set of actions a system might execute. One group of actions be represented through the differences between infrastructure and services. The other axis can be explained by the type of activities, with self-healing on one end, and self-adaptation on the other. Continue reading →
In the Forwarding Logs From All Containers Running Anywhere Inside A Docker Swarm Cluster article, we managed to add centralized logging to our cluster. Logs from any container running inside any of the nodes are shipped to a central location. They are stored in ElasticSearch and available through Kibana. However, the fact that we have easy access to all the logs does not mean that we have all the information we would need to debug a problem or prevent it from happening in the first place. We need to complement our logs with the rest of the information about the system. We need much more than what logs alone can provide. Continue reading →
At the beginning of 2016, I published The DevOps 2.0 Toolkit. It took me a long time to finish it. Much longer than I imagined.
I started by writing blog posts in TechnologyConversations.com. They become popular and I received a lot of feedback. Through them, I clarified the idea behind the book. The goal was to provide a guide for those who want to implement DevOps practices and tools. At the same time, I did not want to write a material usable to any situation. I wanted to concentrate only on people that truly want to implement the latest and greatest practices. I hoped to make it go beyond the “traditional” DevOps. I wished to show that the DevOps movement matured and evolved over the years and that we needed a new name. A reset from the way DevOps is implemented in some organizations. Hence the name, The DevOps 2.0 Toolkit. Continue reading →