Reading Why Order Matters: Turing Equivalence in Automated Systems Administration (by Steve Traugott and Lance Brown) 15 years ago was a career-changing moment for me. In this blog post, I will explore what some of the points made in this article mean for today’s data center infrastructures. I will also give a bit of background on what motivated our recent move to NixOS.
While I was still at university, my environment suggested that systems administrators were these poor guys pushing and kicking computers around until they ran stably enough for application developers to program the “real” stuff. But Traugott and Brown taught me that writing code for self-administering machines is much more of a challenge than application programming. It is both riskier and more fun: while application programming depends on an axiomatic environment which cannot be changed by the application program, automated systems management means modifying the very foundation your system management tools build upon. So you can both easily shoot yourself in the foot and make magic things happen.
It is a question of perspective: Do I understand my role as an “insert the install CD and press enter” slave or as an infrastructure architect who constructs and maintains the whole data center as one giant, self-modifying machine? Ads from a major Unix vendor of that time got the point: The network is the computer. Seen this way, writing code that holds the data center together is the “real” thing and application programs are mere bricks of a larger building. I was hooked.
Now, 15 years later, I am working professionally as a systems engineer here at the Flying Circus. Let’s see how the concepts proposed back then apply to what we do today.
Classification of system management methods
One of the perhaps most catchy items of this paper (and the one I am going to concentrate on) is a fundamental classification of systems management methods. Traugott and Brown sort all possible approaches into three drawers:
- Divergent Management. The system’s state diverges from the described target state in an unforeseeable way. This is usually the result of manual intervention or bad management software. Think of trying to configure a set of machines manually from check lists, kicking off one-shot actions from install scripts, or simply SSHing into a machine and fiddling around for “quick and dirty” fixes. There is no real separation between the installed system and user data. Divergent management results in low predictability.
- Convergent Management. The system’s state is compared against a target state description and, where differences are found, brought closer to the target. Many of the system management tools popular today are built on this paradigm. Note that a target state description is usually not exhaustive; there is still plenty of room for divergence. Our mental map of a system is largely white with a few known islands. Convergent management yields some predictability, which may be sufficient depending on the context.
- Congruent Management. The system’s state is forced to be completely equal to a target state. Most system and container imaging techniques imply congruence as long as images are not modified at run time. Deviations should only occur in designated areas like databases or object storages. Our mental map of the system is largely complete with a few spots deliberately left white. Congruent management generally results in high predictability.
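The difference between the convergent and the congruent drawer can be sketched with a toy model in which a machine is just a mapping from paths to contents. This is only an illustration of the classification, not how any real tool works; the function names, paths, and file contents are made up for the example.

```python
# Toy model: a machine is a dict from path to content.
# All names and paths are hypothetical and only illustrate the classification.

def converge(actual, target):
    """Convergent: fix only what the target describes; leave the rest alone."""
    result = dict(actual)
    result.update(target)
    return result


def make_congruent(actual, target, designated=("/var/lib/db",)):
    """Congruent: the result equals the target, except for designated areas."""
    result = dict(target)
    for path, content in actual.items():
        # str.startswith accepts a tuple of prefixes.
        if path.startswith(designated):
            result[path] = content
    return result


actual = {
    "/etc/ntp.conf": "server old.example.org",
    "/etc/motd": "hand-edited at 3am",        # drift nobody described
    "/var/lib/db/data": "customer records",   # deliberately left white
}
target = {"/etc/ntp.conf": "server ntp.example.org"}

converged = converge(actual, target)
congruent = make_congruent(actual, target)

print(sorted(converged))
print(sorted(congruent))
```

After `converge`, the hand-edited `/etc/motd` is still there because the target never mentioned it. After `make_congruent`, it is gone, while the designated database area survives.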
Some time ago, divergent management was the standard and we considered snowflake servers perhaps not sane but at least inevitable. 300 root filesystems were considered a large installation. With the advent of virtualization, managing 1000 root filesystems is merely a starting point today. This development pushed the field of automated system management so that we can choose today from a large selection of mature tools.
How do tools commonly found today fit into this classification? Obviously, divergent management is the default when using no automated approach at all. I have seen “curl to shell” install scripts all over the place. Commercial, shrink-wrapped software packages also tend to include install scripts with uncontrolled side effects. Apart from these questionable software installation methods, automated systems management sometimes creates divergence on purpose. Think of secret keys or access tokens which must be managed to a degree, but must not be known to any outsider. Thus, they are not directly part of a described system state.
Many popular system management tools like Puppet, Chef, CFEngine, and Ansible employ convergent mechanisms. The target state is usually described in a declarative language. System descriptions are fed into an agent on the target system. For every language element, the agent knows how to determine deviations and apply corrective actions. The declarative nature of the configuration language implies that the actual order of operations performed on the target system is left to the agent’s scheduler. Not all corrective actions are expected to succeed in the first run; tools are built around the expectation that each run brings the system closer to its desired state and the delta eventually converges to an empty set.
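The run-until-converged behaviour can be sketched as follows. This is a minimal model and not the implementation of Puppet, Chef, CFEngine, or Ansible: the `Resource` class, the `requires` relation, and the resource names are assumptions made for the example. The scheduler order is deliberately unfavourable so that some corrective actions fail in the first run.

```python
# Hypothetical resources: each knows how to detect a deviation and try to fix it.

class Resource:
    def __init__(self, name, requires=()):
        self.name = name
        self.requires = tuple(requires)

    def in_sync(self, state):
        return self.name in state

    def fix(self, state):
        # The corrective action fails while a prerequisite is still missing.
        if all(req in state for req in self.requires):
            state.add(self.name)
            return True
        return False


def agent_run(resources, state):
    """One agent run: try to fix deviations, return the names still deviating."""
    for res in resources:
        if not res.in_sync(state):
            res.fix(state)
    return [res.name for res in resources if not res.in_sync(state)]


def converge(resources, state, max_runs=10):
    """Repeat runs until the delta is empty; return the delta after each run."""
    history = []
    for _ in range(max_runs):
        delta = agent_run(resources, state)
        history.append(delta)
        if not delta:
            break
    return history


# The service is scheduled before its config, the config before its package.
resources = [
    Resource("service:nginx", requires=["config:nginx"]),
    Resource("config:nginx", requires=["package:nginx"]),
    Resource("package:nginx"),
]
state = set()
history = converge(resources, state)
for run, delta in enumerate(history, start=1):
    print(run, delta)
```

With this ordering the delta shrinks by one resource per run and is empty after the third run. The `max_runs` guard is there because nothing in the model guarantees that a delta ever becomes empty.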
The rise of cloud infrastructures gave machine imaging tools new popularity. Commonly used image formats like AMI describe virtual machines nearly completely and can be seen as congruent management. This holds true as long as those virtual machines don’t modify their filesystem during run time. The very essence of Immutable Infrastructure is to never touch virtual machines once they are running. If something needs to be changed, create a new image, start new VMs, shut down old VMs. Container images are based on the same principles. Image build descriptions like a Dockerfile start with a baseline image and apply a defined list of modifications on top of that.
The new kid on the block, NixOS, follows a largely congruent method but does not rely on imaging. Instead, rigorous control is exercised by generating the whole system out of a description written in a functional expression language. Mechanisms like checksumming all build artefacts together with their whole dependency trees, using read-only installation trees by default, and using symlinks which point into the currently active installation root ensure that the installation is always close to a well-defined target state.
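The idea of checksumming an artefact together with its whole dependency tree can be sketched in a few lines. This is a simplified illustration and not the actual hashing scheme used by Nix: the `store_name` function, the recipe strings, and the version numbers are invented for the example.

```python
import hashlib


def store_name(name, packages):
    """Return '<hash>-<name>'; the hash covers the recipe and all dependencies."""
    recipe, deps = packages[name]
    h = hashlib.sha256()
    h.update(recipe.encode())
    for dep in sorted(deps):
        # Including each dependency's name (and thus its hash) makes a change
        # anywhere in the dependency tree propagate to every dependent.
        h.update(store_name(dep, packages).encode())
    return f"{h.hexdigest()[:12]}-{name}"


# Hypothetical package set: name -> (build recipe, direct dependencies).
packages = {
    "openssl": ("build openssl 1.0.2", []),
    "nginx": ("build nginx 1.8", ["openssl"]),
    "motd": ("write /etc/motd", []),
}
patched = dict(packages, openssl=("build openssl 1.0.2 + security patch", []))

before = {name: store_name(name, packages) for name in packages}
after = {name: store_name(name, patched) for name in patched}

for name in packages:
    print(name, "changed" if before[name] != after[name] else "unchanged")
```

Patching `openssl` changes the name of `nginx` as well, although the `nginx` recipe itself is untouched, while the unrelated `motd` keeps its name. An installation built this way cannot silently mix artefacts from two different dependency trees.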
Which management method is the best? (And: Which tool is the best?)
For a long time, it was clear that divergent management survived mainly in habitats of bad management practices or inferior technical skills (or a combination of both). Convergent management was state of the art, while congruent management was a freaky discipline for aerospace, nuclear, or financial institutions.
Now, a debate is taking place about whether congruent management (in the incarnation of immutable infrastructure) is the solution to all management and deployment problems. While it is arguably the strongest of the three models, it is not adequate in every situation.
Divergent management is sometimes the best way to get a particular system bootstrapped. Other legitimate uses include emergency response (but be sure to clean up after yourself) or (already mentioned) management of secret keys. Obviously, anything outside the scope of systems management like user data or database contents is divergent. In many cases one cannot guarantee that no bit of the managed part of an installation is ever affected by unmanaged data; just think of user actions filling up a disk partition and causing subsequent systems management actions to fail.
Convergent management gets the upper hand when it comes to reacting to external forces. Mark Burgess coined the term computer immunology to describe mechanisms that help computer systems get back into a desired state after interference from effects that are out of our control. Reactions to such external effects may be spinning up more virtual machines as load increases, catching up on areas that have been diverging over time, cleaning up databases, etc. A congruent system advances from one well-defined state to the next one. On unexpected deviations, congruent management is expected to fail, since these are usually signs of bugs, component failures, or security breaches. Convergent mechanisms are always needed to control run-time behaviour: for example, you need them to ensure that daemons are always restarted after their configuration has changed.
A large advantage of congruent management over convergent management is that the latter is susceptible to ordering problems. Each step towards convergence potentially affects the way successive steps are executed. Convergence involves a feedback loop which may get out of control or halt in a deadlock. The great strength of convergent management, being able to react to unforeseen interference, is also its weak point: there is no way to predict the exact steps performed when aiming for convergence. This effect has repeatedly been a source of bad surprises on production systems for us.
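A tiny example makes the ordering problem concrete. The two steps below are hypothetical corrective actions on a configuration, invented for illustration: one edits a single setting in place, the other replaces the whole configuration from a template. Applied in different orders, they leave the machine in different final states.

```python
def set_port(config):
    # Edit one setting in place and keep everything else.
    return dict(config, port="8080")


def reset_from_template(config):
    # Replace the whole configuration with a template carrying the default port.
    return {"port": "80", "workers": "4"}


def apply(steps, config):
    """Apply the steps in the given order and return the final configuration."""
    for step in steps:
        config = step(config)
    return config


start = {"port": "80"}
a = apply([reset_from_template, set_port], start)
b = apply([set_port, reset_from_template], start)

print(a)
print(b)
```

Both runs execute exactly the same set of steps, yet only the first ends up with the port set to 8080. If the order is left to a scheduler, which of the two states you get is not something you can read off the description alone.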
Congruently managed systems, at least when realized as immutable infrastructure, provide only a very coarse-grained mechanism even for minor changes: rebuild. This “nuclear approach” places a burden on the infrastructure and seems to trigger avoidance reflexes. This grows into a real problem when immutable servers are not that immutable at all and in fact become snowflakes. The practice of immutable infrastructure also fails to deliver its congruent promise when the build environment for images is not tightly controlled. For example, Docker images built on different machines are not necessarily identical even when using the same Dockerfile.
In my opinion, every method has its place in any sufficiently complex infrastructure. It is necessary to reflect on which particular piece needs to be managed in which way. We should reach out for congruence where it is possible, but deliberately decide to use convergent or even divergent management where it seems to be superior.
What does this mean for the Flying Circus?
We at Flying Circus try to provide the best platform available for co-managed services. This means that system installations should be as reliable and predictable as possible. But, and this is equally important, we want to make it easy for you to deploy custom applications on virtual machines. This means that virtual machines must be modifiable and long-lived. Immutable machines are clearly out.
For several years, our hosting platform featured convergent management using Puppet and a few scripts which adhere to convergent principles. Combined with controlled baseline images and minimal manual interaction, we thought that this brought us near to congruence. But as our management code base evolved, we missed a critical point outlined in the original paper: Traugott recommends cutting new baseline images from time to time which represent the layers of sysadmin work done in between. Not doing so means that machines running for some time diverge slowly from those bootstrapped recently.
So why did we not cut new baseline images from time to time? In fact we did. But they represent only new releases from our upstream Gentoo distribution. Our baseline image is quite generic. Virtual machines get specialized to their designated roles (web server, database server, …) after first boot. While this kind of flexibility is great, it prevents freezing a new baseline image from a running machine unless you do it separately for every combination of roles.
This problem, among others, motivated us to base our next generation platform on NixOS. With NixOS, we can separate the realms of congruence, convergence, and divergence cleanly within the same machine. We exercise tight control on system installation, manage runtime dependencies and databases in a convergent way and let divergence flourish where appropriate. Moreover, you (as our customer) can also create your own Nix expressions to bring parts of your service deployment under congruent control.
All of this is very exciting. I feel that we are currently pushing platform management to the next level. As our experience with a NixOS-based, co-managed platform grows, I will highlight interesting aspects in future blog posts.