“The Fashionable Negligence of the Day” and the Three Body Problem: Mark Burgess On Our Industry-Wide Problems With Configuration Management (Yet...the Stock Market Still Works?).

More than 30 years after CFEngine, Mark Burgess is still thinking about configuration management. He has a new post on configuration and how it can't be made easy, and in this article I look through some of his more provocative and interesting statements.

Three body problem and plates in the air

I always say that electronic computers are relatively new, maybe 90 years old or so. Configuration management is one of those areas of computing that has a bit of (relative) history. If you've worked in something like system administration or DevOps, you've probably had experience with configuration management tools. Terraform is a common and popular one now; it works at a higher level of abstraction than most other automation tools, because it interfaces with public cloud APIs to build and configure objects.

Before Terraform, of course, there were other tools, many of which were designed to configure certain text files and start services on a plain old Linux host: what Burgess calls "software settings". These tools will put a file somewhere, add some configuration to it, start a service, and so on, to build some sort of larger system.

However, like many things in computing, configuration management is a lot harder than it looks. Even defining and agreeing on what "configuration" means is difficult. Add to that the idea of looking at the configuration of a particular setting over a period of time, rather than as a static state, and the complexity explodes.

But back to history: one of the first tools for managing software settings was CFEngine, created by Mark Burgess.

The CFEngine project began in 1993 as a way for author Mark Burgess (then a post-doctoral fellow of the Royal Society at Oslo University, Norway) to get his work done by automating the management of a small group of workstations in the Department of Theoretical Physics. Burgess managed Unix workstations, scripting and fixing problems for users manually. Scripting took too much time, the flavours of Unix were significantly different, and scripts had to be maintained for multiple platforms, drowning in exception logic. After discussing the problems with a colleague, Burgess wrote the first version of CFEngine (the configuration engine) which was published as an internal report[4] and presented at the CERN computing conference. It gained significant attention from a wider community because it hid platform differences using a domain-specific language. - https://en.wikipedia.org/wiki/CFEngine

CFEngine paved the way for Chef, Puppet, Ansible, Terraform, etc. But where are we now?

30 Years Later...

30-plus years after CFEngine, Burgess is still thinking about configuration management. His new post, linked below, is about configuration and why it can't be made simple.

Why can’t “configuration” be made simple?
A (rather long) view on the configuration-compliance-coherence trust problem in IT

It's a wide-ranging work, and here I present some of the more provocative and interesting statements he makes. The emphasis in some of the quotes is my own. You could spend weeks just learning the themes behind these quotations.

Technology and Culture

There are some hot takes in here. 😄

  • "More spuriously, technology has become a fashion accessory rather than an engineering decision for many in the 21st century"
  • "There have never been more companies trying to reinvent existing tech than we see today. Many of them fail because they ignore the work that was done in the past and try to enforce models that cannot be enforced. We don’t seem to be able to learn from the past. Stubborn pride?"
  • "Engineers tend to choose products based on brand rather than technical analysis."
  • "In the 21st century, engineers want to be part of a tribe."
  • "The IT community can't even agree on what configuration means."
  • "Engineers don’t always make the right choices for users."
  • "Somewhat bizarrely, we use loose and fluffy ideas like quorum (borrowed from human management) to determine crucial correctness means for critical outcomes (i.e. a majority vote by possibly inconsistent opinion holders)."
  • "From prior research, we know that decision selection is a formally hard problem, yet technologists increasingly reject the science and pursue their own priorities."
  • "When I reviewed and edited the SRE books for Google, it was clear that the authors were making up the technical stuff as they went along."

Static Patterns

Our production systems are not static. They change over time, often altering themselves without any action from us, whatever we might think.

  • "Too often, with infrastructure, we imagine a fixture rather than a part of an adaptive process. The same is true in IT."

This probably has a lot to do with our desire to abstract:

  • "A common solution is to impose an artificial simplicity onto software and users as a condition of use, in order to get on with more interesting aspects of the code."
  • "In IT, most approaches to configuration deal only with static patterns: data fixed once, for all, and forever at the beginning of an installation. Once set, we don’t expect to touch these values again. But this ignores drift, erosion, and other maintenance issues like garbage collection that unintentionally change the conditions of the system."
  • "The problem of managing real time change in “complete computer systems” is still woefully neglected, even after 30 years of work. The result is that, every year some new emergency configuration language or system gets invented to “solve” the immediate problems that occur due to whatever happens to be the fashionable negligence of the day."
  • "The Three Body Problem in physics is already too complex to be fully reliable. So we don’t expect even initially simple configurations to remain simple over time, as we do in IT."
  • "In IT, most people think that configuration only refers to the initial conditions of a trajectory."
  • "A static representation (call it a specification) of an intended state either at some reference time, usually the start or the end (e.g. the musical score or computer preferences), and the process by which that intent is realised or evolved (the playing or execution algorithm)."
  • "Configuration is once again treated like a paint job, or a cinematic screen projection, rather than a healing operation."
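
To make the "paint job" versus "healing operation" contrast a little more concrete, here is a minimal Python sketch of my own. It is not taken from Burgess' post or from any real tool, and the setting names and values are invented. It assumes configuration can be modelled as a plain dictionary, then shows drift appearing after installation and being repaired by comparing against the intended state.

```python
# Hypothetical intended state; the keys and values are invented for illustration.
desired = {"max_connections": "100", "log_level": "info"}


def find_drift(desired, observed):
    """Return {key: (observed_value, intended_value)} for settings that differ."""
    return {
        key: (observed.get(key), want)
        for key, want in desired.items()
        if observed.get(key) != want
    }


def repair(desired, observed):
    """Return a healed copy of observed; keys we state no intent about are left alone."""
    healed = dict(observed)
    healed.update(desired)
    return healed


# Right after installation the system matches the intent (the "paint job").
installed = dict(desired)

# Later, things change without us: a setting is altered and clutter accumulates.
drifted = dict(installed)
drifted["log_level"] = "debug"
drifted["tmp_files"] = "4821"

drift = find_drift(desired, drifted)
healed = repair(desired, drifted)

print(drift)                        # only log_level is reported as drifted
print(find_drift(desired, healed))  # empty after the repair
```

The point of the sketch is that `find_drift` and `repair` are meant to be run repeatedly over the life of the system, not once at install time. Note that the stray `tmp_files` entry survives the repair, which is roughly the garbage collection problem the quotes mention.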

Intent

"Uploading a pattern" sounds like a cool startup idea. 😸

  • "It might seem that communicating intent is “just about uploading a pattern”, but it’s also about coherence of design at all scales, especially when it serves society at large."

I like this set of definitions for declarative and imperative:

  • "A declarative language attempts to describe an intended outcome as a reference for later audit. An imperative language is a way to describe a precise set of actions to be followed without improvisation or interpretation."
  • "Configuration language is not trying to express poetry in many colours, but it needs at least one way to express whatever we intend to create, in every context."
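
One way to read those two definitions, sketched in Python. This is my own illustration under stated assumptions, not Burgess' formalism: the config file is modelled as a list of `key value` lines, and the function names are hypothetical.

```python
def imperative_steps(lines):
    """Imperative: carry out the action exactly, whatever the starting state."""
    return lines + ["PermitRootLogin no"]


def ensure_setting(lines, key, value):
    """Declarative: describe the outcome and change only what differs from it."""
    wanted = f"{key} {value}"
    result, found = [], False
    for line in lines:
        if line.split(" ", 1)[0] == key:
            if not found:
                result.append(wanted)  # replace the first occurrence in place
                found = True
            # any further lines for the same key are dropped
        else:
            result.append(line)
    if not found:
        result.append(wanted)
    return result


def complies(lines, key, value):
    """Audit: is the intended outcome true of the file right now?"""
    return [l for l in lines if l.split(" ", 1)[0] == key] == [f"{key} {value}"]


start = ["Port 22", "PermitRootLogin yes"]

twice_imperative = imperative_steps(imperative_steps(start))
twice_declarative = ensure_setting(
    ensure_setting(start, "PermitRootLogin", "no"), "PermitRootLogin", "no"
)

print(twice_imperative)
print(twice_declarative)
```

Running the imperative steps twice leaves conflicting and duplicated lines, while the declarative version converges on the same file however many times it runs, and `complies` can check that outcome later without knowing how the file got there.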

Regulation

It would seem Burgess expects more regulation for IT systems.

  • "With newer regulations like DORA, European businesses are now expected to be able to explain this as part of their risk plans by law."
  • "...regulation is coming, from the highest levels to ensure this is not left to the whim of companies with other agendas. Resistance is futile."

Promise Theory and Complexity

Promise Theory starts to get into the academic side of configuration.

A Theory of Voluntary Cooperation
  • "In a promise decomposition, every independent resource would be represented by a distinct agent, and every possible variation would be a promise. A computer is an agent, in one sense, but it is also made of many smaller agents including files with static or variable content, static or variable software, and static or variable tasks running."
  • "All these ideas have been brought together in a general description of intent and state that I call Semantic Spacetime that began with Promise Theory."

I very much like the idea that what we think of as "configuration" is really a kind of agent.

  • "The basic contention in Promise Theory is that each agent can only do what it can promise to do."
  • "What we call an agent in IT depends on the scale of what we’re doing and the isolation of process that implies. The focus has shifted over the decades, due to the economics of shared computing"
  • "What sabotages simplicity is not the use of a language per se, but rather the vast number of order-dependent degrees of freedom involved in making changes to data representations like text files."
  • "To eliminate ambiguities and freedoms, one has to rely on fault tolerance by design to work around this."
  • "The backlash against using expressive languages to describe configuration, which began around a decade ago, saw a return to pure data list formats like YAML and JSON where semantics were entirely implicit, and parameterized patterns were removed. People argued that all the target files for software were just data, so why not make the specification pure data too? For microservices, these were relatively small, so manageable by a two pizza team."
  • "By breaking up processes into independent agents, microservices follow sound Promise Theory reasoning, but stop short of explaining how the consistency of the whole can be assured. There is no language of cooperation except the socialisation of DevOps to keep that promise."
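
The "order-dependent degrees of freedom" point can be shown in a few lines of Python. This is a toy sketch of my own, with invented edits and setting names: two perfectly reasonable text edits give different files depending on the order they run in, while independent per-key promises (my loose stand-in for Promise Theory agents, assuming each key is promised only once) give the same result in any order.

```python
from itertools import permutations


def append_debug(text):
    """Text edit 1: add a new flag at the end of the file."""
    return text + "debug=true\n"


def disable_flags(text):
    """Text edit 2: turn off every flag that is currently on."""
    return text.replace("=true", "=false")


def apply_all(edits, text):
    """Apply a sequence of text edits in the given order."""
    for edit in edits:
        text = edit(text)
    return text


start = "cache=true\n"
text_outcomes = {
    apply_all(order, start) for order in permutations([append_debug, disable_flags])
}


def keep_promises(promises, settings):
    """Apply independent (key, value) promises to a copy of the settings."""
    settings = dict(settings)
    for key, value in promises:
        settings[key] = value
    return settings


promises = [("cache", "false"), ("debug", "true")]
keyed_outcomes = {
    tuple(sorted(keep_promises(order, {"cache": "true"}).items()))
    for order in permutations(promises)
}

print(len(text_outcomes))   # number of distinct files from the text edits
print(len(keyed_outcomes))  # number of distinct results from the keyed promises
```

The same two text edits produce two different files, one with `debug=false` and one with `debug=true`. The keyed version has only one possible outcome, because each promise touches its own resource and nothing else.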

Garbage Collection and Biological Systems

  • "...when a biological system fails to perform its function due to changes getting out of balance, the entity voluntarily dies and is recycled (called apoptosis). This is the model we have begun to adopt for cloud computing."
  • "If making a new one is cheaper than repairing the old, we use garbage collection to eliminate faults. This too is an overhead at both design and execution time."

Over-Simplification

Computing is super complicated. We keep trying to simplify it, but we can't "decomplexify" it: we can't destroy the complexity, only abstract it away. And there are situations where this abstraction, this avoidance of complexity, may cause more problems than it solves.

  • "As engineers, we don’t want too much fuss. We start out hoping for maximum simplicity: an economy of effort over the lifecycle of software."
  • "Are we coming full circle with configuration? There’s been a few attempts to equip versions of YAML with more powerful introspective features lately, and improved compression for JSON in protocol messages."
  • "Our narrative in computing is to keep everything simple, rather than be accurate or sustainable. Einstein famously said: everything should be made as simple as possible, but no simpler! This is where we go wrong in IT. We oversimplify because of laziness and the race to get to more exciting issues."
  • "IT languages fail to describe intentions well. IT languages fail to describe change well. They are designed for imposition not repair."
  • "For some, the answer to everything is AI. For the time being, AI is a non-issue. AI can’t tell us what we want. Or rather, when we begin to allow it then we’ve already lost ourselves."
  • "If we want to make permanent progress, it’s up to the present generation to swallow their pride and embrace knowledge."

Many Plates in the Air

Burgess' lengthy blog post covers decades of work and it's well worth a thorough read. A key part of Burgess' article seems to be that he expects more regulation to be created to ensure that systems are available and understandable to auditors, and that this regulation will drive the development of better configuration tools and processes. That remains to be seen.

Ultimately, I find it hard to reconcile two facts: computers are so complex that even what we think of as "simple configuration" is unfathomably complicated, and yet we seem to be able to keep things running. (Although "keeping things running" is a trillion-dollar business, and those running things are often "surprise!" encrypted by ransomware gangs.) I don't know how to explain it. Countless lines of code and masses of hardware, all riddled with bugs... and yet the stock market still works. Most of the time. Maybe that's why we don't look back: we're just afraid to find out what's really going on under the hood. Maybe one day we won't have a choice?

Thank you for reading TIDAL SERIES. Please consider forwarding it on to your friends!

Subscribe to Tidal Series by Curtis Collicutt
