Disclaimer: I’m a Terraform fanboy (despite a fairly dry year or two), so this ramble is very Terraform-centric. Other Configuration Management tools are available.

Configuration Management is a Good Thing

Just in case anyone needs a quick refresher on why configuration management is a Good Thing, here are my highlights:

  • It forces process - changes to configuration are controlled through PR gates and pipelines. In addition, you get all the useful webhooks your repositories can fire, such as event notifications.
  • There’s a record of what’s there - “documentation as code” is much maligned (and often rightly so), but for a lot of configuration the big picture is easy to put together from the core pieces. Design your APIs right with custom extensions and modules, and that complexity can be reasonable to handle too. Terraform also has a graph subcommand that outputs the dependency graph of your infrastructure in DOT format, which Graphviz can render as a diagram.
  • There’s a record of what’s happened - Git history, pipeline logs etc. all show who did what and when. Reverting commits can quickly return things to the last known good state. Reports can be generated.

Not only are these great to have for keeping track of what you’re running, they also help with onboarding new staff. Reading documentation is boring - reading and potentially playing about with configuration management (not in prod, obviously!) is much more engaging.

It’s not just marshalling other people’s computers

Configuration management goes beyond traditional cloud infrastructure. There are plenty of providers for Terraform that let you define how you want to configure things like your Twilio subscription, Grafana installations, or almost anything that has an API. And if there isn’t a provider, the tools to make your own are readily available.
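
To make that concrete, here is a minimal sketch of managing a Grafana folder and dashboard from Terraform. It assumes the community grafana/grafana provider; the variable names, folder title and dashboard contents are made up for illustration, and the exact attribute names may differ between provider versions, so check the provider documentation before copying anything.

```hcl
terraform {
  required_providers {
    grafana = {
      source = "grafana/grafana"
    }
  }
}

# Hypothetical inputs: where Grafana lives and how to authenticate against it.
variable "grafana_url" {
  type = string
}

variable "grafana_auth" {
  type      = string
  sensitive = true
}

provider "grafana" {
  url  = var.grafana_url
  auth = var.grafana_auth
}

# A folder to keep the team's dashboards together (illustrative title).
resource "grafana_folder" "team" {
  title = "Platform Team"
}

# A deliberately tiny dashboard; a real one would usually load its JSON from a file.
resource "grafana_dashboard" "overview" {
  folder = grafana_folder.team.uid
  config_json = jsonencode({
    title = "Service Overview"
    uid   = "service-overview"
  })
}
```

The point is less the dashboard itself and more that a change to it now goes through the same PR, plan and apply cycle as the rest of your infrastructure.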

This is great because it allows us to get all the benefits listed above but applied to SaaS products. No more lengthy procedures for changes, no more configuration drift.

However, these providers are often the source of one of the main criticisms levelled at Terraform: the delay between a feature being available in an API and the feature being available as part of the provider. This can be especially true for burgeoning services - companies will be rolling out features quicker than the one dev who’s currently trying to maintain the Terraform provider in their spare time can cope with.

So what’s the problem?

The issue I’ve seen a few times in the wild is a reluctance to get a service’s configuration defined in something like Terraform when the provider coverage might not be complete. The argument is that you can end up stuck in a netherworld of needing new configuration management processes and tooling alongside traditional documented steps for manual changes. This can be extra confusing for people new to working with the service that’s being implemented, as figuring out how to change any given component isn’t just a matter of “update the code”.

However, if you do adopt that shiny new Terraform provider that’s being supplied by a company to help you manage their SaaS offering, you’re helping them iron out bugs, improve documentation, and prioritise features. If the provider improves, then adoption improves, and the feedback loop is established.

Sometimes the missing components or resources are the big hairy features like SSO, but these are often day-zero “one and done” configuration jobs. So long as you document the manual changes made, it shouldn’t be a barrier to defining other parts of the service in code.

Conclusion

This all essentially boils down to the old adage that “perfection is the enemy of good”. Some configuration management alongside a few hacks or well-documented manual steps is better than only manual steps. The later you pay that technical debt, the more expensive it will be.