Infrastructure as Code (IaC) is an approach in which you describe infrastructure as code and then apply that code to make the necessary changes. IaC does not say how exactly to write the code, though; it only provides the tools. One such tool is Terraform.
No. 1. Readability
Infrastructure code must be readable. Then your colleagues can easily understand it and, if necessary, extend or test it. This seems elementary, but it is often forgotten, and the result is “write-only code”: code that can be written but cannot be read. Even its author is unlikely to understand, a couple of days later, what they wrote and how it all works.
An example of good practice is to put all variables in a separate file. This is convenient because they do not have to be searched throughout the code. You just open the file and immediately find what you need.
No. 2. Writing style
You need to adhere to a consistent coding style. For example, limit lines of code to 80-120 characters. If lines are very long, the editor starts wrapping them. Wrapped lines break up the overall layout and make the code harder to understand. You end up spending a lot of time just figuring out where a line starts and where it ends.
It is good when style checks are automated. You can use a CI/CD pipeline for this. One of its steps should be linting: static analysis of the code that helps identify potential problems before the code is applied.
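As a rough illustration of what such a lint step does, here is a minimal sketch of a line-length check. The function names and the 120-character limit are assumptions chosen for this example; a real pipeline would more likely rely on an existing linter than on a hand-written script.

```python
from pathlib import Path

MAX_LINE_LENGTH = 120  # assumed upper bound, taken from the 80-120 range above


def find_long_lines(text: str, limit: int = MAX_LINE_LENGTH) -> list[tuple[int, int]]:
    """Return (line_number, length) for every line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]


def main(paths: list[str]) -> int:
    """Check each file; return 1 if any line is too long, otherwise 0."""
    failed = False
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        for number, length in find_long_lines(text):
            print(f"{path}:{number}: {length} characters (limit {MAX_LINE_LENGTH})")
            failed = True
    return 1 if failed else 0
```

A pipeline step could call `main` with the list of changed files and fail the build when it returns a non-zero value.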
No. 3. Working with repositories
Work with repositories the way developers do. It is important to develop in new branches, link branches to tasks, review what has been written, open a pull request before changes are merged, and so on.
From an operations engineer's point of view, these steps may seem redundant: it is common practice for people to simply come and start committing. With a small team this might still work, although even then it is hard to tell who made which changes, when, and why. As the project grows, such habits make it increasingly hard to understand what is happening and get in the way of work. It is therefore worth learning how developers work with repositories.
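One way to make the link between branches and tasks checkable is a naming convention. The sketch below assumes a hypothetical convention of the form `<type>/<TASK-ID>-<description>`; both the pattern and the function name are invented for illustration, and your tracker may use a different ID format.

```python
import re
from typing import Optional

# Hypothetical convention: feature/INFRA-123-add-vpc or fix/OPS-7-rotate-keys
BRANCH_PATTERN = re.compile(r"^(feature|fix)/([A-Z]+-\d+)-[a-z0-9-]+$")


def task_id_from_branch(branch: str) -> Optional[str]:
    """Return the task ID embedded in the branch name, or None if there is none."""
    match = BRANCH_PATTERN.match(branch)
    return match.group(2) if match else None
```

A check like this can run on every pull request, so a branch that is not tied to a task is noticed before review starts.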
No. 4. Automation
Infrastructure as Code tools are closely associated with DevOps. DevOps engineers not only handle operations but also help developers work: they set up pipelines, automate test runs, and so on. All of this applies to IaC as well.
In Infrastructure as Code, automation should be applied too: lint rules, testing, automatic releases, and so on.
When there are repositories with Ansible or Terraform code but the code is rolled out manually (an engineer just comes and starts the task), this is not great. First, it is hard to track who launched it, why, and when. Second, it is impossible to see how the run went and draw conclusions.
With everything in the repository and controlled by an automatic CI/CD pipeline, we can always see when the pipeline was launched and how it performed. We can control the parallel execution of pipelines, identify the causes of failures, quickly find errors, and much more.
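To make the "who, why, when, and how did it go" point concrete, here is a small sketch of the kind of record an automated run leaves behind. `RunRecord` and `run_and_record` are hypothetical names for this example; real CI/CD systems keep this history for you.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class RunRecord:
    triggered_by: str     # who launched the run
    reason: str           # why, for example a task ID
    started_at: datetime  # when
    succeeded: bool       # how it went
    detail: str           # error message, or "ok"


def run_and_record(task: Callable[[], None], triggered_by: str, reason: str,
                   history: list[RunRecord]) -> RunRecord:
    """Run `task`, capture the outcome, and append it to `history`."""
    started_at = datetime.now(timezone.utc)
    try:
        task()
    except Exception as error:
        record = RunRecord(triggered_by, reason, started_at, False, str(error))
    else:
        record = RunRecord(triggered_by, reason, started_at, True, "ok")
    history.append(record)
    return record
```

With a manual rollout none of these fields are captured anywhere, which is exactly why it is hard to reconstruct what happened afterwards.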
No. 5. Testing
You often hear from operations engineers that they do not test their code at all, or that they simply run it on dev first. This is not the best way to test, because nothing guarantees that dev matches prod. With Ansible or other configuration tools, typical testing looks something like this:
- the code is run on dev;
- the rollout to dev crashes with an error;
- the error is fixed;
- the test is not rerun from scratch, because dev is already in the state the code was meant to bring it to.
It seems that the error has been fixed and you can roll out to prod. What will happen to prod? It is always a matter of luck, hit or miss. If something fails again somewhere in the middle, the error gets fixed and everything is restarted.
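The trap in the steps above can be shown with a toy model. In this sketch a host is just a set of strings, and `apply_role` is a hypothetical role with a hidden dependency on something that happens to exist on dev already. Nothing here is Ansible code; it only illustrates why a run against a leftover state proves little.

```python
def apply_role(host: set[str]) -> None:
    """Hypothetical role: fails unless 'data_dir' is already present on the host."""
    host.add("package_installed")
    # Hidden assumption: the role never creates data_dir itself.
    if "data_dir" not in host:
        raise RuntimeError("data_dir is missing")
    host.add("service_running")


def passes_on(host: set[str]) -> bool:
    """Apply the role to the given host and report whether it succeeded."""
    try:
        apply_role(host)
    except RuntimeError:
        return False
    return True


dev = {"data_dir"}        # left over from an earlier run or manual debugging
prod: set[str] = set()    # clean host that has never seen the role

dev_ok = passes_on(dev)    # succeeds, but only thanks to the leftover state
prod_ok = passes_on(prod)  # fails, because prod starts from scratch
```

The role "passes" on dev and still breaks on prod, so the only run that says anything about prod is one that starts from a clean state.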
But infrastructure code can and should be tested. Yet even when specialists know about different testing methods, they often still cannot use them. The reason is that Ansible roles or Terraform files are written without any initial thought to the fact that they will need to be tested.
When developers write code, they think about what needs to be tested and, accordingly, plan how they will test it before they start writing. Untested code is low-quality code.
The same applies to infrastructure code: once it is written, you should be able to test it. Tests reduce the number of errors and make life easier for colleagues who later modify your Ansible roles or Terraform files.
A few final words about automation
A common situation with Ansible is that even when something can be tested, the testing is not automated. Usually it goes like this: someone creates a virtual machine, takes a role written by colleagues, and runs it. Then they decide that this and that needs to be added, make the changes, and run the role on the same virtual machine again. Then they realize that more changes are needed and that the current virtual machine has already been brought to some intermediate state, so it has to be destroyed, a new one created, and the role applied to it. And if something does not work, this cycle has to be repeated until all errors are eliminated.
Usually the human factor kicks in: after the nth repetition, people get too lazy to delete the virtual machine and create it again. Everything seems to work as it should this time, so the changes are frozen and rolled out to prod. In reality, errors can still occur, which is why automation is needed. It runs in automated pipelines, reacts to new pull requests, and helps to identify bugs quickly and prevent them from occurring.
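The create, apply, check, destroy cycle is exactly the part a machine never gets lazy about. Below is a minimal sketch of that loop, with the environment modeled as a plain dictionary; `disposable_environment` and `run_check` are hypothetical names, and in practice the create and destroy steps would call whatever provisions your virtual machines.

```python
from contextlib import contextmanager
from typing import Callable, Iterator

Environment = dict[str, str]


@contextmanager
def disposable_environment(log: list[str]) -> Iterator[Environment]:
    """Yield a fresh, empty environment and always destroy it afterwards."""
    log.append("create")
    environment: Environment = {}
    try:
        yield environment
    finally:
        environment.clear()
        log.append("destroy")


def run_check(role: Callable[[Environment], None],
              verify: Callable[[Environment], bool],
              log: list[str]) -> bool:
    """Apply `role` to a brand-new environment and verify the result."""
    with disposable_environment(log) as environment:
        try:
            role(environment)
        except Exception as error:
            log.append(f"role failed: {error}")
            return False
        return verify(environment)
```

Because the environment is destroyed in the `finally` branch, every run starts from scratch whether the previous one passed or failed, which is the discipline that tends to slip when the cycle is done by hand.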