I dedicated my last article to exposing the pervasive effect of maintenance cost on product development speed. (I recommend reading it first to better understand the concepts discussed in this one.)
But it would not be complete without some recommendations on how to keep that cost as low as possible. Let’s review 5 options:
1. DevOps and Automation
A big part of the maintenance efforts any team has to endure is related to working on the production environment.
I believe the impact of DevOps efforts and mindset is quite easy to observe in the software construction phase, but its profound impact on maintenance cost is often overlooked.
Since the focus of DevOps is on automating repetitive construction and maintenance activities (deployment, configuration, testing, telemetry, alerts, and self-healing, among others), it follows almost directly that it reduces the maintenance cost.
Let’s see a few quick examples:
- When deployments are not automated, they are not only costly but also more error-prone. This means bugs will appear in production and will be hard to track (for instance, a configuration was miscopied on one server, creating incorrect behaviour that looks random when you observe the group of N servers your application runs on).
- When testing is not automated and you have to maintain an aged and complex system, you need to run an exhaustive battery of manual regression tests that consumes a lot of time in each maintenance event.
- When telemetry and alerts are not automated, you carry the fixed cost of many people watching the system metrics 24×7. With poor telemetry, you will also likely receive a high number of user complaints, which will probably translate into lost revenue.
- When your system does not have self-healing mechanisms, each system degradation requires the involvement of the team, sharply increasing the maintenance cost.
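The miscopied-configuration case from the first bullet is the kind of check that is cheap to automate. Below is a minimal sketch, not a definitive implementation: the function name `find_config_drift`, the server names and the settings are all hypothetical, and it assumes that the value most servers agree on is the intended one.

```python
from collections import Counter
from typing import Dict, Optional

# Hypothetical shape: server name -> {setting name -> value}
FleetConfig = Dict[str, Dict[str, str]]


def find_config_drift(fleet: FleetConfig) -> Dict[str, Dict[str, Optional[str]]]:
    """Return, per server, the settings that differ from the fleet's most common value."""
    drift: Dict[str, Dict[str, Optional[str]]] = {}
    all_keys = {key for settings in fleet.values() for key in settings}
    for key in sorted(all_keys):
        # Assumption: the value most servers agree on is the intended one.
        counts = Counter(settings.get(key) for settings in fleet.values())
        expected = counts.most_common(1)[0][0]
        for server, settings in fleet.items():
            if settings.get(key) != expected:
                # A missing setting is reported as None.
                drift.setdefault(server, {})[key] = settings.get(key)
    return drift


# Illustrative data: one of three servers has a miscopied timeout.
fleet = {
    "web-1": {"timeout": "30", "cache": "on"},
    "web-2": {"timeout": "30", "cache": "on"},
    "web-3": {"timeout": "300", "cache": "on"},
}
print(find_config_drift(fleet))  # {'web-3': {'timeout': '300'}}
```

Run after every deployment, a check like this turns a "random" production bug into an immediate, named difference on a single server.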
DevOps is likely the most powerful tool to reduce maintenance cost and constantly keep it low.
2. Ruthlessly Simplify your Product
Probably because they are unaware of the maintenance cost, almost no one ever asks to remove a feature (including 99% of product managers).
Since the time it takes to maintain a piece of software is usually ignored, removing something sounds as if it would hurt the business without having any positive impact on the team’s performance.
For that reason, we keep in the product some features that have almost no use or no business impact. We say “Yes, it only represents 1% of revenue, but better 1% than 0%!”.
I hope by now you see that this reasoning is quite wrong. Consider a feature that took 2 months to complete and had very low impact after launch. Even if you decide to do nothing about it, it will consume at least another 1.5 months of the team’s time in future maintenance (considering the average 75% maintenance cost we reviewed in the last article).
If, instead of killing that feature, you spend another 2 months writing another feature, you now have 3 months of maintenance cost ahead of you (1.5 months for each feature)… and so the cycle continues.
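The arithmetic behind those numbers can be written down in a few lines. This is only a sketch of the reasoning above: the 75% ratio is the average from the last article, while the function name and the idea of summing build times per feature are my own simplification.

```python
from typing import List

# Average maintenance cost relative to build cost, as cited from the last article.
MAINTENANCE_RATIO = 0.75


def future_maintenance_months(build_months: List[float],
                              ratio: float = MAINTENANCE_RATIO) -> float:
    """Estimate future maintenance for all features kept in the product."""
    return sum(build_months) * ratio


# One low-impact feature that took 2 months to build and is kept alive.
print(future_maintenance_months([2]))     # 1.5

# The same feature kept, plus a second 2-month feature on top of it.
print(future_maintenance_months([2, 2]))  # 3.0
```

Removing the low-impact feature before building the next one keeps the estimate at 1.5 months instead of 3.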
You are slowly killing your team’s ability to add value.
Considering all the factors that add complexity to the product exponentially as it grows, every piece of software and line of code you are able to remove makes a huge difference in reducing the effort needed to maintain the things that do add value.
Simplify your product as much as possible and let your team add value by not spending a single maintenance minute on the things that don’t.
(And this value quantification does not even consider the benefit to users of having a simpler product!)
3. Architecture supporting team autonomy and low maintenance costs
One of the huge advantages of modern modular software architectures is that they reduce dependencies and maintenance cost.
In previous decades, software was built as big unified pieces known as monoliths. This implied (among other things) that a change introduced in one portion of the system could have unpredictable effects everywhere. Likewise, when a large team was working on the software, there was little chance to work independently, and when something was finished, long integration and test periods were required, which increased the cost per release.
Of course, the shift to the modular paradigm has many implications, but we can focus specifically on its impact on maintenance cost.
It basically tackles the “exponential factor” of scope and complexity growth mentioned in the last article. A well-defined modular architecture allows for:
- Less deployment and testing effort: each maintenance event will probably affect only a single module, so the cost of testing and deploying is reduced to just that module instead of the entire system.
- Fewer dependencies: a huge pain in large systems is when more than one team needs to be involved to solve a particular maintenance event. You need to sync priorities, deployment windows, testing efforts, etcetera. This is costly and particularly slow (thus increasing the cost of delay). Good modular architectures allow for team isolation and autonomy, letting teams work on their parts of the system without needing or interrupting others.
- Lower cost of analysis: in old complex systems, finding what code needed to be modified, and where, was slow and painful; in modular systems it is much faster to analyze the particular module responsible for the change we need to make.
- Lower cost of “new use cases”: since modules are reused in a modular system, whenever you need to add a small new use case, updating the module in charge of it automatically propagates the change to all areas of the system that use it.
There are probably many more benefits. Going even further, the automation mentioned in the DevOps section is easier to achieve for smaller modules, so by combining those two factors you get great reductions in how much time teams spend working on production issues.
4. Containers and container handlers
When we consider the types of maintenance events, there are 2 that are more frequent and whose cost containers can lower to practically zero:
- Environment changes (like OS patches)
- Scalability (handling more load)
Containers have many advantages, but let’s again focus on the ones related to maintenance.
Note: I won’t give a proper introduction to the subject. To further understand what a container is, I recommend reading the documentation of the most widespread technology that supports them: Docker.
“A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another” — Docker documentation.
Containers wrap your application and let you run it without having to worry about the “context” it runs in. By doing so, you eliminate the first maintenance type mentioned, since there aren’t any unmanaged environment changes. You define your OS and whatever else you need to run your code inside the package, and it doesn’t change unless the container image is rebuilt.
On the other hand, even though you always need to keep an eye on your application’s performance for scalability, containers make horizontal scaling really easy. Combined with a cloud environment, containers make it quite easy and fast to spawn new copies of your application and distribute the load among all your instances.
5. Isolate external interfaces or other high-maintenance pieces of your system
This is probably an extension of the modular architecture benefit.
If you have external interfaces and services (where maintenance events are more unpredictable) or parts of your system that are more prone to frequent maintenance, it is good practice to isolate them.
Going back to all the benefits of modular systems, whenever that maintenance is needed, make it as painless as possible by only having to update a small module.
Even further, if you need to choose where to start with the DevOps automation mentioned earlier, this would be a great place to reap the benefits of reduced maintenance cost faster.
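One common way to get this isolation is a thin adapter module that is the only place aware of the external service's format. The sketch below is illustrative only: the exchange-rate vendor, its payload fields (`ccy`, `px`) and every class and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


@dataclass(frozen=True)
class Rate:
    """Internal representation used by the rest of the system."""
    currency: str
    value: float


class RateProvider(Protocol):
    """The only interface the rest of the system depends on."""
    def rate_for(self, currency: str) -> Rate: ...


class VendorRateAdapter:
    """The single module that knows the (hypothetical) vendor payload shape."""

    def __init__(self, fetch: Callable[[str], Dict[str, str]]) -> None:
        # fetch stands in for the real HTTP call to the external service.
        self._fetch = fetch

    def rate_for(self, currency: str) -> Rate:
        raw = self._fetch(currency)
        # If the vendor renames a field, only these lines change.
        return Rate(currency=raw["ccy"], value=float(raw["px"]))


def price_in(currency: str, amount_usd: float, provider: RateProvider) -> float:
    """Business logic: unaware of which vendor sits behind the provider."""
    return amount_usd * provider.rate_for(currency).value


def fake_fetch(currency: str) -> Dict[str, str]:
    """Stand-in for the external call, returning a vendor-shaped payload."""
    return {"ccy": currency, "px": "0.5"}


print(price_in("EUR", 10.0, VendorRateAdapter(fake_fetch)))  # 5.0
```

When the external service changes, the maintenance event is contained in `VendorRateAdapter`, which is also a small, well-bounded target for automated tests and deployment.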
0. Making it visible
“What gets measured gets managed” (attributed to Peter Drucker)
I would like to wrap up by stating the obvious but mostly ignored: the first step to controlling the maintenance cost is making this type of work visible and starting to measure its magnitude.
As discussed in the last article, there are many maintenance work “types”, and those are usually measured separately. They get to the backlog as completely different items (for instance, bugs versus scalability refactors). So, in order to measure total maintenance effort versus total innovation effort, you should tag these items and later review the total amount of maintenance versus non-maintenance items (either as a count of items or as total effort, if you track hours, story points or any other effort sizing). A simple way to do it is with whatever tool you already use to track work, like Jira, Microsoft TFS, or any system that supports tagging tasks and later reporting on those tags.
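As a rough illustration of that report, here is a minimal sketch that works on items exported from any tracker. The tag names, the `WorkItem` fields and the sample backlog are assumptions made for the example, not the export format of Jira or TFS.

```python
from dataclasses import dataclass
from typing import Iterable, Set

# Hypothetical tags your team agrees to treat as maintenance work.
MAINTENANCE_TAGS = {"bug", "scalability", "environment", "security"}


@dataclass
class WorkItem:
    title: str
    effort: float  # hours, story points, or 1.0 to simply count items
    tags: Set[str]


def maintenance_share(items: Iterable[WorkItem]) -> float:
    """Fraction of total effort spent on items carrying a maintenance tag."""
    items = list(items)
    total = sum(item.effort for item in items)
    if total == 0:
        return 0.0
    maintenance = sum(item.effort for item in items
                      if item.tags & MAINTENANCE_TAGS)
    return maintenance / total


# Illustrative backlog for one iteration.
backlog = [
    WorkItem("Fix checkout error", 3, {"bug"}),
    WorkItem("New onboarding flow", 5, {"feature"}),
    WorkItem("Shard the orders table", 2, {"scalability"}),
]
print(maintenance_share(backlog))  # 0.5
```

Tracking this number per iteration shows whether the share of maintenance work is growing, which is the trend the recommendations above aim to reverse.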
Measuring it is the only way to see how bad your problem is 🙂