
Tech debt as risk of friction
Tech debt is a term that’s used a lot in software development, and I recently realised a new way of thinking about it: as the risk of future friction. I’ll explain what I mean below, starting with a brief discussion of what I mean by risk in general.
Risk
Before I get into risk as it applies to tech debt, I ought to lay some groundwork for risk in general. I think that it’s helpful to think of risk as having (at least) two dimensions: likelihood and impact.
There is a risk that I will be hit by a falling satellite when I walk to the shops. If it happens, it will probably kill me (the impact will be large), but fortunately it’s highly unlikely to happen (the likelihood is small). There’s also a risk that when I next load the dishwasher there will be too much dirty stuff for it all to fit in. The likelihood is relatively high, but the impact will be very low – particularly compared to being struck by a falling satellite.
As a graph:
[Figure: the two example risks plotted by likelihood and impact]
If something is both likely and high impact, that’s very bad. If it’s unlikely and low impact, then it’s not so bad. If it’s high in one area and low in the other then it’s medium bad. We’ll return to this shortly, in the context of tech debt.
Tech debt
It can be tempting to think of tech debt as wrongness. It’s how messy the code is, or how long classes and methods are. If tech debt were wrongness, you could imagine some kind of code analyser that looked at source code and gave it a score for how much wrongness (i.e. tech debt) it had. You’d probably need to extend this to measure test quality as well: coverage (although I dislike code coverage being used as anything other than a broad-brush measure), surviving mutants from mutation testing and so on.
I think that tech debt is more complicated than just code and test wrongness. It also has contributions from people and external organisations. To illustrate the last of these, imagine that I wrote some code that was well-structured, well-named and well-tested. This would have little to no tech debt.
This code does its job and just sits there for years without needing to be changed. However, over that time, one of the bits of 3rd party code it depends on has had several new versions released, and the version used by my code is no longer supported. The code in isolation is no more wrong than it was when it was written, and its tests are just as good as before. However, because of my company’s interaction with the supplier of the 3rd party code, it has accumulated some tech debt. We will need to spend time and effort migrating our code to a newer, supported version of the 3rd party code. This may bring no direct benefit to end users, and could be painful and time-consuming if the new versions of the 3rd party code need changes to our code for the two to fit together.
I’ll now give an example of the contribution of people to tech debt. Imagine that there’s a source file that implements some key business logic. The code is in bad shape (poorly-named and poorly-structured) and has few or no tests and little or no documentation. Fortunately, it has not needed to be changed since it was written. I think it’s clear that this file has tech debt.
The cost of this tech debt can increase without the code in the file getting any worse:
- The file goes from never needing to be changed to needing to be changed often, for instance because it covers a functional area that product management has decided needs improvement. If we don’t use those changes to clear up the tech debt (e.g. because of time pressure), the cost of the tech debt is incurred many times: once for each change to the file.
- If the file is understood by only one person (for instance the original author), then whenever that person is ill or on holiday the rest of the team will struggle much more than that person would to make changes.
Given that tech debt can emerge from things outside the source files, such as organisations and people, I don’t think that just wrongness is a helpful enough way of thinking about it. Instead, I suggest looking at it as the risk of friction.
Tech debt as risk of friction
By friction I mean something that makes it harder to get work done. For instance, adding a new feature should take 3 days, but before I can do that I need to spend 5 days adding tests and refactoring code, so the 3-day change ends up taking 8 days. Going beyond just friction to risk of friction means we have two dimensions to tech debt:
- How likely is it to make life hard for me?
- How bad will life be if the tech debt does affect me?
Not all tech debt is equal, and these dimensions can help identify the nastiest bits and highlight different tactics for addressing it. We can reduce likelihood, impact or both.
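To make the two dimensions concrete, here is a minimal sketch that ranks some pieces of tech debt by likelihood and impact. Multiplying the two into a single "expected friction" figure is an assumption made for illustration, not something the argument depends on, and the `DebtItem` class, the item names and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DebtItem:
    """One piece of tech debt, described by the two dimensions of risk."""

    name: str
    likelihood: float   # chance (0 to 1) that the debt gets in the way of upcoming work
    impact_days: float  # extra days of work if it does get in the way

    def expected_friction(self) -> float:
        # Assumption: combine the two dimensions by multiplying them.
        return self.likelihood * self.impact_days


def rank_by_risk(items):
    """Return the items with the largest expected friction first."""
    return sorted(items, key=lambda item: item.expected_friction(), reverse=True)


# Hypothetical figures, purely for illustration.
items = [
    DebtItem("messy file that never changes", likelihood=0.05, impact_days=5),
    DebtItem("messy file that changes often", likelihood=0.9, impact_days=5),
    DebtItem("unsupported 3rd party version", likelihood=0.5, impact_days=15),
]

ranked = rank_by_risk(items)
for item in ranked:
    print(f"{item.name}: {item.expected_friction():.2f} expected days of friction")
```

With these made-up numbers the two messy files have the same impact, but the one that changes often ranks far higher because its likelihood is higher. Lowering either number moves an item down the list.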
The example above of the poorly-written, -tested and -documented file that’s understood only by the original author can be tackled in various ways:
- The author could train the rest of the team on the file and what it does. The code, tests and documentation would be no better than before, but the rest of the team would have a head start in understanding how to change the file. The impact of the tech debt (extra time) would be reduced, even though its likelihood was just as large.
- The author could write documentation for the file, instead of or as well as training the team.
- Tests could be written that are a close fit for existing behaviour, and act as documentation.
- The code could be split into a few chunks (even though each chunk is just as bad internally as the original file was). This will slightly increase how easy it is to understand the code. It will also reduce the likelihood that a given functional change will touch a given chunk of bad code.
The 3rd party code example given above can be tackled with different tactics. The main one is to not let the version in use fall too far behind the most recent supported version. The larger the jump in version numbers between where we are now and where we need to get to, the more likely it is that painful breaking changes need to be accommodated. The same amount of pain is tackled over the long term, but in several small lumps rather than one large lump.
Also, if we use the oldest supported version, then we have work to do as soon as the other organisation decides to push it out of support by releasing a new version. If we keep a buffer of older, still-supported versions behind the one we use, then we have more control over when to upgrade.
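The idea of a version buffer can be sketched in a few lines. This is only an illustration: the function names and the list of supported versions are hypothetical, and it assumes the supplier publishes a list of supported versions that can be ordered from oldest to newest.

```python
def versions_behind(supported, current):
    """Count the supported releases that are newer than the one in use.

    `supported` is ordered oldest to newest. Returns None if `current`
    is not in the list, i.e. it has already fallen out of support.
    """
    if current not in supported:
        return None
    return len(supported) - 1 - supported.index(current)


def upgrade_buffer(supported, current):
    """Count the supported releases that are older than the one in use.

    A buffer of 0 means we are on the oldest supported version, so we have
    work to do as soon as that version is pushed out of support.
    """
    if current not in supported:
        return None
    return supported.index(current)


# Hypothetical list of versions the supplier still supports, oldest first.
supported = ["4.0", "4.1", "5.0", "5.1"]

print(versions_behind(supported, "5.0"))  # size of the jump to the newest version
print(upgrade_buffer(supported, "5.0"))   # older versions that will drop out before ours
print(upgrade_buffer(supported, "4.0"))   # 0: no buffer left
```

A small `versions_behind` keeps each upgrade a small lump of pain, and a non-zero `upgrade_buffer` is what gives us some choice over when to do it.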
Who does it affect?
As an aside, I think it’s important to think about who suffers when there’s tech debt. Developers and their colleagues such as managers will suffer. They will make progress more slowly than if there were no tech debt.
However, it’s important to remember that users almost certainly won’t suffer directly, although they will suffer indirectly. Users don’t care how messy some code is as long as it behaves in the way that the user wants. So, directly, they aren’t affected by tech debt. Something that affects users directly is more accurately described as a bug (or a missing feature).
The indirect effect is an opportunity cost: the time spent dealing with tech debt is time that can’t be spent on developing new features, so the users suffer indirectly due to tech debt.
Tech debt affects developer experience directly, and user experience indirectly.
Conclusion
To sum up: tech debt can be thought of as a risk of friction. Friction makes it harder to make progress on the work you want to do. Not all tech debt is equally bad: some is unlikely to be a problem at all (e.g. because the file hasn’t been changed in ages), and some won’t be a big problem even if it does affect you.
Instead of thinking of tech debt as just wrongness of files, consider all the causes of tech debt, including external dependencies, the passing of time, team skills and availability etc.
This article was inspired by a post on LinkedIn by Michael Drogalis.