Why Continuous Delivery? – Stephen Nimmo

The need for continuous delivery is predicated on the fact that software is continuously deteriorating. From the second the code is checked into source control, entropy begins to set in. Some of this is related to the fundamental nature of the building blocks of software, while the rest is related to the ever-changing environmental needs of the organization. The sooner an organization fully accepts the truth that software is never done, the sooner it can begin to put systems into place to manage the chaos.

What’s Changing?

Software is made of components of other software. It’s a huge set of building blocks, and each one of those building blocks is subject to change. Here is a set of the software changes which have significant effects.

Runtime/Language versioning (Java, Python, .NET, etc.)
Dependency and Framework versioning (Quarkus, Spring, FastAPI, rest-assured, log4j, etc.)
Tool versioning (Tekton, GitHub Actions, Jenkins, JMeter, Selenium, etc.)

These components change for the same reasons that your software changes. New functionality is added. Old functionality is fixed. Security issues are resolved. The reality is that your software is subject not only to your organization’s changes, but also to changes from a very wide array of environmental concerns.

So why is this a big deal?

Software versioning moves in a particular way, especially in the open-source world for dependencies. Most of the dependencies used in mainstream development do a very good job of respecting semantic versioning and, for the most part, of keeping breaking changes out of anything but a major version. But there is a very subtle piece to their delivery that causes the chaos: minor versions can be very short-lived.

Let’s say your software uses a dependency, version 3.1.0, and everything works fine. All of a sudden, six months later, a huge zero-day bug is found in the library and your company is exposed. This needs to get patched ASAP. The maintainers aren’t necessarily going to go back and pick up version 3.1.0 and patch from there. They are going to pick up 3.2.1 and patch that. Why? Because that version has the bug fixes and new functionality from 3.1.1, 3.2.0 and 3.2.1. So now you are forced to very quickly incorporate this new version into your application to fix the zero-day exposure, and you are subject not only to that change, but also to the changes from three other releases, including a minor version update.
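To make the size of that forced jump concrete, here is a minimal sketch in Python. It assumes the fix ships as 3.2.2 (the example above only says the maintainers patch on top of 3.2.1), and the function names are hypothetical, not part of any real tool.

```python
from typing import List, Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Turn 'major.minor.patch' into a tuple that compares correctly."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def releases_absorbed(current: str, target: str, published: List[str]) -> List[str]:
    """Every published release after `current`, up to and including `target`."""
    low, high = parse(current), parse(target)
    return sorted((v for v in published if low < parse(v) <= high), key=parse)


def jump_kind(current: str, target: str) -> str:
    """Name the most significant version component that changes."""
    cur, tgt = parse(current), parse(target)
    if tgt[0] != cur[0]:
        return "major"
    if tgt[1] != cur[1]:
        return "minor"
    if tgt[2] != cur[2]:
        return "patch"
    return "none"


# Release history from the example; 3.2.2 is the assumed security fix.
published = ["3.1.0", "3.1.1", "3.2.0", "3.2.1", "3.2.2"]
absorbed = releases_absorbed("3.1.0", "3.2.2", published)
kind = jump_kind("3.1.0", "3.2.2")
```

Under those assumptions, `absorbed` holds four releases: the three you skipped plus the fix itself, and `kind` is a minor version jump rather than the patch-level change you actually wanted.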

This example is for a single library. Most projects I work with include dozens of different dependencies, some with their own dependencies, forming a huge tree of possible change. When the project doesn’t regularly update its dependencies, it increases its exposure to these changes. Increasing exposure to change threatens the stability of the application. The longer the time horizon, the greater the threat. If you combine these dependency exposures with those from frameworks and add in runtime/language changes, the real answer becomes continuous delivery.
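The tree of possible change can be sketched the same way. The snippet below is only an illustration: the dependency graph and its package names are made up, and the function is a plain graph walk, not the resolver of any real build tool.

```python
from typing import Dict, List, Set


def transitive_dependencies(graph: Dict[str, List[str]], root: str) -> Set[str]:
    """Collect everything reachable from `root`, direct or indirect."""
    seen: Set[str] = set()
    stack = list(graph.get(root, []))
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # shared dependencies are only counted once
        seen.add(name)
        stack.extend(graph.get(name, []))
    return seen


# Hypothetical project: two direct dependencies, each with its own.
graph = {
    "app": ["web-framework", "http-client"],
    "web-framework": ["json-lib", "logging-lib"],
    "http-client": ["json-lib", "tls-lib"],
}
direct = graph["app"]
everything = transitive_dependencies(graph, "app")
```

Even in this tiny example, two declared dependencies turn into five libraries that can each release a change you will eventually have to absorb.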

What’s the Answer?

The problem is solved through technical capabilities combined with two very prescriptive organizational behavior changes. Let’s start with the story – we should be able to update the dependencies on our application and release a new version of the application with very little operational risk. What does this look like?

First, it’s about testing, specifically automated regression testing. Remember, the story is just about updating dependencies, not adding any additional features. The first technical capability requires us to be able to pull down the code, update the dependencies and then run a set of automated tests which validate that nothing broke. It might also require a bit of integration testing, if we are truly deploying into pre-production environments prior to production. The key here is that the capability needs to work under pressure! Can you update the dependencies and validate that nothing broke with a huge security hole hanging over your head and the CIO on the conference call asking when this is going to be fixed?
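As a rough illustration of what "validate that nothing broke" can mean, here is a small regression test sketch in Python. The `serialize_order` function is hypothetical, and the standard library `json` module stands in for a third-party dependency; the point is that the test pins the observable output, so a dependency update that changes behavior turns the suite red.

```python
import json
import unittest


def serialize_order(order: dict) -> str:
    """Hypothetical application code that leans on a library for its output."""
    return json.dumps(order, sort_keys=True, separators=(",", ":"))


class SerializeOrderRegressionTest(unittest.TestCase):
    def test_output_is_unchanged(self):
        # The expected string is the behavior we shipped; it must survive updates.
        order = {"total": 19.99, "id": 42, "items": ["widget"]}
        expected = '{"id":42,"items":["widget"],"total":19.99}'
        self.assertEqual(serialize_order(order), expected)
```

A suite made of tests like this can be run with `python -m unittest` after every dependency bump, with no new features involved.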

However, our first goal is to reduce the exposure. This is where our first organizational behavior change comes into play. The old rule of “If it ain’t broke, don’t fix it” needs to change. In fact, if you are working at a company in which any deployment to production is wholly owned and approved by the business, then the first step is a conversation. The business needs to get on board with approving releases with only technical changes. This requires education in terms of explaining why we would want to do a release with just dependency updates.

This type of change would also require reestablishing trust between the business and the IT organization. The reason these types of organizational rules exist in the first place is because at some point, the business lost confidence in the software teams’ ability to deliver changes without discrete oversight. At some point, there were one too many releases where lapses in quality resulted in outages and lost business. Sometimes the need for control was completely justified due to a lack of engineering discipline. Other times, the software teams became the scapegoat for low-quality requirements or poor performance due to extenuating circumstances. Regardless, if the business is unwilling to release a new version of software which has been completely validated by organizationally accepted automated testing practices, then reestablishing that trust will be a crucial link to unlocking the value of continuous delivery.

The second organizational behavior change is rooted in team behavior. When should these dependencies be updated? If the team is working in an agile model, then the beginning of each sprint is a great time to do the maintenance. The product owner could even put an item into every sprint to perform the work. Again, the goal is to be able to clone the repository, update the dependencies, and run the regression test suite (or whatever your team calls it) that comes back with a big green checkmark that says nothing broke. Commit the change, open a pull request and mark the card complete.
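The sprint maintenance routine can be sketched as a short script. This is one possible shape, not a prescribed pipeline: the commands assume a Python project with an unpinned `requirements.txt` and a pytest suite, and `run_maintenance`, `shell_runner` and `fake_runner` are hypothetical names. The demo uses a fake runner that simulates a failing test run so nothing is actually executed.

```python
import subprocess
from typing import Callable, List, Sequence

Command = Sequence[str]
Runner = Callable[[Command], int]


def shell_runner(command: Command) -> int:
    """Run a real command and return its exit code."""
    return subprocess.run(list(command)).returncode


def run_maintenance(steps: List[Command], runner: Runner = shell_runner) -> List[Command]:
    """Run steps in order, stop at the first failure, return the ones that passed."""
    completed: List[Command] = []
    for command in steps:
        if runner(command) != 0:
            break
        completed.append(command)
    return completed


# Assumed steps: update dependencies, run the regression suite, then commit.
STEPS: List[Command] = [
    ["python", "-m", "pip", "install", "--upgrade", "-r", "requirements.txt"],
    ["python", "-m", "pytest"],
    ["git", "commit", "-am", "Update dependencies"],
]


def fake_runner(command: Command) -> int:
    """Stand-in runner for the demo: pretend the test suite fails."""
    return 1 if "pytest" in command else 0


completed = run_maintenance(STEPS, runner=fake_runner)
safe_to_commit = len(completed) == len(STEPS)
```

Because the commit sits after the test step, a red suite means the update never leaves the developer’s machine. Swapping `fake_runner` for the default `shell_runner` would run the real commands.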
