Reducing the Impact of Broken Builds

by Jason Sankey on Tuesday, 05 December 2006

Introduction

The "build" is the current status of any software project, and as such reflects the health, vitality and progress of the project. Defined simply, the build is the result of checking out, compiling and testing the latest source code for the project. The build is broken when this process fails, due to failed integration, compilation or testing. The impact of a broken build is far reaching, affecting everything from high level planning to day-to-day development.

In this article we first review some of the impacts of a broken build. Those already familiar with the negative impacts of broken builds may wish to skip this lead-in. There follows an analysis of various techniques to reduce the frequency and impact of broken builds. These techniques vary from the optimistic to the preventative, and from lightweight to quite intensive.

Impacts of a Broken Build

Day-to-Day Development Bottlenecks

Broken builds directly hinder the ability of a software developer to work on the project. When the build is broken, a developer cannot reliably test their current work. In the most severe case, where the build does not even compile, a developer is all but paralysed. To progress any further, the developer must stop what they are doing to debug the build itself. Often several team members will be blocked in this way simultaneously.

Long Checkin/Update Cycles

A developer used to working in an environment where the build is frequently broken will often seek to isolate themselves from the problem. To do so, they will delay updating to the latest source code for as long as possible, so they can work without interruption. This in itself leads to integration problems: the longer the period between updates, the further the developer diverges from the main code line. When the time comes to update, the developer can be stumped for days sorting out integration issues.

Broken Window Syndrome

When the build is broken, it can be hard or impossible to tell when a new change degrades it even further. Without a means to recognise the degradation, developers will unknowingly submit broken changes. Even worse, faced with this situation, some developers may stop caring whether their changes work. With no accountability for degrading the build, there is less motivation to test properly.

Planning Problems

When the build is broken, the project manager has no realistic idea of the project's progress. Although developers may report good progress on individual tasks, there is no way to know how well the final product is coming together. Failure to recognise that a software product is more than the sum of its features leads to "Integration Hell" at the end of the project.

Low Morale

At a higher level, the combination of the above factors leads to a drop in team morale. Working under the shadow of a broken build is difficult and frustrating. Without the ability to measure progress, team members may feel they are getting nowhere fast. This can lead to a severe downward spiral in productivity.

Reducing the Impact

Continuous Integration

Continuous integration is the practice of automatically building and testing the latest copy of your project source code many times a day (often after every new change). The key thing this gives you in the fight against broken builds is early feedback. As soon as the build is broken, your team knows about it. Couple this with a policy of immediately repairing broken builds, and your build will be healthy the vast majority of the time. Note also that a build is typically easier to fix just after it is broken, as the cause of the failure is easier to identify and fresh in the mind.
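The feedback loop described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the function name integrate, the integer revision numbers and the build callback are all hypothetical, and a real server would check out and build actual source rather than call a function.

```python
from typing import Callable, Iterable, List, Tuple


def integrate(revisions: Iterable[int],
              build: Callable[[int], bool]) -> List[Tuple[int, str]]:
    """Build each revision in order and report every change in build status."""
    notifications: List[Tuple[int, str]] = []
    healthy = True  # assume the build was passing before the first revision
    for revision in revisions:
        passed = build(revision)
        if healthy and not passed:
            notifications.append((revision, "broken"))
        elif passed and not healthy:
            notifications.append((revision, "fixed"))
        healthy = passed
    return notifications


# Hypothetical history: revision 12 breaks the build and revision 14 repairs it.
broken_revisions = {12, 13}
events = integrate(range(10, 16), lambda rev: rev not in broken_revisions)
print(events)  # [(12, 'broken'), (14, 'fixed')]
```

Because every revision is built, the first notification names the exact revision that broke the build, which is what makes the immediate repair practical.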

Continuous integration is the key practice for fighting broken builds. The practices listed below are best used in conjunction with this fast-feedback system.

The Safe Update

This technique can be used to reduce the impact of a temporary build breakage on day-to-day development. It also alleviates the fear of updating that can lead to long update/checkin cycles.

With continuous integration in place, you can readily identify the status of your build over time. In particular, you can identify the last time your build passed, and the corresponding revision of source code. This is the "last known good" revision of your source code, and is a safe point for developers to update to when synchronising their local development workspace. Using a capable continuous integration server you should be able to automate tagging of the last known good revision. Then developers can safely synchronise by updating to that tag.
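As a minimal sketch of how the last known good revision might be picked out of a build history: the BuildResult record and the last_known_good function below are hypothetical names used for illustration, not part of any particular continuous integration server, and revisions are assumed to be increasing integers.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class BuildResult:
    revision: int  # source control revision that was built
    passed: bool   # True if checkout, compile and tests all succeeded


def last_known_good(history: Iterable[BuildResult]) -> Optional[int]:
    """Return the highest revision whose build passed, or None if none did."""
    good = [result.revision for result in history if result.passed]
    return max(good) if good else None


# Hypothetical history: the two most recent revisions broke the build.
history = [
    BuildResult(101, True),
    BuildResult(102, True),
    BuildResult(103, False),
    BuildResult(104, False),
]
print(last_known_good(history))  # 102
```

A server that automates tagging would move the tag to this revision after each successful build, so developers update to the tag rather than to the latest revision.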

The best attributes of this technique are that it is highly automatable and very lightweight. It has little impact on the ways developers work. The drawbacks are that it is not in any way preventative (the latest source can still be broken), and developers may not be able to check in their changes until the build is fixed.

Developer Branches

This technique may be used to effectively control the impact of one developer's changes on all other developers. It leverages features of source control servers to more carefully control the integration of changes.

Even with a continuous integration system in place, the build can be broken. New changes are first checked in to source control, and then tested. These untested changes within the shared source can negatively impact the team. One way to ensure this does not happen is to isolate developers from one another. Instead of sharing a single code line, each developer works on their own source code branch. When a developer makes a change, they first check it in to their branch. Here it can be tested by the continuous integration server. If all goes well, the change can be merged down to a shared development trunk. The change only meets the shared trunk after it has been tested, largely preventing the possibility of a broken build. To update, the developer first merges up from the shared trunk, then performs a normal update.
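The merge up, test, merge down cycle can be modelled with a toy in-memory example. Everything here is an assumption made for illustration: the DeveloperBranch class is hypothetical, changes are plain strings, and the build is a callback. A real setup would rely on the branching and merging features of the source control server.

```python
from typing import Callable, List

Change = str
BuildFn = Callable[[List[Change]], bool]


class DeveloperBranch:
    def __init__(self, trunk: List[Change]) -> None:
        self.trunk = trunk          # shared code line, only modified on merge down
        self.changes = list(trunk)  # private copy: the developer's own branch

    def merge_up(self) -> None:
        """Bring in trunk changes that the branch does not have yet."""
        for change in self.trunk:
            if change not in self.changes:
                self.changes.append(change)

    def commit(self, change: Change) -> None:
        """Check a change in to the developer's branch only."""
        self.changes.append(change)

    def merge_down(self, build_passes: BuildFn) -> bool:
        """Merge to the trunk only if the branch builds; return True if merged."""
        self.merge_up()
        if not build_passes(self.changes):
            return False
        for change in self.changes:
            if change not in self.trunk:
                self.trunk.append(change)
        return True


trunk = ["base"]
alice = DeveloperBranch(trunk)
alice.commit("broken-change")
merged = alice.merge_down(lambda changes: "broken-change" not in changes)
print(merged, trunk)  # False ['base']
```

The failing change stays on the developer's branch, so the shared trunk is unaffected until the branch builds cleanly.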

The advantages of this system are tight control of change integration (given a capable source control server), and the ability to prevent broken builds by testing before code is shared. The disadvantage is that this system is heavyweight. It requires a change to the normal working pattern, and requires strong merging support from the source control server. It is worth noting that for very large teams, this technique can be useful to split development up into manageable groups. In this case small teams are given their own branch (not individual developers), and the larger team shares a single trunk.

Pre-Commit Verification

This technique can be used to strictly enforce a working build before a change reaches the shared code line.

Many source control servers support the notion of a pre-commit trigger: a special command that runs before a change is committed. These commands usually have access to the new change and are able to veto the commit if desired. In a continuous integration context, a pre-commit trigger could be used to force a build and test before a change is accepted. If the build fails, the change is rejected and it is up to the developer to fix the problem. In this way the shared source code is kept free of breaking changes.
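A trigger of this kind might look roughly like the sketch below. It assumes, as is common but not universal, that the source control server treats a non-zero exit status from the trigger as a veto. The verify_commit function and the build command passed to it are hypothetical, and how the trigger gets access to the pending change depends on the server.

```python
import subprocess
import sys
from typing import Sequence


def verify_commit(build_command: Sequence[str]) -> int:
    """Run the build; return 0 to accept the commit or 1 to veto it."""
    result = subprocess.run(build_command)
    if result.returncode != 0:
        print("Commit rejected: the build failed.", file=sys.stderr)
        return 1
    return 0


# Stand-in build commands for illustration: one succeeds, one fails.
accepted = verify_commit([sys.executable, "-c", "pass"])
rejected = verify_commit([sys.executable, "-c", "raise SystemExit(3)"])
print(accepted, rejected)  # 0 1
```

A real trigger script would finish by passing the returned value to sys.exit so the server sees it as the exit status. Note that the commit waits for the whole build to finish, which is the bottleneck discussed in this section.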

The advantages of this technique are the ability to prevent broken builds and strict enforcement. The primary disadvantage is the difficulty in implementation without affecting productivity in other ways. Forcing tests in a trigger can be a bottleneck for commits: while a build is running for one commit other commits will be blocked. Further, the strictness can itself be too heavy-handed.

Personal Builds

This final technique attempts to capture the best of all worlds by providing a way to prevent broken builds without a heavyweight process or other bottlenecks.

Developer branches have the appealing aspect of being able to test a change before it is integrated into the shared code line. However, this process is heavyweight as it still relies on the code being checked in before testing, requiring the use of branches. This can be avoided if there is a way to submit a change directly to the continuous integration server for testing before it is committed. We call this technique personal builds, as it enables developers to test their own changes. If a personal build is successful, the developer can check in confident that their code has been tested. If it fails, the developer is free to fix the problem and re-test before committing.
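The idea can be illustrated with a small model in which the source tree is a dictionary of file contents. This is a sketch under stated assumptions, not a description of how any particular server implements personal builds: the personal_build function, the Source type and the compiles check are all hypothetical.

```python
from typing import Callable, Dict

Source = Dict[str, str]  # file path -> file contents


def personal_build(latest: Source, uncommitted: Source,
                   build: Callable[[Source], bool]) -> bool:
    """Build the latest source with the developer's uncommitted changes overlaid.

    The shared source (latest) is never modified, so a failing change
    cannot affect anyone else.
    """
    candidate = dict(latest)       # scratch copy, standing in for a clean checkout
    candidate.update(uncommitted)  # overlay the developer's working copy changes
    return build(candidate)


def compiles(source: Source) -> bool:
    """Stand-in build step: check that every file is valid Python."""
    try:
        for path, text in source.items():
            compile(text, path, "exec")
    except SyntaxError:
        return False
    return True


latest = {"app.py": "print('hello')"}
bad_change = {"app.py": "print('hello'"}  # missing closing parenthesis
print(personal_build(latest, bad_change, compiles))  # False
```

Since nothing is committed until the developer chooses to check in, a failed personal build costs only the time taken to run it.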

The advantage of this technique is the ability to prevent broken builds without enforcing a new process. Developers work as normal, but at any time can call upon the continuous integration server to test their changes for them. The developers also have the freedom to decide when a personal build is appropriate.

Conclusion

It is quite clear that a broken build is bad news for software development productivity. However, the practice of continuous integration, along with some advanced techniques, can reduce the impact significantly. In reality, broken builds will still happen, but it is possible to reduce their frequency and ensure that very few team members are affected.

Out of the advanced techniques described, there is no one-size-fits-all solution. Our personal preference is a lightweight combination of safe updates, personal builds and a bit of common sense. We find this allows us to work the way we prefer while at the same time greatly reducing the frequency and impact of broken builds. Other situations, however, may call for the tighter control of developer branches or pre-commit verification. In all cases, we believe continuous integration is vitally important to maintaining productivity throughout the development life cycle.

Shameless Plug

If the techniques described in this article interest you, you might want to give our continuous integration server, Pulse, a try. It supports the lightweight techniques we favour (personal builds are new in Pulse 1.2), and can also be used in conjunction with other techniques.
