This is a guest post from Cameron Pavey, draft.dev.
Software development teams are always looking for ways to move faster and deliver more value in less time. However, one common pitfall for many teams is spending far too much time and effort building something, only to encounter major issues late in the project’s lifecycle.
The fail-fast strategy addresses this problem by “shifting left” on potential points of failure and embracing them as part of a virtuous feedback loop. Whether in project management or software development, a failure is often just a signal that something needs to change. The earlier you can detect this signal, the sooner you can adjust for it, thus mitigating the risk of wasted work.
To properly implement a fail-fast strategy, you need a tool that supports this way of working. TeamCity is a CI/CD solution that complements the fail-fast strategy and has several features to help you implement it.
In this guide, you’ll learn more about the fail-fast strategy and how you can use TeamCity to adopt it yourself.
What is the fail-fast strategy?
In software development, the fail-fast strategy emphasizes iterative discovery over strict planning. Key principles include:
- Rapid feedback loops: Short cycles allow for frequent releases and early feedback, helping teams move quickly and efficiently.
- Proactive risk management: Early identification of risks allows for timely mitigation or pivoting, reducing wasted effort.
- Iterative experimentation: Frequent experimentation helps quickly identify dead ends and promising solutions, optimizing the use of resources.
- Transparency: Promotes a culture of continuous improvement, where teams share and learn openly.
This strategy operates on multiple levels:
- Project management: Focuses on direction, ideation, and finding product-market fit.
- Software development: Applies the same principles to concrete practices like continuous integration (CI).
Pairing the fail-fast mentality with a CI service like TeamCity helps shorten feedback cycles and improve outcomes by rapidly identifying and addressing issues.
Major advantages of the fail-fast strategy
Compared to traditional methods like the waterfall strategy, the fail-fast strategy offers significant benefits:
- Shorter feedback loop: Submit small, incremental changes for automated checks, catching faults early and fixing them quickly.
- Identifying pitfalls early on: Incremental builds help identify technical issues early, allowing for timely adjustments and de-risking your approach upfront.
Potential drawbacks of the fail-fast strategy
While the fail-fast strategy offers many benefits, there are potential drawbacks:
- Mindset shift: Teams used to traditional methods may resist the concept of “failure”, impacting morale and causing frustration.
- Embracing failure: Understanding that failure, discovery, and pivoting are key elements is crucial for the strategy’s success.
- Tooling requirements: Effective implementation requires robust tools, particularly a CI/CD tool like TeamCity.
How does TeamCity support the fail-fast strategy?
As the central goal of the fail-fast strategy is shorter feedback cycles, you need a tool that provides this feedback for you in a timely manner. Continuous integration tools, such as TeamCity, are designed for this purpose. They can run builds, tests, and other scripted workflows in response to code changes.
Real-time reporting
Large software projects can have long, slow-running builds. This makes it especially frustrating when you wait a long time for a build to run, only to discover that it failed. Whether it’s a blocking build issue that stops subsequent steps or an early test failure that makes the rest of the run redundant – the sooner you discover these issues, the better.
TeamCity mitigates this issue by providing real-time reporting. You can view in-progress builds and inspect their current state and build logs, so you can identify problems as they occur rather than waiting until the end of a CI run.
By seeing the live status of running builds, developers can identify and fix problems without waiting for the build to finish. When the build fails, you can see what went wrong, fix it, and run another build. Compared to other CI systems where you need to wait for the build to complete, this workflow offers a shorter feedback cycle that works well with the fail-fast strategy.
This approach is particularly effective if quick-running build steps provide high-value feedback, like static analysis and other code quality checks. You can view the progress of these live to determine if you need to make any alterations to your code, and once these pass, you can leave the rest of the build to run.
Build configurations
Your build process will likely be nontrivial when dealing with complex software systems. You’ll need to do any number of steps, including:
- Building Docker images
- Compiling source code
- Downloading dependencies
- Running various kinds of tests (with varying degrees of cost)
As your build grows over time, it could become too large to reasonably manage. At this point, you may want to find a way to split the build into smaller steps.
TeamCity solves this through build configurations. Build configurations allow you to split your build into discrete steps. When you do this, each step has a clearly defined responsibility, limiting the potential for complexity to leak between steps as your system grows.
Build chains
Another helpful feature of TeamCity that can be used to implement the fail-fast strategy is build chains. This feature allows you to declare build configurations as dependent on one another. In practice, this means you can run all your quality gates before the deployment step, allowing you to skip the deployment if there are quality issues that prevent it from being a release candidate.
Failures in the earlier steps in the chain will stop subsequent build configurations from running. This can save time and resources and help shorten the feedback loop even more by avoiding the effort spent on faulty builds.
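To make the chain behavior concrete, here is a minimal Python sketch of the idea. It is not TeamCity's API or configuration format. The `Stage` class, the `run_chain` function, and the stage names are all hypothetical, and each stage is simulated with a function that returns True on success.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    """One link in a hypothetical build chain (not a TeamCity type)."""
    name: str
    run: Callable[[], bool]  # returns True when the stage succeeds


def run_chain(stages: List[Stage]) -> List[Tuple[str, str]]:
    """Run stages in order; after the first failure, skip everything downstream."""
    results = []
    failed = False
    for stage in stages:
        if failed:
            # An upstream stage failed, so this stage never starts.
            results.append((stage.name, "skipped"))
            continue
        ok = stage.run()
        results.append((stage.name, "passed" if ok else "failed"))
        failed = not ok
    return results


# Quality gates come first; deployment is the last link in the chain.
chain = [
    Stage("lint", lambda: True),
    Stage("unit-tests", lambda: False),  # simulated failure
    Stage("integration-tests", lambda: True),
    Stage("deploy", lambda: True),
]

for name, status in run_chain(chain):
    print(f"{name}: {status}")
```

In this sketch, the simulated unit test failure means the integration tests and the deployment are never started, which is the time and resource saving described above.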
Test reports
Failing tests are a fact of life for software developers. The key factor that separates frustrating failures from helpful failures is how much information you have when trying to fix them. Ideally, you want to know:
- What failed: Was it a unit test, an integration test, etc.?
- When it failed: Is it a new failure? Is this the first time it has happened? Is it a recurring flaky test?
- Why it failed: Is it a legitimate functional failure or a flaky test?
This information helps you narrow down the cause and promptly fix the issue. You could run the test on your development machine and see it fail for yourself, but you’ll likely miss out on a lot of context (for example, which commit first introduced the failure).
TeamCity test reports solve this problem. Every time your CI workflow runs, test data is captured for a wide range of testing frameworks. This data is then presented to you in relation to the CI runs that have experienced test failures, as well as a few other views like Current Problems and Flaky Tests.
These reports provide immediate insights into what problems your build is facing and the nature of the problems, such as flaky tests, newly introduced issues, or long-standing failures.
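As a rough illustration of how a failure can be categorized from its history, the Python sketch below labels a test from a list of past results. This is not how TeamCity detects flaky tests. The `classify` function, the labels, and the flip threshold are assumptions made up for this example.

```python
from typing import List


def classify(history: List[bool]) -> str:
    """Label a test from its pass/fail history, oldest run first.

    True means the run passed. The flip threshold is illustrative only.
    """
    if not history:
        return "unknown"
    if history[-1]:
        return "passing"
    # Count how often the result changed between consecutive runs.
    flips = sum(1 for a, b in zip(history, history[1:]) if a != b)
    if flips >= 3:
        return "flaky"
    if len(history) == 1 or history[-2]:
        # The run before this one passed (or there was none), so this is new.
        return "new failure"
    return "long-standing failure"


print(classify([True, True, True, False]))
print(classify([True, False, True, False, True, False]))
print(classify([True, False, False, False]))
```

A real system would likely also take into account whether the code changed between runs, since a test that flips on unchanged code is a much stronger flakiness signal than one that flips across commits.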
High-quality test reports are a must for projects that require heavy use of automated testing at any level, but they’re especially helpful if you have broad coverage from unit tests and integration tests.
This way, at a glance, you have a comprehensive snapshot of the state of your code base each time CI runs.
Notifications
Failing fast only helps if you know about the failure, and detailed build information is only helpful if people know it’s there. Developers would typically prefer to work on things themselves rather than sit and watch something being built by CI.
Thanks to highly configurable notifications, there’s no need to babysit builds. In TeamCity, you can configure rules to determine what you would like to be notified about and where you’d like those notifications to go. From email and browser notifications to Slack and even in-IDE notifications, there are several channels to choose from.
Notifications are a key requirement in a system that’s intended to help you work more proactively, and the more configuration options you have at your disposal, the more use cases you will be able to satisfy.
For a fail-fast workflow, you might want to configure notifications for any failed build to which you’ve contributed. Then, relying on the VCS integration, you can get rapid feedback on your changes directly in your IDE as you make small, atomic changes.
TeamCity notifications can be configured to only notify you of “Builds containing my changes” or when “The first build error occurs”. These settings are great for fine-tuning the notifications you see. Rather than seeing every failing build, you might only want to see failures on builds that include your changes or when the first error happens for a build.
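The effect of combining these two filters can be sketched in a few lines of Python. This is only a model of the filtering logic, not TeamCity's notification API. `BuildEvent`, `notifications_for`, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class BuildEvent:
    """A hypothetical event reported by a running build."""
    build_id: int
    authors: FrozenSet[str]  # people whose changes are in this build
    is_error: bool


def notifications_for(user: str, events: List[BuildEvent]) -> List[int]:
    """Return build IDs to notify `user` about.

    A build qualifies once, at its first error, and only if it
    contains changes made by `user`.
    """
    notified = set()
    result = []
    for event in events:
        if not event.is_error:
            continue  # progress events never trigger a notification
        if user not in event.authors:
            continue  # the build does not contain the user's changes
        if event.build_id in notified:
            continue  # only the first error per build matters
        notified.add(event.build_id)
        result.append(event.build_id)
    return result


events = [
    BuildEvent(1, frozenset({"alice"}), False),
    BuildEvent(1, frozenset({"alice"}), True),
    BuildEvent(1, frozenset({"alice"}), True),  # repeat error, ignored
    BuildEvent(2, frozenset({"bob"}), True),  # not alice's build
    BuildEvent(3, frozenset({"alice", "bob"}), True),
]

print(notifications_for("alice", events))
```

Filtering this way keeps the signal high: each developer hears about a failure as early as possible, but only once per build and only when their own changes are involved.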
If you set up continuous deployment through TeamCity, you can also enable notifications to inform you whenever a deployment or infrastructure change (through infrastructure as code tools like Terraform or Kubernetes deployments) occurs. In this case, you’d likely want to be notified in case of success as well as failure.
Artifacts
Issues often arise in CI that you cannot replicate locally. This can lead to a lot of wasted time as you try to determine what differs between the CI run and your local environment. Using artifacts can help address this problem.
In TeamCity, artifacts are typically anything produced by your build, such as binaries, logs, recordings, screenshots, etc. You can treat pretty much anything as an artifact, which gives you great flexibility in how you use this feature. Artifacts are then captured by TeamCity and are available for download through the UI after the build.
This can greatly streamline the analysis and debugging process that you go through when trying to rectify a failing build. For example, if you have end-to-end (E2E) tests that only seem to fail in your CI runs, there’s a good chance that your E2E tool has the ability to produce screenshots and screen recordings when failures occur. Being able to capture these as artifacts gives you a trove of data to help with debugging.
Artifacts can also be used for any other use case where you want to capture the output of a build. Perhaps your project produces binary executables. In that case, you could capture the built binary for each CI run, allowing you to test any build for any commit that runs through your CI workflow.
Wrapping up
This guide introduced the fail-fast strategy, including its benefits, such as proactive risk management, transparency, and adaptability. TeamCity supports fail-fast through various powerful features, including real-time reporting, flexible notifications, and detailed test reports.
When utilized properly, the fail-fast strategy can be a powerful tool. It can help you move faster and deliver value without the constraints of slower, more traditional ways of working. However, the process needs to be supported by suitably powerful and flexible tools. If you’re looking for a CI/CD server that fits the bill, consider taking TeamCity for a spin today.