18 June 2012

The ideal iteration length, part 1

In the Confluence development team at Atlassian, we’ve played around with the length of iterations and release cycles a fair bit. We’ve always aimed to keep them short, but just how short has varied over time, with interesting results.

The first thing you need to define when discussing iteration lengths is what constitutes an iteration. I define it as follows:

An iteration is the amount of time required to implement some improvements to the product and make them ready for a customer to use.

There are various areas of flexibility in this definition that will depend on what your team does and who the customer is. For some teams, the “customer” may be other staff inside the organisation, where you’re preparing an internal release for them each iteration. For some teams, the definition of “improvements” might need to be small enough that only a little bit of functionality is implemented each time.

In every case, an iteration has to have a deliverable, and ideally that deliverable should be a working piece of software which is complete and ready to use.

On top of the typically short “iteration cycle”, we have a longer “release cycle” for our products at Atlassian. This is to give features some time to mature through use internally, and helps us try out new ideas over a period of a few months before deciding whether something is ready to ship to our 10,000 or so customers.

Long (multi-month) iterations

When I first started at Atlassian in 2006, the release process for the team was built around a release with new features every 3–4 months. The team produced no real deliverables along the way, so in practice this was the iteration length. Occasionally, just prior to a new release, we’d prepare a beta release for customers to try out. But that was an irregular occurrence and not something we did as a matter of course.

There were a few problems with this approach:

  • the team didn’t have regular feedback on their progress
  • it was hard for internal stakeholders to see how feature development was progressing
  • features would often take longer than planned, requiring the planned release date to be pushed back.

You could say that the first two points actually led to the third: because the team and management had little idea of their overall progress, it was easy for planned release dates to slip at the last minute.

Late in 2007, we tried to address these problems by introducing regular iterations with deliverables into our process.

Two-week iterations

Here’s what our team’s development manager wrote to the company when we started building a release of our software every two weeks and deploying it to our intranet wiki, called Extranet (referred to as “EAC” in the message):

We are releasing Milestone releases to EAC every two weeks, usually on Wednesdays. This means that EAC always runs the latest hottest stuff, keeping everyone in the loop about what we are currently developing. Releasing regularly also helps the development team focussing on delivering production-ready software all the time - not just at the end of a release cycle. We aim at always providing top quality releases, and we are certainly not abusing EAC as a QA-center.

Along with this was a process for people to report issues, and some new upgrade and rollback procedures that we needed to make this feasible.

Basically, our team moved into a fairly strict two-week cycle for feature development. Every two weeks, we’d ensure all the features under development were in a stable enough state to build a “milestone” build. This milestone would be deployed to our intranet and made available to our customers via an “early access programme” (EAP).

Initially, this took a lot of work. When building features earlier, on a longer iteration cycle, we’d often be tempted to take the entire feature apart during development then put it back together over a period of months. This simply doesn’t work with two-week iterations, where the product needs to be up-and-running on a very important internal instance every two weeks.

The change was mostly one of culture, however. As we encouraged splitting up the features into smaller chunks which were achievable in two weeks, the process of building features this way became entrenched in the team. The conversations changed from “how are we going to get this ready in time?” to “what part of the feature should we build first?”

This two-weekly rhythm gave us the following benefits over a longer iteration period:

  • the team had a simple deadline to work towards: make sure your work is ready to ship every second Wednesday
  • features were available early during development for the entire organisation to try out
  • issues with new features were identified sooner by the wider testing
  • the release process was standardised and practised regularly
  • customers and plugin developers got access to our new code for testing purposes sooner
  • releases tended to land on or very close to their planned dates, with scope reduced when a given feature wasn’t going to be ready in time.

However, there also seemed to be some drawbacks with our new two-week iteration process:

  • large architectural changes struggled to get scheduled
  • large changes that couldn’t be shipped partially complete (like the backend conversion from wiki markup to XHTML) had to be done on a long-lived branch
  • the focus on short-term deliverables seemed to detract from longer term discussions like “are we actually building the right thing?”

Looking at each of these problems in detail, however, showed that none of them was directly related to the iteration length. They were problems with our development process that needed to be solved independently. The solutions to them probably deserve their own posts, so I’ll leave those topics for the moment.

As I mentioned above, there were some prerequisites for us to get to this point:

  • We needed a customer who was okay receiving changes frequently. It might take some convincing that releasing more frequently is better for them, but in the long run it really is!
  • We needed a process for communicating those changes: we published “milestone release notes” to the organisation every two weeks with the code changes.
  • We needed to standardise and document a milestone release and deployment process, which was ideally as similar as possible to the full release process, but might take a few expedient shortcuts.
  • The software had to actually be ready to go each fortnight. This might need some badgering and nagging of the dev team to get them to wrap up their tasks a few days beforehand.
  • Lastly, we needed to assign a person responsible for doing the build and getting it live every two weeks. This role rotated among the developers in the team.
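
To make the rhythm concrete, here is a minimal sketch of how a fortnightly Wednesday milestone calendar with a rotating build owner could be generated. It is only an illustration, not the tooling we used: the function names, the developer names and the start date are all made up for the example.

```python
from datetime import date, timedelta
from itertools import cycle


def first_wednesday_on_or_after(start: date) -> date:
    """Return the first Wednesday falling on or after `start`."""
    # date.weekday() counts Monday as 0, so Wednesday is 2.
    return start + timedelta(days=(2 - start.weekday()) % 7)


def milestone_schedule(start: date, developers: list[str], count: int):
    """Pair each fortnightly milestone date with the developer on build duty.

    Hypothetical helper: the rotation simply walks through `developers`
    in order and wraps around when it reaches the end.
    """
    first = first_wednesday_on_or_after(start)
    owners = cycle(developers)
    return [(first + timedelta(weeks=2 * i), next(owners)) for i in range(count)]


if __name__ == "__main__":
    # Assumed start date and placeholder names, for illustration only.
    for day, owner in milestone_schedule(date(2007, 11, 1), ["dev_a", "dev_b", "dev_c"], 6):
        print(day.isoformat(), owner)
```

Publishing a calendar like this ahead of time makes the deadline and the person responsible for each milestone visible to everyone, which is most of what the last two prerequisites are about.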

Our two-week iteration cycle served us extremely well in the Confluence team. For more than three years, we kept up this two-weekly rhythm of building milestones, shipping them to our internal wiki, and releasing selected ones externally for testing.

To be continued…

That’s it for today’s post. Next time, I’ll take a look at how we’ve attempted to decrease our iteration length further and what the results of that effort have been.

If you’d like to know when my next article is published, you can follow me (@mryall) on Twitter.