In the middle of an agile project run according to the Scrum process, if the team uses Story Points for estimation and tracks its Velocity, predicting the project completion date should be an easy task. There are, however, a few questions you still have to answer to get a date you can announce to the project sponsors. This article is about one of these questions.
Imagine you are running such a project. The team has completed a number of sprints already, so you have lots of historical data. Things have been stable for quite a while. The easiest approach to predicting the completion date would be to calculate an average velocity from the past few sprints and divide your remaining backlog size by this velocity. This calculation gives you the number of remaining sprints. Let’s assume, for example, that the average velocity is 10 Story Points (SP) per sprint and the backlog size is 100 SP. The result is 100/10 = 10, so you need 10 more sprints to deliver the whole backlog.
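The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `remaining_sprints` and the three recent velocities are made up for the example, chosen so that they average to the 10 SP used in the text.

```python
from statistics import mean


def remaining_sprints(backlog_sp, recent_velocities):
    """Point estimate: remaining backlog divided by the average velocity."""
    avg_velocity = mean(recent_velocities)
    if avg_velocity <= 0:
        raise ValueError("average velocity must be positive")
    return backlog_sp / avg_velocity


# Hypothetical velocities of the last three sprints (average = 10 SP).
print(remaining_sprints(100, [9, 11, 10]))  # 10.0
```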
Intuition, however, may suggest that your estimate should not be a single number but rather a range, e.g. 9-11 sprints or 8-12 sprints. This is in line with the Cone of Uncertainty concept, which describes how the amount of uncertainty evolves during a project. In the middle of a project you definitely still have some uncertainty left, so the estimate should be a range, not a single number.
This leads to the next challenge: how do you specify the range? There are many ways to do this, e.g.
- You can guess: e.g. I have no idea how big the range should be, so I have to assume something. I’ll apply a +/-15% factor. In our scenario you would get a range of 8.5-11.5 sprints. You apply a rounding method of your choice, and you have a range expressed in full sprints.
- You can use your company standards: e.g. I have no idea, but in my company we always apply +/-15% in such cases. The remainder of the calculation goes as above.
- You can make an educated guess: without getting into too much detail, this is also guessing.
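For illustration, the +/-15% calculation could look like the sketch below. The function name `sprint_range` is hypothetical, and rounding the lower bound down and the upper bound up is just one possible "rounding of your choice" (the widest one).

```python
import math


def sprint_range(point_estimate, margin=0.15):
    """Apply a symmetric +/- margin to a point estimate."""
    low = point_estimate * (1 - margin)
    high = point_estimate * (1 + margin)
    return low, high


low, high = sprint_range(10)  # roughly 8.5 and 11.5 sprints
# One possible rounding choice: widen the range to full sprints.
print(math.floor(low), math.ceil(high))  # 8 12
```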
All of the above methods have a few things in common. The first is that they all require arbitrary choices: why 15% and not 30%, why does my company use 15% and not 30%, etc. This may leave you less confident about the result than you would like. Estimation is uncertain in itself, so applying arbitrary factors to it may feel uncomfortable. The second is that you may be dealing with a project sponsor who would like to understand why you assumed +/-15%. Making the guess an educated one does not always help the case.
Recently I faced exactly this challenge in my programme. We had lots of data from past sprints, but we were struggling to agree on the range size with the client. After a few iterations of guessing and talking with the sponsor, we tried a different approach. We used a Monte Carlo method to predict how many sprints we were going to need. Below are the steps we took.
When scientific is better than educated
1. We collected the velocities of our past sprints:
| Historical velocity [SP] | 10.5 | 13 | 8 | 9 | 12 | 6 | 10 | 12 | 10 |
2. We generated a large number (600) of project simulations. Each simulation had 20 sprints, and each sprint velocity was randomly selected from the above past velocities. The rationale is that we assume future sprints will be similar to past sprints. The simulations looked more or less like this:
So, for example, in simulation 1 the first sprint velocity (12) was randomly selected from the sequence of past velocities (10.5, 13, 8, 9, 12, 6, 10, 12, 10). The second sprint was another random selection (9), the third yet another (10), and so on. In each simulation we could see how big a part of the backlog could be delivered in one, two, three, etc., up to twenty sprints. We could also see how many sprints each simulation would require to deliver the whole backlog.
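The simulation step can be sketched in Python roughly as follows. This is not the tool we used, just a minimal illustration of the idea: the function names, the `seed` parameter and the choice to return `None` for a simulation that does not finish are all assumptions made for the example. The velocities, the backlog of 100 SP, the 600 simulations and the 20 sprints come from the text.

```python
import random

PAST_VELOCITIES = [10.5, 13, 8, 9, 12, 6, 10, 12, 10]


def simulate_project(past, backlog_sp, max_sprints, rng):
    """Run one simulated project.

    Returns the number of sprints needed to deliver the backlog,
    or None if it is not delivered within max_sprints.
    """
    delivered = 0.0
    for sprint in range(1, max_sprints + 1):
        # Each sprint velocity is drawn at random from the past velocities.
        delivered += rng.choice(past)
        if delivered >= backlog_sp:
            return sprint
    return None


def run_simulations(past, backlog_sp, n=600, max_sprints=20, seed=None):
    """Run n independent simulated projects and collect the sprint counts."""
    rng = random.Random(seed)  # a fixed seed makes the run repeatable
    return [simulate_project(past, backlog_sp, max_sprints, rng) for _ in range(n)]


results = run_simulations(PAST_VELOCITIES, backlog_sp=100, seed=1)
print(min(results), max(results))
```

With these particular velocities every simulation finishes within 20 sprints, because even the slowest past sprint (6 SP) repeated 17 times delivers more than 100 SP.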
3. We then plotted a distribution chart showing what percentage of these simulated projects delivered the backlog of 100 Story Points in no more than 7, 8, … 14 sprints. The chart looked like the one below.
The conclusion was that over 90% of the randomly generated projects completed in no more than 11 sprints, and approx. 5% of projects completed in fewer than 9 sprints. Based on this, we assessed that, with a high level of confidence, we needed 9-11 more sprints to deliver the backlog. Had the curve been flatter, it would have required more discussion with the sponsors about their attitude to risk in order to assess the spread between the upper and lower estimates.
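The numbers behind such a chart can be computed along the lines of the sketch below. The function names are hypothetical, and the four sprint counts in the demo are invented purely to keep the output easy to check by hand. In practice the input would be the list of sprint counts produced by the simulations, all assumed to be whole numbers.

```python
from collections import Counter


def cumulative_completion(sprints_needed):
    """Map each sprint count to the percentage of simulated projects
    that delivered the backlog in no more than that many sprints."""
    counts = Counter(sprints_needed)
    total = len(sprints_needed)
    running = 0
    result = {}
    for sprints in sorted(counts):
        running += counts[sprints]
        result[sprints] = 100.0 * running / total
    return result


def sprints_for_confidence(cumulative, confidence_pct):
    """Smallest sprint count reached by at least confidence_pct of projects."""
    for sprints, pct in cumulative.items():
        if pct >= confidence_pct:
            return sprints
    return None


# Invented sprint counts from four imaginary simulations.
cumulative = cumulative_completion([9, 10, 10, 11])
print(cumulative)                              # {9: 25.0, 10: 75.0, 11: 100.0}
print(sprints_for_confidence(cumulative, 90))  # 11
```

Reading the lower and upper ends of the range off such a table is exactly the kind of choice that depends on the sponsor's attitude to risk.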
When we explained the reasoning to the project sponsor, we walked through steps 1 to 4 in exactly the same order. And guess what: we weren’t asked any questions we couldn’t answer. This helped us take our case forward and increased our sponsor’s awareness of why the project estimate should be a range and how big the range could be.
An analogous approach can be used to assess, for example, how big a backlog can be delivered in a given number of sprints, in case time rather than scope is the constraint of your project. Care should be taken when you have a long list of known risks in your project, as the method described does not take them into account. However, the method’s weakest assumption is that the velocities you will get in future sprints will be similar to the ones you have had in the project so far. In other words, it assumes that the future will look similar to the past.
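For the time-constrained variant, a sketch under the same assumptions might look like this. We did not run this variant ourselves. The function names, the 10-sprint horizon and the 90% confidence level are illustrative choices, and only the past velocities come from the article.

```python
import random

PAST_VELOCITIES = [10.5, 13, 8, 9, 12, 6, 10, 12, 10]


def simulate_delivered(past, sprints, n=600, seed=None):
    """Total Story Points delivered in a fixed number of sprints,
    for n simulated projects, sorted ascending."""
    rng = random.Random(seed)
    return sorted(
        sum(rng.choice(past) for _ in range(sprints)) for _ in range(n)
    )


def backlog_at_confidence(sorted_totals, confidence=0.9):
    """Backlog size that at least `confidence` of the simulated
    projects managed to deliver (a conservative lower bound)."""
    # Skip at most the weakest (1 - confidence) share of simulations.
    index = int(len(sorted_totals) * (1 - confidence))
    return sorted_totals[index]


totals = simulate_delivered(PAST_VELOCITIES, sprints=10, seed=1)
print(backlog_at_confidence(totals, confidence=0.9))
```

The same caveat applies here: the result is only as good as the assumption that future sprints will resemble past ones.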