When Story Points Don’t Work

Posted by Keith McMillan

February 17, 2016

Ah, story points! The darling child of Scrum-style estimation. They have a lot to recommend them, in particular because of the many problems with using hours (ideal or otherwise) to estimate tasks. There are, however, some unspoken assumptions around using story points, which could catch the inexperienced or incautious team and cause significant problems. Do you know what’s assumed when you estimate your work using story points?

As a brief recap, a story point is a unit-less measure of the relative complexity of a story. Let’s break that down a bit. By unit-less we mean that a story point is not stated in terms of hours (the main competing unit of estimation). Its unit is “story point,” and it is only meaningful in comparison to other story points in the same backlog (hence relative). For this reason, story points can’t be used to compare one team against another, only to compare the amount of work a given team does over time, and to forecast how much work a team should reasonably be able to undertake. Story points are assigned by the people doing the work (the team), not by some outside entity (e.g. the product owner).

One of the philosophies behind story pointing is that estimates (both in hours and otherwise) are notoriously inaccurate, and so we want to spend as little time as possible arriving at a “good enough” coarse relative measurement of complexity. Once we have the coarse measurement, and a couple of sprints under our belt, we can use velocity (the number of story points the team has been able to complete in the previous few sprints) to determine how much work the team should reasonably be expected to complete. This is the empirical measure that allows us to limit scope in a sprint to what’s achievable, and to forecast future progress. There are some big assumptions baked into this apparent simplicity, and reality is sometimes not that simple…
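To make the arithmetic concrete, here is a minimal sketch of velocity as a rolling average, plus a rough forecast built on it. The function names, the three-sprint window, and the sample numbers are assumptions for illustration only; teams differ in how many sprints they average over.

```python
import math
from statistics import mean


def velocity(completed_points, window=3):
    """Average story points completed over the most recent `window` sprints.

    `completed_points` is a list of points finished per sprint, oldest first.
    The window size of 3 is an assumption, not a rule.
    """
    if not completed_points:
        raise ValueError("need at least one completed sprint")
    return mean(completed_points[-window:])


def sprints_remaining(backlog_points, completed_points, window=3):
    """Rough forecast: sprints needed to burn down the backlog at current velocity."""
    v = velocity(completed_points, window)
    if v <= 0:
        raise ValueError("velocity must be positive to forecast")
    return math.ceil(backlog_points / v)


# Hypothetical history: points completed in the last four sprints.
history = [18, 22, 20, 24]
print(velocity(history))                # 22, the mean of the last three sprints
print(sprints_remaining(100, history))  # 5
```

The forecast is only as good as the assumptions discussed in the rest of this post: it treats every point as interchangeable and every team member as able to work on any story.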

Assumption 1: Team composition

The first assumption is the composition of the team itself. It’s no secret that agile and Scrum favor a team of broadly skilled individuals, which is another way of saying that everyone ideally should be able to do any task required of the team. With this mentality, a generic “how long will the team need to do this story” works fine.

The reality of the current state of the art is that teams are composed of individuals with different skills and aptitudes, the most common split being between developers and testers. Some organizations are working to change this; many are not. If we estimate the amount of work the entire team can commit to based on velocity, but the majority of stories are only workable by a subset of the team (i.e. specialists), there’s a risk that we are overloading some specialists and leaving other team members with nothing to do. Further, if we adhere to the notion that the people doing the work estimate the work, does that mean that specialists estimate only their portion of a story, or the work of the entire team? (Hint: the second one, but are teams thinking this way?)
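One way to spot this imbalance before committing is to look at planned points per person in each skill group, rather than at the team total alone. The sketch below assumes, as a deliberate simplification, that each story can be tagged with the single skill that dominates it. The tags, headcounts, and point values are hypothetical.

```python
from collections import defaultdict


def points_per_person(stories, team):
    """Planned story points per person, grouped by skill.

    stories: list of (points, skill) tuples, one dominant skill per story
             (a simplifying assumption; real stories usually need several skills).
    team:    dict mapping skill -> number of people with that skill.
    """
    load = defaultdict(int)
    for points, skill in stories:
        load[skill] += points
    return {skill: load[skill] / count for skill, count in team.items() if count}


# Hypothetical sprint: 18 points planned for a team of four.
planned = [(5, "dev"), (8, "dev"), (3, "dev"), (2, "test")]
team = {"dev": 2, "test": 2}
print(points_per_person(planned, team))  # {'dev': 8.0, 'test': 1.0}
```

In this made-up example the sprint total may sit comfortably within velocity, yet the developers carry eight points each while the testers carry one.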

Assumption 2: Nature and quality of user stories

User stories need to be broken down into a size that allows them to be picked up and worked to completion within a single sprint. This causes us to decompose stories if they don’t fit within a sprint, and that decomposition can create problems.

While we want the user stories to involve all the “traditional” activities (requirements, analysis, design, test, deployment, user verification, etc.), sometimes that simply is not possible within a single sprint. As an example, in organizations that have a longer-running change control process (I’d argue that’s “most big companies”), it’s entirely likely that we may be able to get through development and integration testing, but not through deployment and user acceptance testing, as part of a single sprint. That leads us to decompose the story into two: one up through testing, and a second to take the functionality to production and verification. This causes teams problems when they want to compare one story against another, because the activities are dissimilar. We can ask the team to estimate the level of work required to perform each new user story, but if the stories are dissimilar the team may have difficulty arriving at a meaningful relative measurement.

Assumption 3: Work can be performed without interruption

The final assumption I want to talk about is that once we start a story, we can work on it until it completes. It’s again a useful simplification, but in many cases unrealistic. Dependencies on entities outside of the team (e.g. a vendor answering a support ticket) can cause start-stop-start patterns in working on stories. We ask a team to estimate the level of effort, not the amount of time it takes to complete a task. A one-point story with external dependencies may take an entire sprint to complete, whereas one without may be done in a day. If your backlog is populated with stories that have lots of external dependencies, this disrupts the correlation between story points and the amount of work we can do in a sprint.

So what to do?

To be clear, I think that story points are the best solution I’ve seen to estimating the stories in a Scrum backlog, and I think velocity is a good yardstick to use to gauge the amount of work a team should undertake. To paraphrase Churchill, these techniques are like democracy: they’re the worst way to do it, except for all the others we’ve tried so far.

The team should enter with eyes wide open, knowing that story points alone don’t dictate whether they can commit to sprint scope, and that velocity is a guidepost, but not an authority when it comes to the amount of work that should be considered achievable. They need to consider whether the stories can be picked up by anyone on the team, whether the story has external dependencies that are unlikely to be satisfied within the sprint, and the overall quality of the user stories before committing to a sprint backlog.
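The considerations above can be sketched as a simple pre-commitment review. This is only an illustration: the dictionary keys (`specialist_only`, `external_dependency`, `unclear`) and the sample stories are hypothetical, and the flags stand in for judgments the team makes in conversation, not for anything a tool can decide.

```python
def review_sprint(stories, velocity):
    """Return (total_points, warnings) for a candidate sprint backlog.

    stories:  list of dicts with "title" and "points", plus optional boolean
              flags "specialist_only", "external_dependency", and "unclear"
              (all hypothetical names chosen for this sketch).
    velocity: the team's recent average points per sprint, used as a guidepost.
    """
    total = sum(s["points"] for s in stories)
    warnings = []
    if total > velocity:
        warnings.append(f"planned {total} points exceeds velocity {velocity}")
    for s in stories:
        if s.get("specialist_only"):
            warnings.append(f"{s['title']}: only a specialist can work on it")
        if s.get("external_dependency"):
            warnings.append(f"{s['title']}: external dependency may stall it")
        if s.get("unclear"):
            warnings.append(f"{s['title']}: story quality is in doubt")
    return total, warnings


# Hypothetical candidate backlog.
candidate = [
    {"title": "Login page", "points": 5},
    {"title": "Vendor API fix", "points": 1, "external_dependency": True},
]
total, warnings = review_sprint(candidate, velocity=20)
print(total)     # 6
print(warnings)  # ['Vendor API fix: external dependency may stall it']
```

Note that the one-point story is the one that gets flagged: the point total is well under velocity, but the sprint can still be at risk.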

