When efficiency hurts more than it helps

When we imagine how to use a resource effectively – be that resource a development team, a CPU core, or a port-a-potty – our thoughts usually turn to efficiency. Ideally, the resource gets used at 100% of its capacity: we have enough capacity to serve our needs without generating queues, but not so much that we’re wasting money on idle resources. In practice there are spikes and lulls in traffic, so we should provision enough capacity to handle those spikes when they arrive, but we should always try to minimize the amount of capacity that’s sitting idle.

Except what I just said is bullshit.

In the early chapters of Donald G. Reinertsen’s brain-curdlingly rich Principles of Product Development Flow, I learned a very important and counterintuitive lesson about queueing theory that puts the lie to this naïve aspiration to efficiency-above-all-else. I want to share it with you, because once you understand it you will see the consequences everywhere.

Queueing theory?

Queueing theory is an unreasonably effective discipline that deals with systems in which tasks take time to get processed, and if there are no processors available then a task has to wait its turn in a queue. Sound familiar? That’s because queueing theory can be used to study basically anything.

In its easiest-to-consume form, queueing theory tells us about average quantities in the steady state of a queueing system. Suppose you’re managing a small supermarket with 3 checkout lines. Customers take different, unpredictable amounts of time to finish their shopping. So they arrive at the checkout line at different intervals. We call the interval between two customers reaching the checkout line the arrival interval.

And customers also take different, unpredictable amounts of time to get checked out. The time it takes from when the cashier scans a customer’s first item to when they finish checking that customer out is called the processing time.

Each of these quantities has some variability in it and can’t be predicted in advance for a particular customer. But you can empirically determine the probability distribution of these quantities:

[Figure: the distributions of arrival intervals and processing times]
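If you logged when each customer reached a register and how long each checkout took, you could estimate both distributions directly from the data. Here's a minimal sketch in Python; the numbers are invented for illustration.

```python
import statistics

# Hypothetical observations, in seconds (numbers invented for illustration):
# when each customer reached a checkout line, and how long their checkout took.
arrival_times = [0, 42, 95, 110, 180, 260, 301, 355]
processing_times = [60, 75, 50, 90, 65, 80, 55, 70]

# Arrival intervals are the gaps between successive arrivals.
arrival_intervals = [b - a for a, b in zip(arrival_times, arrival_times[1:])]

print("mean arrival interval:", statistics.mean(arrival_intervals), "s")
print("mean processing time: ", statistics.mean(processing_times), "s")
# With real logs you'd histogram these to see the full shape of each
# distribution, not just the averages.
```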

Given just the information we’ve stated so far, queueing theory can answer a lot of questions about your supermarket. Questions like:

  • How long on average will a customer have to wait to check out?
  • What proportion of customers will arrive at the checkout counter without having to wait in line?
  • Can you get away with pulling an employee off one of the registers to go stock shelves? And if you do that, how will you know when you need to re-staff that register?

These sorts of questions are super important in all sorts of systems, and queueing theory provides a shockingly generalizable framework for answering them. Here’s an important theme that shows up in a huge variety of queueing systems:

The closer you get to full capacity utilization, the longer your queues get. If you’re using 100% of capacity all the time, your queues grow to infinity.
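How sharply this bites is easiest to see in the simplest textbook model, the M/M/1 queue: one processor, "memoryless" random arrivals and service times. Its steady-state average wait is proportional to 1/(1 − utilization), which blows up as utilization approaches 100%. A quick sketch, assuming a service rate of one task per minute:

```python
# Average time a task waits in line in an M/M/1 queue, as a function of how
# close the system runs to full utilization. service_rate is tasks per minute.
def mm1_avg_wait(arrival_rate, service_rate=1.0):
    rho = arrival_rate / service_rate           # utilization
    assert rho < 1, "at 100% utilization (or beyond), the queue grows without bound"
    return rho / (service_rate - arrival_rate)  # mean wait in queue, in minutes

for utilization in [0.5, 0.8, 0.9, 0.99, 0.999]:
    print(f"{utilization:6.1%} busy -> average wait {mm1_avg_wait(utilization):7.1f} min")
# 50.0% busy -> average wait 1.0 min;  99.9% busy -> average wait 999.0 min
```

Real supermarkets and teams aren't M/M/1 queues, but roughly the same hockey-stick shape shows up across a huge variety of queueing systems.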

This is counterintuitive but absolutely true, so let’s think through it.

What happens when you have no idle capacity

What the hell? Isn’t using capacity efficiently how you’re supposed to get rid of queues? Well yes, but it doesn’t work if you do it all the time. You need some buffer capacity.

Let’s think about a generic queueing system with 5 processors. This system’s manager is all about efficiency, so the system operates at 100% capacity all the time. No idle time. That’s ideal, right?

[Figure: fullcap-0, five processors all busy, nothing waiting]

Sure, okay, now what happens when a task gets completed? If we want to make sure we’re always operating at 100% capacity, then there needs to be a task waiting behind that one. Otherwise we’d end up with an idle processor. So our queueing system must look more like this:

[Figure: fullcap-1, every processor busy with a task queued behind it]

In order to operate at 100% capacity all the time, we need to have at least as many tasks queued as there are processors. But wait! That means that when another new task arrives, it has to get in line behind those other tasks in the queue! Here’s what our system might look like a little while later:

[Figure: fullcap-2, the same system a little later, with longer queues]

Some queues may be longer than others, but no queue is ever empty. And to keep them from ever emptying, tasks have to keep arriving at least as fast as they get completed, so the backlog never gets a chance to drain: the total number of queued items grows without limit. Eventually our system will look like this:

[Figure: fullcap-3, queues grown much longer still]

If you don’t quite believe it, I don’t blame you. Go back through the logic and convince yourself. It took me a while to absorb the idea too.
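If you'd rather let a simulation do the convincing, here's a rough sketch of the five-processor system above. It feeds a single shared queue with exponentially distributed arrival gaps and service times (my assumption; the argument doesn't hinge on the exact distributions) and compares 80%, 95%, and 100% utilization. At the lower utilizations the average wait settles down; at 100% it just keeps climbing the longer the simulation runs.

```python
import random

def simulate_waits(num_processors=5, utilization=1.0, num_tasks=200_000, seed=1):
    """Single shared queue feeding num_processors identical processors.
    Arrival gaps and service times are exponential (an assumption)."""
    random.seed(seed)
    service_rate = 1.0                           # tasks per unit time, per processor
    arrival_rate = utilization * num_processors  # keep the processors this busy on average

    now = 0.0
    free_at = [0.0] * num_processors             # when each processor next comes free
    waits = []
    for _ in range(num_tasks):
        now += random.expovariate(arrival_rate)           # next task arrives
        i = min(range(num_processors), key=lambda k: free_at[k])
        start = max(now, free_at[i])                      # wait if nobody is free yet
        waits.append(start - now)
        free_at[i] = start + random.expovariate(service_rate)
    return waits

for util in (0.8, 0.95, 1.0):
    waits = simulate_waits(utilization=util)
    early = sum(waits[:10_000]) / 10_000
    late = sum(waits[-10_000:]) / 10_000
    print(f"{util:4.0%} utilization: avg wait, first 10k tasks = {early:6.2f}, last 10k = {late:6.2f}")
```

Re-run it with a different seed or more tasks and the 100% line only gets worse, which is the "grows to infinity" claim in simulation form.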

What this means for teams

You can think of a team as a queueing system. Tasks arrive in your queue at random intervals, and they take unpredictable amounts of time to complete. Each member of the team is a processor, and when everybody’s working as hard as they can, the system is at 100% capacity.

That’s what a Taylorist manager would want: everybody working as hard as they can, all the time, with no waste of capacity. But as we’ve seen, in any system with variability, that comes at the cost of ever-growing queues. The closer you get to full capacity utilization, the faster your queues grow. The longer your queues are, the longer the average task waits in the queue before getting done. It gets bad real fast:

[Figure: average wait time rising steeply as utilization approaches 100%]

So there are very serious costs to pushing your capacity too hard for too long:

  • Your queues get longer, which itself is demotivating. People are less effective when they don’t feel that their work is making a difference (see The Progress Principle).
  • The average wait time between a task arriving and getting done rises linearly with queue length (see the sketch after this list). With long wait times, you hemorrhage value: you commit time and energy to ideas that might not be relevant anymore by the time you get around to them (again: read the crap out of Principles of Product Development Flow).
  • Since you’re already operating at or near full capacity, you can’t even deploy extra capacity to knock those queues down: it becomes basically impossible to ever get rid of them.
  • The increased wait time in your ticket queue creates long feedback times, nullifying the benefit of agile techniques.
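That linear relationship is Little's Law in action: average items in the system equals throughput times average time in the system, so at a fixed completion rate, every extra item in the queue adds a fixed chunk of waiting. A toy illustration with made-up numbers:

```python
# Little's Law: avg items in system = throughput x avg time in system,
# so avg wait = queue length / throughput at a fixed completion rate.
throughput = 10  # tickets the team closes per week (hypothetical)
for backlog in (5, 20, 80):
    print(f"{backlog:3d} tickets queued -> a new ticket waits about {backlog / throughput:.1f} weeks")
```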

Efficiency isn’t the holy grail

Any queueing system operating at full capacity is gonna build up giant queues. That includes your team. What should you do about it?

Just by being aware that this relationship exists, you can gain a lot of intuition about team dynamics. What I’m taking away from it is this: There’s a tradeoff between how fast your team gets planned work done and how long it takes your team to get around to tasks. This changes the way I think about side projects, and makes me want to find the sweet spot. Let me know what you take away from it.

Calculating the expectation value of knowing Jeopardy! answers

Jeopardy players all know the most common response in the history of the show: “What is Australia?” It’s appeared over 220 times.

But, if you’re looking for the best topics to study as a potential Jeopardy contestant, the number of times a clue has appeared isn’t necessarily what you’re interested in. Suppose clues with the response “What is Australia?” are only ever worth $200, and clues with the response “What is Burundi?” are only ever worth $2000. In that case, as long as the probability of Burundi appearing in a given show is more than 1/10 the probability of Australia appearing, you’re better off, moneywise, brushing up on your Burundi facts.

This is the utility of the expectation value. It’s the probability of a given event multiplied by the value of that event happening. Here’s another way to think of it: if you played a chance game a million times (in our example, this would mean playing a million rounds of Jeopardy), the expectation value of betting on a given outcome (in our example, of studying a given country) is the average amount of money you’d win.
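To make that concrete, plug some invented numbers into the comparison above: say Australia shows up in 30% of games, always for $200, while Burundi shows up in 5% of games, always for $2000.

```python
# Invented probabilities and values, just to illustrate the comparison above.
p_australia, value_australia = 0.30, 200   # appears in 30% of games, worth $200
p_burundi, value_burundi = 0.05, 2000      # appears in 5% of games, worth $2000

print("Australia:", p_australia * value_australia)  # $60 expected per game
print("Burundi:  ", p_burundi * value_burundi)      # $100 expected per game
# Burundi wins even though it comes up a sixth as often, because 0.05 > 0.30 / 10.
```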

I want to be on Jeopardy, so to help myself prioritize what facts to learn, I calculated the expectation values of knowing every distinct response ever used in Jeopardy (courtesy of http://www.j-archive.com). Here’s my method, with a rough code sketch after the list:

  • The probability of a response appearing in a given game is the number of times that response has ever appeared, divided by the total number of clues in history, multiplied by 60, the number of clues in a game (Final Jeopardy is ignored). NOTE: See the comments for a discussion of whether this method is valid. The answer seems to be “pretty much, because the probability of any given answer appearing in a show is so minuscule.”
  • The value of giving a correct response is adjusted for modern clue values ($200 to $1000 in the Jeopardy round, $400 to $2000 in the Double Jeopardy round)
  • We add up all the adjusted values of a response’s appearances and divide by the number of occurrences to get that response’s average value, and then we multiply by its probability of appearance.
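Here's roughly what that calculation looks like in code. The input format is my own assumption (a CSV with one row per clue, containing the response and its already-adjusted value); the j-archive data doesn't actually come in this shape, so treat it as a sketch of the method rather than the real script.

```python
import csv
from collections import defaultdict

CLUES_PER_GAME = 60  # Jeopardy + Double Jeopardy boards, ignoring Final Jeopardy

# Hypothetical input: one row per clue, with the accepted response and the
# clue's value already adjusted to modern amounts ($200-$1000 / $400-$2000).
def expectation_values(path="clues.csv"):
    totals = defaultdict(float)   # response -> sum of adjusted clue values
    counts = defaultdict(int)     # response -> number of appearances
    total_clues = 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            response = row["response"]
            totals[response] += float(row["adjusted_value"])
            counts[response] += 1
            total_clues += 1

    results = {}
    for response, count in counts.items():
        p_per_game = (count / total_clues) * CLUES_PER_GAME  # appearance probability per game
        avg_value = totals[response] / count                 # average adjusted value
        results[response] = p_per_game * avg_value           # expectation value per game
    return results

# ev = expectation_values()
# for response, value in sorted(ev.items(), key=lambda kv: -kv[1])[:10]:
#     print(f"{value:8.2f}  {response}")
```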

It ends up being a pretty different list! Here are the top 1000 Jeopardy! answers by expectation value: Link to Gist

Australia is still number one, but compare with the top 1000 by count: Link to Gist. There are 228 entries on each list that are missing from the other, and the order of the shared items is very different, especially further down the list.

If you’re going to study for Jeopardy, studying things in order of decreasing expectation value strikes me as more intelligent than studying in order of decreasing historical count. What do you think?