When we imagine how to use a resource effectively – be that resource a development team, a CPU core, or a port-a-potty – our thoughts usually turn to efficiency. Ideally, the resource gets used at 100% of its capacity: we have enough capacity to serve our needs without generating queues, but not so much that we’re wasting money on idle resources. In practice there are spikes and lulls in traffic, so we should provision enough capacity to handle those spikes when they arrive, but we should always try to minimize the amount of capacity that’s sitting idle.
Except what I just said is bullshit.
In the early chapters of Donald G. Reinertsen’s brain-curdlingly rich Principles of Product Development Flow, I learned a very important and counterintuitive lesson about queueing theory that puts the lie to this naïve aspiration to efficiency-above-all-else. I want to share it with you, because once you understand it you will see the consequences everywhere.
Queueing theory is an unreasonably effective discipline that deals with systems in which tasks take time to get processed, and if there are no processors available then a task has to wait its turn in a queue. Sound familiar? That’s because queueing theory can be used to study basically anything.
In its easiest-to-consume form, queueing theory tells us about average quantities in the steady state of a queueing system. Suppose you’re managing a small supermarket with 3 checkout lines. Customers take different, unpredictable amounts of time to finish their shopping. So they arrive at the checkout line at different intervals. We call the interval between two customers reaching the checkout line the arrival interval.
And customers also take different, unpredictable amounts of time to get checked out. The time it takes from when the cashier scans a customer’s first item to when they finish checking that customer out is called the processing time.
Each of these quantities has some variability in it and can’t be predicted in advance for a particular customer. But you can empirically determine the probability distribution of these quantities:
Given just the information we’ve stated so far, queueing theory can answer a lot of questions about your supermarket. Questions like:
- How long on average will a customer have to wait to check out?
- What proportion of customers will arrive at the checkout counter without having to wait in line?
- Can you get away with pulling an employee off one of the registers to go stock shelves? And if you do that, how will you know when you need to re-staff that register?
These sorts of questions are super important in all sorts of systems, and queueing theory provides a shockingly generalizable framework for answering them. Here’s an important theme that shows up in a huge variety of queueing systems:
The closer you get to full capacity utilization, the longer your queues get. If you’re using 100% of capacity all time, your queues grow to infinity.
This is counterintuitive but absolutely true, so let’s think through it.
What happens when you have no idle capacity
What the hell? Isn’t using capacity efficiently how you’re supposed to get rid of queues? Well yes, but it doesn’t work if you do it all the time. You need some buffer capacity.
Let’s think about a generic queueing system with 5 processors. This system’s manager is all about efficiency, so the system operates at 100% capacity all the time. No idle time. That’s ideal, right?
Sure, okay, now what happens when a task gets completed? If we want to make sure we’re always operating at 100% capacity, then there needs to be a task waiting behind that one. Otherwise we’d end up with an idle processor. So our queueing system must look more like this:
In order to operate at 100% capacity all the time, we need to have at least as many tasks queued as there are processors. But wait! That means that when another new task arrives, it has to get in line behind those other tasks in the queue! Here’s what our system might look like a little while later:
Some queues may be longer than others, but no queue is ever empty. This forces the total number of items in the queue to grow without limit. Eventually our system will look like this:
If you don’t quite believe it, I don’t blame you. Go back through the logic and convince yourself. It took me a while to absorb the idea too.
What this means for teams
You can think of a team as a queueing system. Tasks arrive in your queue at random intervals, and they take unpredictable amounts of time to complete. Each member of the team is a processor, and when everybody’s working as hard as they can, the system is at 100% capacity.
That’s what a Taylorist manager would want: everybody working as hard as they can, all the time, with no waste of capacity. But as we’ve seen, in any system with variability, that’s an unachievable goal. The closer you get to full capacity utilization, the faster your queues grow. The longer your queues are, the longer the average task waits in the queue before getting done. It gets bad real fast:
So there are very serious costs to pushing your capacity too hard for too long:
- Your queues get longer, which itself is demotivating. People are less effective when they don’t feel that their work is making a difference (see The Progress Principle)
- The average wait time between a task arriving and a getting done rises linearly with queue length. With long wait times, you hemorrhage value: you commit time and energy to ideas that might not be relevant anymore by the time you get around to them (again: read the crap out of Principles of Product Development Flow)
- Since you’re already operating at or near full capacity, you can’t even deploy extra capacity to knock those queues down: it becomes basically impossible to ever get rid of them.
- The increased wait time in your ticket queue creates long feedback times, nullifying the benefit of agile techniques.
Efficiency isn’t the holy grail
Any queueing system operating at full capacity is gonna build up giant queues. That includes your team. What should you do about it?
Just by being aware that this relationship exists, you can gain a lot of intuition about team dynamics. What I’m taking away from it is this: There’s a tradeoff between how fast your team gets planned work done and how long it takes your team to get around to tasks. This changes the way I think about side projects, and makes me want to find the sweet spot. Let me know what you take away from it.