A couple weeks ago, my wife & I went to Basilica Block Party, a local music festival. It was a good time, and OH MANS you have to see Fitz & The Tantrums live. Their sax player is a hero unit.

Anyhoo, we walked over to the porta-potties between sets. The lines were about 8–10 people long. And my spouse suggested an intriguing strategy for minimizing her wait time. I will call this strategy The Strategy:

The Strategy: All else being equal, get in the line with the most men.

The reasoning behind The Strategy is obvious: women take longer to pee than men, so if 2 queues are the same length, then the faster-moving queue should be the one with fewer women. It’s intuitive, but due to my current obsession with queueing theory, I became intensely interested in the strategy’s implications. In particular, I started to wonder things like:

How much can you expect to shave off your wait time by following The Strategy?
How does the effectiveness of The Strategy vary with its popularity? Little’s Law tells us that the overall average wait time won’t be affected, but is The Strategy still effective if 10% of the crowd is using it? 25%? 90%?

And then I thought to myself “I could answer these questions through the magic of computation!”

qsim

Lately I’ve been seeing queueing systems everywhere I go, so I figured it’d be worthwhile to write a generic queueing system simulator to satisfy my curiosity. That’s what I did. It’s called qsim.

To quote the README,

A queueing system in qsim processes arbitrary jobs and is composed of 5 pieces:

The arrival process controls how often jobs enter the system.

The arrival behavior defines what happens when a new job arrives. When the arrival process generates a new job, the arrival behavior either sends it straight to a processor or appends it to a queue.

Queues are simply holding pens for jobs. A system may have many queues associated with different processors.

A queueing discipline defines the relationship between queues and processors. It’s responsible for choosing the next job to process and assigning that job to a processor.

Processors are the entities that remove jobs from the system. A processor may take differing amounts of time to process different jobs. Once a job has been processed, it leaves the queueing system.

qsim provides a framework for implementing these building blocks and putting them together, and it also provides hooks that can be used to gather data about a simulation as it runs. I’m really looking forward to using qsim to gain insight into all sorts of different systems.

For now: porta-potties.

The porta-potty simulation

You’ll recall The Strategy:

The Strategy: All else being equal, get in the line with the most men.

To determine the effectiveness of The Strategy, I implemented PortaPottySystem using qsim. Here are some of the assumptions I made:

People arrive very frequently, but if all the queues are too long (8 people) they leave.
There are 15 porta-potties, each with its own queue. Once a person enters a queue, they stay in that queue until the corresponding porta-potty is vacant.
Shockingly, I couldn’t find any reliable data on the empirical distribution of pee times by sex, so I chose a normal distribution with a mean of 40 seconds for men and 60 seconds for women.
Most people just pick a random queue to join (as long as it’s no longer than the shortest queue), but some people use The Strategy of getting into the queue with the highest man:woman ratio (again, as long as it’s no longer than the shortest queue).
Nobody’s going number 2 because that’s gross.
Everyone is either a man or a woman. I know all about gender being a spectrum, and if you want to submit a pull request that smashes the gender binary, please do.

The first question I wanted to answer was: how does using The Strategy affect your wait time?

To answer this question, I ran a simulation where the probability of a given person deciding to use The Strategy is 1%. The other 99% of people simply join one of the shortest queues without regard to its man:woman ratio. I ran 20 simulations, each for the equivalent of 2 weeks, and came up with these wait time distributions:

More of a box-plot person? I hear ya:

The Strategy is definitely not a huge win here. On average, your wait time will be reduced by about 10–15 seconds (4–6%) if you use The Strategy. Still, it’s not nothing, right?

Now how does the benefit of using The Strategy vary with its popularity? This is actually really interesting. I never would have guessed it. But the data shows that you should always use The Strategy, even if everybody else is using it too.

Here I’ve charted average wait times against the proportion of people using the strategy. Colors are more prominent where the data set in question is large (and therefore heavily influences the overall average):

You’ll notice that the overall average (dark green) does not vary with Strategy popularity. This is good news, because otherwise we’d be violating Little’s Law, which would probably just mean our simulation was broken.

The interesting thing here is that the benefit of using The Strategy decreases pretty much linearly as its popularity increases, but at the same time there accrues a disadvantage to not using it. If everybody else in the system is using The Strategy, and you come along and decide not to, you can still expect to wait 10 seconds longer than everybody else. Therefore using The Strategy is unequivocally better than not using The Strategy.

Unless of course you don’t really care about those 10 seconds, in which case you should do whatever you want.

When we imagine how to use a resource effectively – be that resource a development team, a CPU core, or a port-a-potty – our thoughts usually turn to efficiency. Ideally, the resource gets used at 100% of its capacity: we have enough capacity to serve our needs without generating queues, but not so much that we’re wasting money on idle resources. In practice there are spikes and lulls in traffic, so we should provision enough capacity to handle those spikes when they arrive, but we should always try to minimize the amount of capacity that’s sitting idle.

Except what I just said is bullshit.

In the early chapters of Donald G. Reinertsen’s brain-curdlingly rich Principles of Product Development Flow, I learned a very important and counterintuitive lesson about queueing theory that puts the lie to this naïve aspiration to efficiency-above-all-else. I want to share it with you, because once you understand it you will see the consequences everywhere.

Queueing theory?

Queueing theory is an unreasonably effective discipline that deals with systems in which tasks take time to get processed, and if there are no processors available then a task has to wait its turn in a queue. Sound familiar? That’s because queueing theory can be used to study basically anything.

In its easiest-to-consume form, queueing theory tells us about average quantities in the steady state of a queueing system. Suppose you’re managing a small supermarket with 3 checkout lines. Customers take different, unpredictable amounts of time to finish their shopping. So they arrive at the checkout line at different intervals. We call the interval between two customers reaching the checkout line the arrival interval.

And customers also take different, unpredictable amounts of time to get checked out. The time it takes from when the cashier scans a customer’s first item to when they finish checking that customer out is called the processing time.

Each of these quantities has some variability in it and can’t be predicted in advance for a particular customer. But you can empirically determine the probability distribution of these quantities:

Given just the information we’ve stated so far, queueing theory can answer a lot of questions about your supermarket. Questions like:

How long on average will a customer have to wait to check out?
What proportion of customers will arrive at the checkout counter without having to wait in line?
Can you get away with pulling an employee off one of the registers to go stock shelves? And if you do that, how will you know when you need to re-staff that register?

These sorts of questions are super important in all sorts of systems, and queueing theory provides a shockingly generalizable framework for answering them. Here’s an important theme that shows up in a huge variety of queueing systems:

The closer you get to full capacity utilization, the longer your queues get. If you’re using 100% of capacity all time, your queues grow to infinity.

This is counterintuitive but absolutely true, so let’s think through it.

What happens when you have no idle capacity

What the hell? Isn’t using capacity efficiently how you’re supposed to get rid of queues? Well yes, but it doesn’t work if you do it all the time. You need some buffer capacity.

Let’s think about a generic queueing system with 5 processors. This system’s manager is all about efficiency, so the system operates at 100% capacity all the time. No idle time. That’s ideal, right?

Sure, okay, now what happens when a task gets completed? If we want to make sure we’re always operating at 100% capacity, then there needs to be a task waiting behind that one. Otherwise we’d end up with an idle processor. So our queueing system must look more like this:

In order to operate at 100% capacity all the time, we need to have at least as many tasks queued as there are processors. But wait! That means that when another new task arrives, it has to get in line behind those other tasks in the queue! Here’s what our system might look like a little while later:

Some queues may be longer than others, but no queue is ever empty. This forces the total number of items in the queue to grow without limit. Eventually our system will look like this:

If you don’t quite believe it, I don’t blame you. Go back through the logic and convince yourself. It took me a while to absorb the idea too.

What this means for teams

You can think of a team as a queueing system. Tasks arrive in your queue at random intervals, and they take unpredictable amounts of time to complete. Each member of the team is a processor, and when everybody’s working as hard as they can, the system is at 100% capacity.

That’s what a Taylorist manager would want: everybody working as hard as they can, all the time, with no waste of capacity. But as we’ve seen, in any system with variability, that’s an unachievable goal. The closer you get to full capacity utilization, the faster your queues grow. The longer your queues are, the longer the average task waits in the queue before getting done. It gets bad real fast:

So there are very serious costs to pushing your capacity too hard for too long:

Your queues get longer, which itself is demotivating. People are less effective when they don’t feel that their work is making a difference (see The Progress Principle)
The average wait time between a task arriving and a getting done rises linearly with queue length. With long wait times, you hemorrhage value: you commit time and energy to ideas that might not be relevant anymore by the time you get around to them (again: read the crap out of Principles of Product Development Flow)
Since you’re already operating at or near full capacity, you can’t even deploy extra capacity to knock those queues down: it becomes basically impossible to ever get rid of them.
The increased wait time in your ticket queue creates long feedback times, nullifying the benefit of agile techniques.

Efficiency isn’t the holy grail

Any queueing system operating at full capacity is gonna build up giant queues. That includes your team. What should you do about it?

Just by being aware that this relationship exists, you can gain a lot of intuition about team dynamics. What I’m taking away from it is this: There’s a tradeoff between how fast your team gets planned work done and how long it takes your team to get around to tasks. This changes the way I think about side projects, and makes me want to find the sweet spot. Let me know what you take away from it.

Dan Slimmon

Month: July 2015

Minding Your Pees And Queues