Maybe you’ve been here:
You get a phone call in the middle of the night. The new sysadmin (whom you hired straight out of college) is flipping all of her shits because web app performance has degraded beyond the alert threshold. She’s been clicking through page after page of graphs, checking application logs all the way up and down the stack, and just generally cussing up a storm because she can’t find the source of the issue. You open your laptop, navigate straight to overall performance graphs, drill down to database graphs, see a pattern that looks like mutex contention, log in to the database, find the offending queries, and report them to the on-call dev. You do all this in a matter of minutes.
Or here:
You’re trying to teach your dad to play Mario Kart. It’s like “Okay, go forward… no, forward… you have to press the gas – no, that’s fire – press the gas button… it’s the A button… the blue one… Yeah, there you go, okay, you’re going forward now… so… so go around the corner… why’d you stop? Dad… it’s like driving a car, you can’t turn if you’re stopped… so remember, gas is A… which is the blue one…”
Why is it so hard for experts to understand the novice experience? Well, in his book Sources of Power, decision-making researcher Gary Klein presents some really interesting theories about what makes experts experts. His theories give us insight into the communication barriers between novices and experts, which can make us better teachers and better learners.
Mental Simulation
Klein arrived at his decision-making model, the recognition-primed decision model, by interviewing hundreds of experts over several years. According to his research, experts in a huge variety of fields rely on mental simulation. In Sources of Power, he defines mental simulation as:
the ability to imagine people and objects consciously and to transform those people and objects through several transitions, finally picturing them in a different way than at the start.
Klein has never studied sysadmins, but when I read about his model I recognized it immediately. This is what we do when we’re trying to reason out how a problem got started, and it’s also how we figure out how to fix it. In our head, we have a model of the system in which the problem lives. Our model consists of some set of moving parts that go through transitions from one state to another.
If you and your friend are trying to figure out how to get a couch around a corner in your stairwell, your moving parts are the couch, your body, and your friend’s body. If you’re trying to figure out how a database table got corrupted, your moving parts might be the web app, the database’s storage engine, and the file system buffer. You envision a series of transitions from one state to the next. If those transitions don’t get you from the initial state to the final state then you tweak your simulation and try again until you get a solution.
Here’s the thing, though: we’re people. Our brains have a severely limited amount of working memory. In his interviews with experts about their decision making processes, Klein found that there was a pretty hard upper limit on the complexity of our mental simulations:
- 3 moving parts
- 6 transitions
That’s about all we get, regardless of our experience or intelligence. So how do experts mentally simulate so much more effectively than novices?
Abstractions
As we gain experience in a domain, we start to see how the pieces fit together. As we notice more and more causal patterns, we build a mental bank of abstractions. An abstraction is a kind of abbreviation that stands in for a set of transitions or moving parts that usually functions as a whole. It’s like the keyboard of a piano: when the piano’s working correctly, we don’t have to think about the Rube Golberg-esque series of yanks and shoves going on inside it; we press a key, and the corresponding note comes out.
Experts have access to a huge mental bank of abstractions. Novices don’t yet. This makes experts more efficient at creating mental simulations.
When you’re first learning to drive a car, you have to do everything step by step. You don’t have the abstraction bank of an experienced driver. When the driving instructor tells you to back out of a parking space, your procedure looks something like this:
- Make sure foot is on brake pedal
- Shift into reverse
- Release brake enough to get rolling
- Turn steering wheel (which direction is it when I’m in reverse?)
- Put foot back on brake pedal
- Shift into drive
It’s a choppy, nerve-racking sequence of individual steps. But once you practice this a dozen times or so, you start to build some useful abstractions. Your procedure for backing out of a parking space becomes more like:
- Go backward (you no longer think about how you need to break, shift, and release the brake)
- Get facing the right direction
- Go forward
Once you’ve done it a hundred times, it’s just one step: “Back out of the parking space.”
Now if you recall that problem solving involves mental simulations with at most 3 moving parts and 6 transitions, you’ll see why abstractions are so critical to the making of an expert. Whereas a novice requires several transitions to represent a process, an expert might only need one. The right choice of abstraction allows the expert to hold a much richer simulation in mind, which improves their effectiveness in predicting outcomes and diagnosing problems.
Counterfactuals
Klein highlights another important difference between experts and novices: experts can readily process counterfactuals: explanations and predictions that are inconsistent with the data. This is how experts are able to improvise in unexpected situations.
Imagine that you’re troubleshooting a spate of improper 403 responses from a web app that you admin. You expect that the permissions on some cache directory got borked in the last deploy, so you log in to one of the web servers and tail the access log to see which requests in particular are generating 403s. But you can’t find a single log entry with a 403 error code! You refresh the app a few times in your browser, and sure enough you get a 403 response. But the log file still shows 200 after 200. What’s going on?
If you were a novice, you might just say “That’s impossible” and throw up your hands. But an experienced sysadmin could imagine any number of plausible scenarios to accommodate this counterfactual:
- You logged in to staging instead of production
- The 403s are only coming from one of the web servers, and it’s not the one you logged in to
- 403s are being generated by the load balancer before the requests ever make it to the web servers
- What you’re looking at in your browser is actually a 200 response with a body that says “403 Forbidden”
Why are experts able to adjust so fluidly to counterfactuals while novices aren’t?
It comes back to abstractions. When experts see something that doesn’t match expectations, they can easily recognize which abstraction is leaking. They understand what’s going on inside the piano, so when they expect a tink but hear a plunk, they can seamlessly jump to a lower level of abstraction and generate a new mental simulation that explains the discrepancy.
Empathizing with novices
By understanding a little about the relationship between abstractions and expertise, we can teach ourselves to see problems from a novice’s perspective. Rather than getting frustrated and taking over, we can try some different strategies:
- Tell stories. When Gary Klein and his research team want to understand an expert’s thought process, they don’t use questionnaires or ask the expert to make a flow chart or anything artificial like that. The most effective way to get inside an expert’s thought process is to listen to their stories. So when you’re teaching a novice how to reason about a system, try thinking of an interesting and surprising troubleshooting experience you’ve had with that system before, and tell that story.
- Use the Socratic method. Novices need practice at juggling abstractions and digesting counterfactuals. When a novice is describing their mental model of a problem or a potential path forward, ask a hypothetical question or two and watch the gears turn. Questions like “You saw Q happen because of P, but what are some ways we could’ve gotten to Q without P?” or “You expect that changing A will have an effect on B, but what would it mean if you changed A and there was no effect on B?” will challenge the novice to bounce between different layers of abstraction like an expert does.
- Remember: your boss may be a novice. Take a moment to look around your org chart and find the nearest novice; it may be above you. Even if your boss used to do your job, they’re a manager now. They may be rusty at dealing with the abstractions you use every day. When your boss is asking for a situation report or an explanation for some decision you made, keep in mind the power of narratives and counterfactuals.
Pingback: ddx/common ground/chatops | Empiricism Ops
Pingback: Troubleshooting On A Distributed Team Without Losing Common Ground | Empiricism Ops