I just spent 2 days at Devops Days Minneapolis, the first Devops Days in the Midwest. My brain is still whirling. I was brainstorming while I wrote this; forgive me if it goes astray.
The organizers of the conference (too many to name here) did a super admirable job of building a diverse, empathy-oriented conference while keeping the spotlight on tech. It was really a wonderful conference.
What really strikes me about the fascinating people I’ve met over the last two days is the diversity of their backgrounds and passions. The tech industry attracts all sorts of fascinating people, from philosophy PhDs to roadies who used to tour with Pantera. Our field has become a real melting pot of viewpoints, and that’s a beautiful thing.
However, we’re all doing engineering now. Most of us are doing engineering without any formal engineering training. And I firmly believe that any engineer will be much more effective with a few basic statistical tools in their back pocket.
I proposed an Open Space on this topic, asking the question “Why are so many engineers in tech reluctant to learn any math, despite knowing how useful math can be?” The session attracted a bunch of very smart people, and I came away with many new insights.
We teach how to math, but not why to math
When I give talks to software and ops engineers about math, they eat it right up. They have a strong intuitive sense that math is good and useful and powerful.
On the other hand, when I talk to individuals in tech about math, I’m often shocked at how quickly they shut down. They say things like “Oh, I’m no good at math,” or “I don’t have time to do math,” or “There’s no use for math in what I do.” It smacks of C.P. Snow’s observations on the Two Cultures, but these are people in a career that most humans would place solidly on the analytical side of that intellectual chasm. But chatting at Devops Days reminded me of a really important fact that can help explain this phenomenon.
Math is a part of everyone’s education, but statistics is not.
That’s absurd! Statistics in the single most widely applicable branch of mathematics. You can apply it directly to your life with practical, observable benefits no matter who you are or what you do. Instead, we learn algebra, geometry, trigonometry, and calculus — wonderful, rich disciplines to be sure — but without statistics, you can’t blame students for thinking that math is too abstract and academic to be useful in their lives.
So now, when people inexperienced in math hear “You should learn some math,” they think of impenetrable equations winding across boards, or they draw on “Good Will Hunting” and “A Beautiful Mind” and imagine that they’re being asked to do novel number theory research.
This is one of the biggest misconceptions we need to break. We need to spread the message that statistics is universally applicable, practical, and not very hard to learn with modern tools.
Staring at a blank wall
Local mathematician and Rubyist Kaisa Taipale makes a good point: when you’re new to computers, you worry a lot about making mistakes. The pictures on the screen doesn’t map onto a logical framework, you don’t know the right questions to ask, and as Pagerduty’s David Shackelford puts it, you can’t push past the blankness. Being new to stats is very similar.
In our session, we discussed two ways to remove the hurdles in front of stats-curious engineers. First of all, it’s critical that we make data gathering as trivial as possible. It’s easy to tell someone to look at the data, but when the data’s in the form of million-line Apache logfiles that need to be massaged and crunched and filtered with a custom script, that person will be frustrated and angry before they even start their analysis! At my company, I’ve written scripts to pull data out of Graphite and Logstash into R, but I haven’t been very forthcoming or solicitous about those scripts. I’ll change that.
The other major hurdle to understanding data is the academic undertone of the existing tools. I love me some R, but I’ll be the first to admit that it’s not the friendliest piece of software for someone who’s newly curious about data analysis. We need to spread knowledge of some easier and more accessible stats tools, like Tableau and Statwing, and maybe come up with some open source alternatives.
Once you push past the feeling that you need a PhD to even ask the right questions about data, stats learning becomes a positive feedback loop that benefits everybody in the organization. Let’s keep the discussion going and make sure that the potential data addicts in our midst don’t get discouraged.
You can teach the thought process
Statistical inference, and math in general, can seem like a gift. People think “I just don’t get it. I don’t have the spark for math.”
On the contrary, math is a set of skills that you can learn. They can be taught. In fact, Ian Malpass teaches a course at Etsy called “Graphite Bootcamp,” where interested engineers can learn what an integral means (“it just means adding stuff up,” he says) or how to use time series smoothing techniques.
Math is a tool that can take you from a bunch of opaque, independently useless metrics to a meaningful number that gives you real insight into the system. That’s exactly the mindset that Ian teaches in his class. This is a fantastic idea, and I would love to see other companies copy Ian’s idea and build upon it.
My point here is: a bit of statistics is easy to learn, useful in almost any situation, and beneficial to the level of discourse and decision-making in any organization. If there are barriers stopping engineers from learning it, then those barriers should be attacked furiously until they’re gone. Math still leaves room for arguments over interpretation, but when we’re arguing about data we’re much much more likely to make the right decision.