Calculating the expectation value of knowing Jeopardy! answers

Jeopardy players all know the most common response in the history of the show: “What is Australia?” It’s appeared over 220 times.

But, if you’re looking for the best topics to study as a potential Jeopardy contestant, the number of times a clue has appeared isn’t necessarily what you’re interested in. Suppose clues with the response “What is Australia?” are only ever worth $200, and clues with the response “What is Burundi?” are only ever worth $2000. In that case, as long as the probability of Burundi appearing in a given show is more than 1/10 the probability of Australia appearing, you’re better off, moneywise, brushing up on your Burundi facts.

This is the utility of the expectation value. It’s the probability of a given event multiplied by the value of that event happening. Here’s another way to think of it: if you played a chance game a million times (in our example, this would mean playing a million rounds of Jeopardy), the expectation value of betting on a given outcome (in our example, of studying a given country) is the average amount of money you’d win.

I want to be on Jeopardy, so to help myself prioritize what facts to learn, I calculated the expectation values of knowing every distinct response ever used in Jeopardy (courtesy of http://www.j-archive.com). Here’s my method:

  • The probability of a response appearing in a given game is the number of times that response has ever appeared, divided by the total number of clues in history, times 60 (Final Jeopardy is ignored). NOTE: See the comments for a discussion of whether this method is valid. Answer seems to be “pretty much, because the probability of any given answer appearing in a show is so miniscule.”
  • The value of giving a correct response is adjusted for modern clue values ($200 to $1000 in the Jeopardy round, $400 to $2000 in the Double Jeopardy round)
  • We add up all the adjusted values of a response’s appearance and divide by the number of occurrences to get that response’s average value, and then we multiply by its probability of appearance.

It ends up being a pretty different list! Here are the top 1000 Jeopardy! answers by expectation value: Link to Gist

Australia is still number one, but compare with the top 1000 by count: Link to Gist. There are 228 entries on each list that are missing from the other, and the order of the shared items is very different, especially further down the list.

If you’re going to study for Jeopardy, studying things in order of decreasing expectation value strikes me as more intelligent than studying in order of decreasing historical count. What do you think?