# Calculating the expectation value of knowing Jeopardy! answers

Jeopardy players all know the most common response in the history of the show: “What is Australia?” It’s appeared over 220 times.

But, if you’re looking for the best topics to study as a potential Jeopardy contestant, the number of times a clue has appeared isn’t necessarily what you’re interested in. Suppose clues with the response “What is Australia?” are only ever worth \$200, and clues with the response “What is Burundi?” are only ever worth \$2000. In that case, as long as the probability of Burundi appearing in a given show is more than 1/10 the probability of Australia appearing, you’re better off, moneywise, brushing up on your Burundi facts.

This is the utility of the expectation value. It’s the probability of a given event multiplied by the value of that event happening. Here’s another way to think of it: if you played a chance game a million times (in our example, this would mean playing a million games of Jeopardy), the expectation value of betting on a given outcome (in our example, of studying a given country) is the average amount of money you’d win per game.

I want to be on Jeopardy, so to help myself prioritize what facts to learn, I calculated the expectation values of knowing every distinct response ever used in Jeopardy (courtesy of http://www.j-archive.com). Here’s my method:

• The probability of a response appearing in a given game is the number of times that response has ever appeared, divided by the total number of clues in history, times 60 (Final Jeopardy is ignored). NOTE: See the comments for a discussion of whether this method is valid. The answer seems to be “pretty much, because the probability of any given answer appearing in a show is so minuscule.”
• The value of giving a correct response is adjusted for modern clue values (\$200 to \$1000 in the Jeopardy round, \$400 to \$2000 in the Double Jeopardy round)
• We add up all the adjusted values of a response’s appearance and divide by the number of occurrences to get that response’s average value, and then we multiply by its probability of appearance.
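
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the script I actually ran: the function name `expectation_values`, the `(response, adjusted_value)` input format, and the toy data are all assumptions made for the example, and the values are assumed to be already adjusted to modern clue amounts.

```python
from collections import defaultdict

CLUES_PER_GAME = 60  # Jeopardy + Double Jeopardy rounds; Final Jeopardy is ignored


def expectation_values(clues):
    """Return {response: expectation value in dollars per game}.

    `clues` is an iterable of (response, adjusted_value) pairs, one per
    historical clue. This input format is an assumption for the sketch.
    """
    clues = list(clues)
    total_clues = len(clues)
    counts = defaultdict(int)
    value_sums = defaultdict(float)
    for response, value in clues:
        counts[response] += 1
        value_sums[response] += value

    result = {}
    for response, count in counts.items():
        # Step 1: approximate probability of appearing in a given game.
        probability = count / total_clues * CLUES_PER_GAME
        # Steps 2 and 3: average adjusted value, times the probability.
        average_value = value_sums[response] / count
        result[response] = probability * average_value
    return result


# Made-up toy data, NOT real J! Archive numbers: 600 clues in total.
toy_clues = [
    ("Australia", 200), ("Australia", 400), ("Australia", 200),
    ("Burundi", 2000),
    ("Paris", 600), ("Paris", 1000),
] + [(f"filler {i}", 400) for i in range(594)]

ranked = sorted(
    expectation_values(toy_clues).items(),
    key=lambda item: item[1],
    reverse=True,
)
for response, value in ranked[:3]:
    print(f"{response}: {value:.2f}")
```

In this toy data set, Australia appears three times as often as Burundi but ranks below it, because its clues are worth much less on average.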

It ends up being a pretty different list! Here are the top 1000 Jeopardy! answers by expectation value: Link to Gist

Australia is still number one, but compare with the top 1000 by count: Link to Gist. There are 228 entries on each list that are missing from the other, and the order of the shared items is very different, especially further down the list.

If you’re going to study for Jeopardy, studying things in order of decreasing expectation value strikes me as more intelligent than studying in order of decreasing historical count. What do you think?

## 6 thoughts on “Calculating the expectation value of knowing Jeopardy! answers”

1. This is the most my probability background has ever paid off

“The probability of a response appearing in a given game is the number of times that response has ever appeared, divided by the total number of clues in history, times 60 (Final Jeopardy is ignored).”

This method of calculating the probability is incorrect – you’re actually calculating the expected value of that response if the payoff were \$60. The probability of an event happening at some point during n chances is the complement of the probability that it never happens during n chances, or P(happening at some point) = 1 – P(never happening). The probability it won’t happen on a given question is the complement of your initial calculation, or P(doesn’t happen on 1 question) = 1 – (# times the response appeared / total number of clues). Since there are 60 questions, this must happen 60 times for the topic to never appear during that show, and this happens with probability P(doesn’t happen on 1 question)^60. So, P(happening at some point) = 1 – (P(doesn’t happen on 1 question)^60).

1. danslimmon

Thanks for the correction. It’s a good point.

That said, there are on the order of 100,000 distinct answers in the history of the show, so the probability of any given answer being the response to a clue is very small. And for p << 1 (more precisely, for np << 1), you can use a Taylor series to approximate

(1 – p)^n

as

1 – np

I think that helps my case, right? It's been a long time since I did serious math.
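
Here is a quick numerical check of that approximation, as a rough sketch. The function names and the per-clue probabilities are hypothetical values picked for illustration, not figures from the J! Archive data.

```python
def exact_probability(p, n=60):
    """P(response appears at least once in n independent clues)."""
    return 1 - (1 - p) ** n


def linear_approximation(p, n=60):
    """First-order approximation n * p, reasonable when n * p << 1."""
    return n * p


# Hypothetical per-clue probabilities, chosen only for illustration.
for p in (1e-5, 1e-4, 1e-3):
    exact = exact_probability(p)
    approx = linear_approximation(p)
    relative_error = (approx - exact) / exact
    print(f"p={p:g}  exact={exact:.6f}  approx={approx:.6f}  "
          f"relative error={relative_error:.2%}")
```

The linear form always overestimates slightly, and the gap grows with p, so the responses most affected would be the most common ones.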

1. This is the most my probability background has ever paid off

Yeah it’s a close approximation. It likely wouldn’t change your rankings much, if at all.

2. Joseph Nebus

I beg pardon for asking an exceedingly obvious point but I wanted to be certain I understood it. You are considering the value of an incorrect response to be zero; that is, if the contestant notices that Australia can’t be the response, she has the good sense not to answer at all rather than get it wrong and lose whatever the response’s value was?

1. danslimmon

Good point. Yes, what I’ve calculated is the value of knowing, every time the answer is “Australia”, that the answer is “Australia”, and then ringing in and answering; otherwise you keep your mouth shut. The calculation assumes that all answers you give are independent.