Frequently Asked Questions

Board Control is the analytics home for Jeopardy — live win probability charts, BUTTREY player ratings, the BRING IT matchup forecaster, Daily Double wagering analysis, and the Excitement Index, updated daily for every game. This page covers how each of those works, plus a few general questions at the end.

Win Probability

See the momentum of every game unfold in real time — who's winning, who's surging, and where the big swings happen. View charts →

What does a win probability chart show?

A win probability chart shows how each contestant's chances of winning change throughout a Jeopardy game, plotted clue by clue. Charts are updated overnight.

Which games are included?

Regular season games plus postseason tournaments: Tournament of Champions, Masters, Champions Wildcard, Second Chance, Invitational Tournament, College Championship, and Teachers Tournament. For tournament games where contestants have no prior appearances, our model uses synthetic baseline statistics so all three players begin with roughly equal win probability. Teen, celebrity, kids week, and power-player tournaments are excluded because they use materially different rules or contestant populations. You can filter by tournament type from the controls at the top of the main charts page.

How are the probabilities calculated?

The probabilities come from a machine learning model that takes into account the returning champion's stats, the current round and number of clues remaining, the Daily Doubles remaining on the board, which player has control of the board, current scores, and in-game player statistics.

Does this account for Final Jeopardy wagers?

No. The probabilities are calculated before Final Jeopardy wagers are revealed. This is why you'll sometimes see a player with a high probability who ultimately loses due to a surprising wager.

How do I find games with specific players?

Use the search bar at the top of the main page. Type any contestant's name (like "Ken Jennings" or "Amy Schneider") to see all their games. Click any game to view its chart.

Daily Double Analysis

Should they have bet it all? The math behind every Daily Double decision. View DD Analysis →

What is Daily Double wagering analysis?

The Daily Double Analysis page shows whether contestants made optimal wagers on Daily Doubles. For each possible wager size, the model computes the player's win probability if they answer correctly and if they answer incorrectly, then combines the two into an expected change in win probability for a given get rate (the chance of answering the clue correctly). The optimal wager is the one with the highest expected change. The process is roughly similar to the 4th down conversion calculators in football.
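As a rough illustration of that calculation, the sketch below scores candidate wagers for one Daily Double. The function `toy_win_prob` is a hypothetical stand-in for the site's model (a simple logistic curve on the score lead), and all names and numbers are assumptions for illustration only. It also ignores the $5 minimum wager and the rule that lets a low-scoring player wager up to the highest clue value on the board.

```python
import math

def toy_win_prob(player_score, best_opponent_score):
    """Hypothetical stand-in for the real model: logistic in the score lead."""
    lead = player_score - best_opponent_score
    return 1.0 / (1.0 + math.exp(-lead / 5000.0))

def expected_wp_change(score, opponent, wager, get_rate, win_prob=toy_win_prob):
    """Expected change in win probability for one wager at a given get rate."""
    wp_now = win_prob(score, opponent)
    wp_right = win_prob(score + wager, opponent)   # clue answered correctly
    wp_wrong = win_prob(score - wager, opponent)   # clue answered incorrectly
    return get_rate * wp_right + (1.0 - get_rate) * wp_wrong - wp_now

def best_wager(score, opponent, get_rate, step=100, win_prob=toy_win_prob):
    """Scan wagers from 0 up to the current score and keep the best one."""
    candidates = range(0, score + 1, step)
    return max(
        candidates,
        key=lambda w: expected_wp_change(score, opponent, w, get_rate, win_prob),
    )

# Example: a trailing player who is very likely to know the answer.
print(best_wager(score=4000, opponent=10000, get_rate=0.9))
```

Repeating `best_wager` across a range of get rates gives the kind of curve the chart displays: optimal wager as a function of how likely the player is to answer correctly.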

How do I use the Daily Double analysis tool?

Visit the DD Analysis page and select any game using the season and date dropdowns. Choose a specific Daily Double from the list, and you'll see a chart showing the optimal wager percentage for different get rates. The player's actual wager is shown as a green line, and historical average get rates are shown in red. This reveals whether they wagered too much, too little, or just right.

Player BUTTREY Ratings

A definitive ranking of who's actually good at Jeopardy — based on the numbers, not your gut. View Ratings →

What are BUTTREY ratings?

BUTTREY ratings are power ratings that measure each player's relative Coryat scoring strength using a Bradley-Terry model. Every game is broken into head-to-head matchups between pairs of players, and the model estimates how much stronger or weaker each player is compared to the field. A rating of 0.00 represents an average player. Ratings are updated daily as new games air.

How are these ratings calculated?

For every game, each pair of players is treated as a head-to-head matchup, and the "result" is the Coryat score differential between them — how much more one player scored than the other on knowledge alone (Daily Double and Final Jeopardy wagers are stripped out). A Bradley-Terry model fits all these pairwise Coryat differentials simultaneously across thousands of games, producing a single rating per player that best explains the observed margins. Players who consistently out-Coryat their opponents rise up the ratings; players who lose ground slide down.
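The pairwise idea can be sketched in a few lines. A textbook Bradley-Terry model works on win/loss outcomes through a logistic link; the version below is a least-squares analogue that fits rating differences to Coryat margins, which is an assumption about the general approach and not the site's code. The `scale`, `lr`, and `epochs` parameters and the sample game are hypothetical.

```python
from itertools import combinations

def pairwise_margins(game):
    """Split one game {player: coryat} into head-to-head (a, b, margin) rows."""
    return [(a, b, game[a] - game[b]) for a, b in combinations(sorted(game), 2)]

def fit_ratings(games, scale=10000.0, lr=0.05, epochs=2000):
    """Fit one rating per player so rating gaps match scaled Coryat margins."""
    rows = [row for game in games for row in pairwise_margins(game)]
    players = sorted({p for a, b, _ in rows for p in (a, b)})
    rating = {p: 0.0 for p in players}
    for _ in range(epochs):
        grad = {p: 0.0 for p in players}
        for a, b, margin in rows:
            # Error between the predicted gap and the observed (scaled) margin.
            err = (rating[a] - rating[b]) - margin / scale
            grad[a] += err
            grad[b] -= err
        for p in players:
            rating[p] -= lr * grad[p] / len(rows)
        # Re-center so that 0.00 means an average player.
        mean = sum(rating.values()) / len(players)
        for p in players:
            rating[p] -= mean
    return rating

games = [{"A": 20000, "B": 10000, "C": 6000}]  # hypothetical Coryat scores
print(fit_ratings(games))
```

Because all matchups are fitted at once, a player's rating depends on who they faced as well as on their margins.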

Do the ratings account for strength of opponents?

Yes. Because every matchup is pairwise, the model automatically weights performance by opponent strength. A 5,000-point Coryat margin over a highly rated opponent moves your rating more than the same margin over a weak opponent, and losing to a strong opponent hurts you less than losing to a weak one. This is what makes a Bradley-Terry model useful for skill measurement: a player who consistently performs well against strong competition ends up rated higher than one who racks up big margins against weaker fields.

What do the rating numbers mean?

The numbers represent relative strength on a continuous scale. As a rough guide: a player rated 1.55 (like Ken Jennings) is expected to win about 88% of games against average competition, while a player rated 0.50 wins about 58%. The tier labels (All-Time Great, Elite, Strong, Solid, Competitive, Field) provide quick context for where a player falls among rated players.

Why do players need at least 4 games to appear?

Players with fewer than 4 games don't have enough data for the model to produce a reliable rating. The 4-game threshold filters out one-and-done contestants where the rating would be heavily influenced by a single game's variance.

Which games count toward the ratings?

Regular season games plus championship-level tournaments: Tournament of Champions, Masters, Champions Wildcard, Second Chance, Invitational Tournament, College Championship, Teachers Tournament, and special events like the Greatest of All Time Tournament. Teen, celebrity, kids week, and power-player tournaments are excluded — they use different rules or contestant populations. A player who only appeared in excluded tournaments won't have a rating.

What is a Coryat score?

The Coryat score measures a player's raw knowledge independent of wagering. It counts the value of every clue a player answers correctly (and subtracts incorrect responses) while ignoring Daily Double wagers and Final Jeopardy. This isolates fundamental buzzer-and-knowledge ability from strategic wagering decisions.
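A minimal sketch of the calculation follows. It assumes the usual fan convention for Daily Doubles (a correct response earns the clue's face value, and an incorrect one costs nothing), and the `Response` record is a hypothetical data shape rather than the site's actual schema. Final Jeopardy is excluded simply by not passing it in.

```python
from dataclasses import dataclass

@dataclass
class Response:
    clue_value: int             # face value of the clue on the board
    correct: bool
    daily_double: bool = False

def coryat_score(responses):
    """Sum face values of correct responses, minus misses on regular clues."""
    total = 0
    for r in responses:
        if r.correct:
            total += r.clue_value       # Daily Doubles count at face value
        elif not r.daily_double:
            total -= r.clue_value       # missed Daily Doubles cost nothing (assumed)
    return total

game = [
    Response(400, True),
    Response(800, False),
    Response(1200, True, daily_double=True),
    Response(1600, False, daily_double=True),
]
print(coryat_score(game))  # 400 - 800 + 1200 + 0 = 800
```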

BRING IT Forecaster

Ken vs. James vs. Matt? Your favorite champion against your college roommate? Settle the debate. Try it →

How does the forecaster work?

The forecaster takes the BUTTREY ratings of three players and returns each player's probability of winning that matchup. The rating model is calibrated against tournament matchups, where the strongest players actually face each other — the best available ground truth for how top players perform head-to-head.
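This FAQ does not spell out how three ratings become three probabilities. One common way to do it is a softmax over the ratings (the Plackett-Luce form), sketched below purely as an assumption. The `scale` parameter is a hypothetical calibration knob, and the output will not reproduce the site's numbers.

```python
import math

def matchup_probabilities(ratings, scale=1.0):
    """Softmax over three ratings; `scale` is a hypothetical calibration knob."""
    if len(ratings) != 3:
        raise ValueError("expected exactly three players")
    top = max(ratings.values())
    # Subtract the max before exponentiating for numerical stability.
    weights = {name: math.exp(scale * (r - top)) for name, r in ratings.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical ratings for three players.
print(matchup_probabilities({"Player A": 1.55, "Player B": 0.50, "Player C": 0.00}))
```

With equal ratings, each player gets one third, which matches the random-chance baseline for a three-player game.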

How accurate is it?

The model correctly predicts the winner about 46% of the time on held-out test data. In a 3-player game, random chance would be 33%, so the model provides a meaningful edge. Jeopardy outcomes have inherent variance — even the strongest players lose regularly — so no model can achieve very high accuracy.

Can I share a matchup with someone?

Yes. After generating a prediction, click the "Copy Shareable Link" button below the results. This creates a URL with the three players encoded in it, so anyone who opens the link will see the same matchup and prediction.

Excitement Index

We watched 5,000+ games so you don't have to. Here are the ones that made us yell at the TV. View rankings →

What is the Excitement Index?

The Excitement Index is a 0–10 score for how exciting a Jeopardy game is to watch. It's a weighted average of 10 measurable game-content signals, each rating one aspect of the game's drama:

  • Round Tempo — how fast money moved during Double Jeopardy
  • Final Stakes — total board money entering Final Jeopardy
  • DD Wagering — how much Daily Doubles changed the game
  • FJ Cover Tightness — how close 2nd place was to catching the leader
  • Hot Start — money won in the opening 25% of the game
  • Buzzer Dominance — was the buzzer balanced or one-sided?
  • Stakes Context — tournament round or champion's streak boost
  • Comeback Depth — biggest deficit the winner had to overcome
  • FJ Swing — did Final Jeopardy actually flip the leader?
  • Run-of-Correct — longest streak of consecutive clues won by one player

The default weights aren't pulled from thin air. They're calibrated against actual community sentiment — see "How is this calibrated?" below.

How are the components weighted?

The default slider positions reflect optimized weights from the calibration process below, not equal weighting. The Custom Rankings page lets you override any slider 0–100 to build your own version.
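The weighted average itself is simple to sketch. The example assumes each component has already been scored on the same 0–10 scale as the final index, and the scores and slider values shown are made up for illustration; they are not the site's calibrated defaults.

```python
def excitement_index(component_scores, slider_weights):
    """Weighted average of component scores using slider weights from 0 to 100."""
    for name in component_scores:
        if not 0 <= slider_weights[name] <= 100:
            raise ValueError(f"slider for {name} must be between 0 and 100")
    total_weight = sum(slider_weights[name] for name in component_scores)
    if total_weight == 0:
        raise ValueError("at least one slider must be above 0")
    weighted_sum = sum(
        component_scores[name] * slider_weights[name] for name in component_scores
    )
    return weighted_sum / total_weight

# Two components only, with hypothetical scores and weights.
scores = {"Round Tempo": 8.0, "FJ Swing": 2.0}
weights = {"Round Tempo": 75, "FJ Swing": 25}
print(excitement_index(scores, weights))  # (8*75 + 2*25) / 100 = 6.5
```

Dividing by the total weight keeps the result on the 0–10 scale no matter how the sliders are set.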

How is the Excitement Index calibrated?

We test it against actual r/Jeopardy community reaction. Here's the methodology:

  1. We pull every available Reddit thread about each Jeopardy episode (~2,400 modern-era games with substantive discussion).
  2. For each game, we feed the thread + top 50 comments to an AI (Claude Sonnet 4.6) along with a strict rubric:
    • 10 = "Multiple GOTY mentions, immediate consensus this was historic"
    • 7-8 = "Strong positive sentiment, real engagement"
    • 5-6 = "Moderate engagement, mixed reception"
    • 3-4 = "Subdued thread, mostly procedural"
    • 1-2 = "Dominated by 'boring'/'snoozer'/'blowout' complaints"
  3. The rubric explicitly debiases against star power: a close-fought game between unknowns can score 9-10, while a blowout featuring a famous player can score 3-4. The signal we want is sentiment density and content, not comment volume.
  4. We then fit our 10-component formula's slider weights to maximize correlation with that AI-graded community sentiment. The optimizer searches over plausible weightings until it finds the one that best reproduces how real Jeopardy fans actually reacted to the games.
  5. We hold out a chunk of games during training to verify the formula generalizes. On held-out games (games the optimizer never saw), our correlation with community sentiment is 0.61 (Spearman ρ). Squaring that gives roughly 0.37, so about 37% of the variance in how the community ranks games is captured by the 10 game-content primitives.

This isn't perfect — community reactions still have some star bias the rubric can't fully strip out — but it's a much more rigorous foundation than "we picked weights that felt right."
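For readers who want to see what the held-out check measures, here is a small pure-Python sketch of Spearman's ρ: rank both lists, then take the ordinary Pearson correlation of the ranks. The sample numbers are invented, and the function names are illustrative, not taken from the site's pipeline.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

formula_scores = [1, 2, 3, 4, 5]      # hypothetical Excitement Index values
community_scores = [2, 1, 4, 3, 5]    # hypothetical graded sentiment
rho = spearman_rho(formula_scores, community_scores)
print(rho, rho ** 2)
```

Because ρ is computed on ranks, its square describes how much of the variation in the ranking of games is shared, which is the sense in which 0.61 corresponds to roughly 37%.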

Why these specific 10 primitives?

The 10 weren't chosen up-front. We started with a smaller formula (6 primitives, equal weights, calibrated against a simpler keyword counter on Reddit comments) and iteratively asked: where does the formula disagree with what the community actually thinks? Then we tested whether adding (or dropping) a primitive would close that gap.

Twelve iterations of this loop produced v14 — 5 new primitives that demonstrably help (FJ Cover Tightness, Hot Start, Buzzer Dominance, Comeback Depth, Run-of-Correct), and 2 that turned out to be redundant under the cleaner calibration target and were dropped.

The 11th and 12th primitives the optimizer tried? It gave them weight ≈0 — meaning the existing 10 already capture the available signal. That convergence is part of why we're confident 10 is the right cap.

General

How often is the data updated?

All data — win probability charts, Daily Double analysis, player ratings, and excitement scores — is updated automatically overnight after new games air. There is no manual intervention required.

Who built this site?

This site is a project of Colin Davy, a data scientist in Chicago and Jeopardy champion. You can read more on the About page.

Is there a newsletter?

Yes — The Nerdy Jeopardy Newsletter. Deep dives on Jeopardy theory and the analytics behind this site, written by a former Jeopardy contestant. Occasional issues, no spam. Subscribe on the Articles page or directly on Substack.

Where can I read articles?

Board Control articles are published on Substack first, then automatically mirrored to the Articles page here within 24 hours. Read either place — same full content.