
Archive · NHL Goal Predictor

How the models actually did, day by day

The dashboard publishes tonight's predictions. This page publishes every previous night's predictions and what actually happened. The point is honesty: when a model claims "60% chance to score" tonight, the only way to know whether 60% means anything is to check it against thousands of past 60% claims and count how often they came true.

What this archive measures

Every game day at 06:00 UTC, the results pipeline pulls each game's box score from the NHL API, identifies who actually scored, and writes a JSON file to data/results/{date}.json. The file records every scorer that night plus, for each model on the site, that model's top-10 picks and a boolean indicating whether each pick scored.
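A results file along these lines could be consumed with a few lines of Python. The field names below ("scorers", "models", "picks", "scored") are assumptions for illustration, since the page does not spell out the exact schema:

```python
import json

# Hypothetical shape of data/results/{date}.json -- the field names
# are assumptions, not the site's documented format.
sample = {
    "date": "2026-05-02",
    "scorers": ["Player A", "Player B"],
    "models": {
        "meta": {"picks": [
            {"player": "Player A", "scored": True},
            {"player": "Player C", "scored": False},
        ]},
    },
}

def top10_hits(result: dict, model: str) -> tuple[int, int]:
    """Count how many of one model's picks scored on one night."""
    picks = result["models"][model]["picks"]
    return sum(p["scored"] for p in picks), len(picks)

hits, total = top10_hits(sample, "meta")
print(f"meta: {hits}/{total} picks scored")  # -> meta: 1/2 picks scored
```

In practice you would load the file with json.load rather than build the dict inline; the counting logic is the same either way.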

Two metrics matter most for understanding archive performance:

Hit rate. The fraction of a model's top-10 picks who actually scored that night.

Calibration. Whether the probabilities a model claims match how often those claims come true over many nights.
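Both metrics reduce to simple counting over archived picks. A minimal sketch, assuming each pick record carries a predicted probability and a scored flag (these field names are illustrative):

```python
def hit_rate(picks):
    """Fraction of picks that scored: hits / total picks."""
    return sum(p["scored"] for p in picks) / len(picks)

def calibration_gap(picks):
    """Mean predicted probability minus observed scoring rate.
    Near zero means the claimed probabilities match reality;
    a large positive gap means the model is overconfident."""
    mean_pred = sum(p["prob"] for p in picks) / len(picks)
    return mean_pred - hit_rate(picks)

picks = [
    {"prob": 0.60, "scored": True},
    {"prob": 0.60, "scored": False},
    {"prob": 0.55, "scored": True},
    {"prob": 0.45, "scored": False},
]
print(round(hit_rate(picks), 2))         # -> 0.5
print(round(calibration_gap(picks), 2))  # -> 0.05
```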

The table below loads the most recent ~15 days of archive data straight from the JSON files in this repository. The dates link to the raw result files; click through to see exactly which players each model picked and which of them scored.

Recent results

Date Games Scorers Meta top-10 xG top-10 Market top-10 Result file
Loading archive…

How to read a hit-rate number

A few things worth keeping in mind when scanning this archive:

One night is noise. Hockey is a high-variance scoring environment. A 22-shot game can produce zero goals; a 17-shot game can produce six. Any single date's hit rate is dominated by random scoring variance, not model quality. The archive is most useful when you read it as a moving window — averaged over the last 10–15 game days, the patterns become real.
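The moving-window reading above is just a weighted running average. One way to sketch it, assuming the archive gives you (hits, picks) pairs per game day:

```python
from collections import deque

def rolling_hit_rate(daily, window=15):
    """Smooth noisy nightly hit rates over a moving window of
    (hits, picks) pairs, weighting each night by its pick count."""
    buf = deque(maxlen=window)
    out = []
    for hits, picks in daily:
        buf.append((hits, picks))
        h = sum(x for x, _ in buf)
        n = sum(x for _, x in buf)
        out.append(h / n)
    return out

# Three wildly different nights smooth toward a stable rate.
daily = [(1, 10), (6, 10), (3, 10)]
print([round(r, 2) for r in rolling_hit_rate(daily, window=3)])
# -> [0.1, 0.35, 0.33]
```

Weighting by pick count (rather than averaging the nightly percentages) keeps a four-game slate from counting as much as a twelve-game one.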

Some nights are easier than others. A four-game NHL slate gives the models fewer chances to land top-10 picks across many matchups; a 12-game slate gives them more. Hit rates tend to cluster higher on big-slate nights for purely combinatorial reasons. The archive shows the games count alongside the hit rate so you can normalize for this.
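The combinatorial effect is easy to demonstrate with a toy simulation: the top 10 of a larger pool of candidates has a higher average scoring probability than the top 10 of a smaller pool, regardless of model quality. The probability distribution and roster size below are illustrative, not the site's models:

```python
import random

def mean_top10_prob(n_games, trials=500, seed=7):
    """Average scoring probability of the nightly top 10 when each
    game contributes 36 skaters with random scoring chances drawn
    from an arbitrary illustrative distribution."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        probs = [rng.betavariate(2, 6) for _ in range(36 * n_games)]
        total += sum(sorted(probs, reverse=True)[:10]) / 10
    return total / trials

small, big = mean_top10_prob(4), mean_top10_prob(12)
print(small < big)  # -> True: bigger slates yield stronger top-10 candidates
```

This is pure order statistics: the maximum of 432 draws beats the maximum of 144 draws on average, so big-slate hit rates run higher even for a model with no skill.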

Different models are designed for different conditions. The Market Odds model can only contribute on games where bookmakers have posted prices — typically more than half of nightly games but not all of them. The Lineup TOI model leans on recent ice time data, which is noisier early in a season than late. The Meta Ensemble inherits all of these dependencies. A blank or unusually low cell is often "the model couldn't run cleanly tonight" rather than "the model was wrong."

The honest target is consistent calibration, not maximal hit rate. A model that produces well-calibrated probabilities is more useful for decision-making than a model that occasionally lands huge nights but is overconfident on average. The validator's calibration drift alerts are a louder signal than any individual day's hit rate, which is why the daily commit log will sometimes show calibration tweaks (e.g. raising the hard top-probability ceiling) without any change to the underlying models.
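A calibration-drift check of the kind described above can be sketched by bucketing picks on predicted probability and comparing each bucket's claim against reality. This is a generic sketch, not the site's validator; the bucket edges and field names are assumptions:

```python
def bucket_calibration(picks, edges=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0)):
    """Group picks by predicted probability and report, per bucket,
    mean prediction minus observed scoring rate. A large gap in any
    bucket is the kind of drift a validator would alert on."""
    report = []
    for lo, hi in zip(edges, edges[1:]):
        bucket = [p for p in picks if lo <= p["prob"] < hi]
        if not bucket:
            continue
        pred = sum(p["prob"] for p in bucket) / len(bucket)
        obs = sum(p["scored"] for p in bucket) / len(bucket)
        report.append((lo, hi, round(pred - obs, 3)))
    return report

picks = [
    {"prob": 0.5, "scored": True},
    {"prob": 0.5, "scored": False},
    {"prob": 0.9, "scored": False},
]
print(bucket_calibration(picks))
# -> [(0.4, 0.6, 0.0), (0.8, 1.0, 0.9)]
```

Note how the 0.8-1.0 bucket is badly overconfident here; a hard top-probability ceiling of the sort the commit log mentions would clamp predictions before they land in that bucket at all.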

Where the data lives

All result and prediction files are public in the project repository; the nightly results are the data/results/{date}.json files described above.

For full context on how each model produces its top-10, see the methodology page. For definitions of "hit rate," "calibration," and the other terms used here, see the glossary.

Last updated 2026-05-03. The table above re-fetches every page load; if you see "Loading archive…" indefinitely, the JSON endpoints are temporarily unreachable and the static text on this page is what's still authoritative.