How We Fit the CFB JND Curve
A 19th-century framework for telling things apart
In the mid-19th century, the German pioneers of psychophysics Ernst Weber and Gustav Fechner asked a quiet, foundational question: how different must two stimuli be before a person can reliably tell them apart? Lift two weights with your eyes closed; if the difference between them is small enough, you can't say which is heavier. Their answer became the just-noticeable difference, or JND: a threshold below which the nervous system, under controlled conditions, fails to discriminate.
Fast-forward to college football. Every Saturday afternoon, the AP poll asks the same question of two teams: which one is better? If the ranking system can reliably tell them apart, the higher-ranked team should win meaningfully more than half the time. If it can't, the games at small rank-deltas should look like coin flips. That's an empirical question with thirty years of data behind it.
The piece in Issue #31 found the answer: the AP poll's JND is a rank delta of about 13. Below that, the ranking is doing only slightly more than guessing. Above that, it's doing real work.
The JND threshold for “works well” isn't arbitrary. It comes from signal detection theory: a discriminability of d′ = 1, which corresponds to roughly a 75% correct rate in a two-alternative forced-choice task. That's the classical bar for “just noticeable.” Below 75%, the human (or the poll) is responding more to noise than to signal. Above it, real information is being transmitted.
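As a rough check on that correspondence, the sketch below evaluates the textbook equal-variance Gaussian signal-detection model, in which the proportion correct in a two-alternative forced-choice task is Φ(d′/√2). The function names are illustrative and are not part of the article's scripts.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def pc_2afc(d_prime: float) -> float:
    """Proportion correct in 2AFC under the equal-variance Gaussian model."""
    return STANDARD_NORMAL.cdf(d_prime / math.sqrt(2))


def d_prime_for_pc(pc: float) -> float:
    """Inverse of pc_2afc: the d' that yields a given proportion correct."""
    return math.sqrt(2) * STANDARD_NORMAL.inv_cdf(pc)


print(f"d' = 1 gives {pc_2afc(1.0):.3f} correct")
print(f"75% correct needs d' = {d_prime_for_pc(0.75):.3f}")
```

Under this model d′ = 1 lands at about 76% correct, and exactly 75% corresponds to a d′ slightly below 1, so the 75% criterion is a convenient round-number convention and not an exact identity.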
Logistic regression on 1,597 ranked matchups
For every FBS game from 1995 through 2024 in which both teams appeared in that week's AP Top 25, we recorded two numbers per game: the rank delta Δ (the absolute difference between the two teams' AP ranks) and a binary outcome (1 if the higher-ranked team won, 0 if it lost). That gave us 1,597 rows. The natural model for a binary outcome that depends smoothly on a continuous predictor is logistic regression:
P(higher-ranked team wins | Δ) = 1 / (1 + exp(−(a + b·Δ)))
Two parameters: a is the intercept (baseline log-odds at Δ=0, where teams would be tied in rank), and b is the slope (how much the log-odds of a higher-ranked win rise with each additional rank of separation). Fitting this to the 1,597 games by maximum likelihood, using gradient descent on the negative log-likelihood, yields:
a ≈ 0.26, b ≈ 0.063
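For readers who want the objective spelled out, the fit maximizes the Bernoulli log-likelihood of the observed outcomes, and its gradient takes a simple residual form. The notation here (p_i, y_i, Δ_i, ℓ, and the step size η) is ours, chosen to mirror the update rule in the script's fitting function.

```latex
\begin{aligned}
p_i &= \frac{1}{1 + e^{-(a + b\,\Delta_i)}} \\
\ell(a, b) &= \sum_{i=1}^{n} \bigl[\, y_i \ln p_i + (1 - y_i) \ln (1 - p_i) \,\bigr] \\
\frac{\partial \ell}{\partial a} &= \sum_{i=1}^{n} (y_i - p_i), \qquad
\frac{\partial \ell}{\partial b} = \sum_{i=1}^{n} (y_i - p_i)\,\Delta_i \\
a &\leftarrow a + \frac{\eta}{n}\,\frac{\partial \ell}{\partial a}, \qquad
b \leftarrow b + \frac{\eta}{n}\,\frac{\partial \ell}{\partial b}
\end{aligned}
```

Each update nudges the parameters in the direction that reduces the average gap between observed outcomes and predicted probabilities.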
To find the JND, we solve for the Δ at which the predicted win probability equals 0.75 (the d′=1 threshold). The algebra gives a clean closed form:
Δ_JND = (ln(0.75 / 0.25) − a) / b = (ln 3 − a) / b ≈ 13.2
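The steps behind that closed form are short enough to write out. This is a sketch of the algebra only, with Δ_JND as our label for the solution.

```latex
\begin{aligned}
\frac{1}{1 + e^{-(a + b\,\Delta)}} &= 0.75 \\
e^{-(a + b\,\Delta)} &= \frac{1}{0.75} - 1 = \frac{1}{3} \\
a + b\,\Delta &= \ln 3 \\
\Delta_{\mathrm{JND}} &= \frac{\ln 3 - a}{b}
\end{aligned}
```

Plugging in the rounded parameters quoted in the script's closing comment (a = 0.26, b = 0.063) gives about 13.3. The small gap from the reported 13.2 is presumably down to rounding in the published parameters, though that is our assumption.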
For comparison, an ideal psychometric function, one in which the ranking system perfectly captured team quality, would have a much steeper slope. If the JND lived at Δ=4 (taking the intercept as zero), the slope would be b = ln(3)/4 ≈ 0.275, a little over four times the empirical fit. Put the other way, the empirical b is about 23% of the theoretical ideal. That ratio is the magnitude of the gap between “what rankings ought to do” and “what rankings actually do.”
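The slope comparison is easy to reproduce. The sketch below assumes a zero intercept for the ideal curve and uses the rounded empirical slope of 0.063 quoted in the script's closing comment; the function name is ours.

```python
import math

B_EMPIRICAL = 0.063  # rounded slope quoted in the article


def slope_for_jnd(jnd: float, a: float = 0.0, target: float = 0.75) -> float:
    """Slope b that places the `target` win probability at rank delta `jnd`."""
    return (math.log(target / (1 - target)) - a) / jnd


b_ideal = slope_for_jnd(4)
print(f"ideal slope:       {b_ideal:.3f}")
print(f"ideal / empirical: {b_ideal / B_EMPIRICAL:.2f}")
print(f"empirical / ideal: {B_EMPIRICAL / b_ideal:.0%}")
```

Changing the hypothetical JND passed to `slope_for_jnd` shows how quickly the required slope falls as the threshold moves outward.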
Worked example: #2 Georgia at #4 Alabama, September 28, 2024
Two teams ranked two slots apart in the AP poll. Δ = 2.
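To make the arithmetic concrete, the sketch below plugs Δ = 2 into the fitted curve using the rounded parameters quoted in the script's closing comment (a = 0.26, b = 0.063). The function name is ours, and results from the unrounded fit may differ in the third decimal.

```python
import math

A, B = 0.26, 0.063  # rounded fit parameters quoted in the article


def win_probability(delta: float, a: float = A, b: float = B) -> float:
    """Fitted probability that the higher-ranked team wins at a given rank delta."""
    return 1.0 / (1.0 + math.exp(-(a + b * delta)))


p = win_probability(2)
print(f"P(higher-ranked team wins | delta = 2) = {p:.3f}")
```

With these inputs the log-odds are 0.26 + 0.063 × 2 = 0.386, which gives the higher-ranked team (Georgia, at #2) a win probability of roughly 60%.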
The key insight: the fitted curve is so shallow that at Δ=2 it gives the higher-ranked team only about a 60% chance, barely better than a coin flip. That's not a failure of the poll; it's the poll honestly reporting low confidence. The piece's JND framing makes that low confidence quantitative.
Twenty lines that fit the curve, any sport, any era
Two scripts produce the analysis. The first builds the dataset from the College Football Data API; the second fits the logistic regression and emits the SVG paths used in the article's figures. The methodological core of the second script — the gradient-descent fit and the JND formula — is below:
```python
import math


def fit_logistic(xs, ys, lr=0.003, iters=5000):
    """Maximum-likelihood logistic regression: P = 1/(1+exp(-(a + b*x)))"""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iters):
        ga = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            ga += (y - p)
            gb += (y - p) * x
        a += lr * ga / n
        b += lr * gb / n
    return a, b


def jnd_from_params(a, b, target=0.75):
    """Solve P=target for the rank delta at which we cross the JND threshold."""
    return (math.log(target / (1 - target)) - a) / b


# Usage: xs is a list of rank deltas (one per game), ys is a list of 0/1
# for whether the higher-ranked team won. After the fit:
#   a, b = fit_logistic(xs, ys)
#   print(f"JND at delta = {jnd_from_params(a, b):.1f}")
# For our 1,597-game dataset: a=0.26, b=0.063, JND=13.2
```
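If you want an independent cross-check on the gradient fit, one option is to solve the same likelihood equations with Newton-Raphson, which typically converges in a handful of iterations for a two-parameter model. This is a swapped-in technique, not the article's method; `fit_logistic_newton` is a hypothetical helper, and the sanity check feeds noise-free probabilities in place of 0/1 outcomes purely so the expected answer is known.

```python
import math


def fit_logistic_newton(xs, ys, iters=25, tol=1e-10):
    """Fit P = 1/(1+exp(-(a + b*x))) by Newton-Raphson on the log-likelihood."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = 0.0          # gradient: sum(y - p), sum((y - p) * x)
        h00 = h01 = h11 = 0.0  # negative Hessian, weighted by p * (1 - p)
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Solve the 2x2 system H * step = g by Cramer's rule.
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) < tol and abs(db) < tol:
            break
    return a, b


# Sanity check: with the true probabilities as fractional outcomes, the
# likelihood equations are satisfied exactly at the generating parameters.
true_a, true_b = 0.26, 0.063
xs = list(range(1, 25))  # rank deltas 1..24
ys = [1.0 / (1.0 + math.exp(-(true_a + true_b * x))) for x in xs]
a, b = fit_logistic_newton(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}")
```

On real 0/1 outcomes the two fitters should agree closely once the gradient version has converged; a visible disagreement would suggest the learning rate or iteration count needs adjusting.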
To reproduce Issue #31's numbers: get a free API key from collegefootballdata.com, set CFBD_KEY in your environment, then run:
```shell
CFBD_KEY=<your_key> python3 scripts/build_cfb_jnd_dataset.py
python3 scripts/fit_cfb_jnd_curves.py
```
The first script fetches games and AP rankings for 30 seasons, joins them on (year, week, team), filters to ranked-vs-ranked matchups, and writes the dataset to scripts/data/cfb-ranked-matchups.csv. The second loads the CSV, fits the logistic, and prints the JND plus a per-delta breakdown.
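The join-and-filter step can be sketched in a few lines of plain Python. Everything below is illustrative: the field names, the `build_matchups` helper, and the toy records are hypothetical and do not reflect the CFBD schema or the actual script. How the real script treats tied scores or tied ranks isn't stated, so this sketch simply skips them.

```python
def build_matchups(games, rankings):
    """Return (delta, higher_ranked_won) pairs for ranked-vs-ranked games.

    games:    dicts with year, week, home, away, home_points, away_points
    rankings: dict mapping (year, week, team) -> AP rank
    """
    rows = []
    for g in games:
        home_rank = rankings.get((g["year"], g["week"], g["home"]))
        away_rank = rankings.get((g["year"], g["week"], g["away"]))
        if home_rank is None or away_rank is None:
            continue  # at least one team was unranked that week
        if home_rank == away_rank or g["home_points"] == g["away_points"]:
            continue  # assumption: skip tied ranks and tied scores
        home_is_higher = home_rank < away_rank  # smaller number = better rank
        home_won = g["home_points"] > g["away_points"]
        delta = abs(home_rank - away_rank)
        rows.append((delta, int(home_is_higher == home_won)))
    return rows


# Toy data with placeholder team names.
toy_rankings = {
    (2024, 5, "Team A"): 2,
    (2024, 5, "Team B"): 4,
    (2024, 5, "Team C"): 10,
}
toy_games = [
    {"year": 2024, "week": 5, "home": "Team B", "away": "Team A",
     "home_points": 30, "away_points": 27},
    {"year": 2024, "week": 5, "home": "Team C", "away": "Team D",
     "home_points": 21, "away_points": 14},
]
print(build_matchups(toy_games, toy_rankings))  # [(2, 0)]
```

The first toy game is a ranked matchup that the lower-ranked team wins, so it yields a delta of 2 and an outcome of 0; the second is dropped because one team is unranked.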
To run it on a different ranking system: replace the CFBD API call with whatever data source has your ranking + outcome pairs. The fit_logistic function doesn't care what the predictor is; it could be the Vegas line in the NBA, an Elo gap in the NHL, a Pythagorean-expectation gap in soccer, anything. The JND framework generalizes to any question that pairs a numeric gap between two competitors with a binary outcome.
Caveats and known limitations
- Only AP poll rankings, not Coaches or CFP. The CFP era introduced its own committee ranking that may have different JND properties; we haven't fit it separately yet.
- No home-field adjustment. Home teams win ~58% of CFB games on average; that means our predictions are slightly miscalibrated when the lower-ranked team is at home.
- No Vegas line incorporated. The line is a far better predictor than the poll, and a JND analysis on point-spread outcomes would yield a tighter curve.
- Logistic regression assumes the log-odds of a win are linear in Δ, which forces the fitted curve to be monotonic. The data mostly agree, but the per-delta scatter shows a notable Δ=4 anomaly (63.5% across 137 games, lower than Δ=3 or Δ=5). The logistic fit smooths over that dip.
- FBS only. FCS, Division II, Division III matchups are excluded.
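One way to gauge how much weight the Δ=4 anomaly in the caveats deserves is to compare it with the sampling noise expected from a 137-game bucket. The sketch below uses the normal-approximation standard error of a proportion and the rounded fit parameters quoted in the script's closing comment; the helper names are ours, and the calculation assumes games are independent.

```python
import math

A, B = 0.26, 0.063              # rounded fit parameters quoted in the article
OBSERVED, N_GAMES = 0.635, 137  # the delta = 4 bucket from the caveats


def win_probability(delta, a=A, b=B):
    """Fitted probability that the higher-ranked team wins at a given rank delta."""
    return 1.0 / (1.0 + math.exp(-(a + b * delta)))


def binomial_se(p, n):
    """Standard error of a proportion p observed over n independent games."""
    return math.sqrt(p * (1.0 - p) / n)


se = binomial_se(OBSERVED, N_GAMES)
fitted = win_probability(4)
gap_in_se = (OBSERVED - fitted) / se

print(f"observed {OBSERVED:.3f} +/- {1.96 * se:.3f} (approx. 95% interval)")
print(f"fitted   {fitted:.3f}")
print(f"gap between observed and fitted: {gap_in_se:.2f} standard errors")
```

Under these assumptions the bucket's uncertainty is roughly ±8 percentage points, and the fitted value sits well inside that band, so the dip relative to Δ=3 and Δ=5 is hard to distinguish from noise on this evidence alone.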
Read the article that uses this analysis: The Just-Noticeable Difference in College Football Rankings Is 13 →