Research · pre-registered · re-runs weekly

Does the engine know anything the market doesn’t?

Only a handful of our markets have resolved, so Brier-vs-market skill is not yet measurable — saying otherwise would be marketing. But there is an honest interim test: every ~20 minutes the engine logs its probability next to the live market price. When the engine disagrees with the market by 10¢ or more, does the market subsequently move toward the engine? The methodology was frozen before the first run, the result publishes either way, and the study re-runs weekly as data accumulates.
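As a rough illustration of the interim test, the sketch below finds the first snapshot where the engine and the market disagree by 10¢ or more, then measures how far the market moved toward the engine by the 72-hour primary horizon. The `Snapshot` fields, the function name, and the small floating-point tolerance are assumptions made for this example, not the pipeline's actual code.

```python
from dataclasses import dataclass
from typing import List, Optional

THRESHOLD = 0.10   # 10 cents on a $1 contract (from the text)
HORIZON_H = 72.0   # primary horizon in hours (from the results heading)
EPS = 1e-9         # assumed tolerance so float rounding does not hide a 10c gap


@dataclass
class Snapshot:
    hours: float    # hypothetical field: hours since the first logged row
    engine: float   # engine probability in [0, 1]
    market: float   # live market price in dollars, [0, 1]


def signed_move_after_onset(snaps: List[Snapshot]) -> Optional[float]:
    """Market move toward the engine after the first >=10c divergence.

    Returns None if there is no divergence or no snapshot at or past the horizon.
    """
    for i, s in enumerate(snaps):
        gap = s.engine - s.market
        if abs(gap) >= THRESHOLD - EPS:
            # First later snapshot that is at least HORIZON_H hours after onset.
            later = next(
                (t for t in snaps[i + 1:] if t.hours - s.hours >= HORIZON_H),
                None,
            )
            if later is None:
                return None
            direction = 1.0 if gap > 0 else -1.0
            # Positive result = the market moved toward the engine's number.
            return direction * (later.market - s.market)
    return None
```

A market that starts at 0.50 with the engine at 0.65 and trades at 0.58 three days later yields a move of about +0.08; the same price path with the engine at 0.35 yields about -0.08.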

Current verdict
loading the latest run…

Primary result — 72-hour horizon

Mean signed market move over the 72 hours after a ≥10¢ divergence onset, measured in the direction of the engine. Positive = the market moved toward the engine’s number. The 95% CI is a ticker-clustered bootstrap; the permutation p-value comes from randomly flipping episode directions per market.
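For readers who want the two uncertainty measures made concrete, here is a minimal sketch under stated assumptions: whole tickers are resampled with replacement for the CI, and the permutation test flips the sign of all of a ticker's episodes together. The function names, the one-sided test, the resample counts, and the percentile method are illustrative choices, not the frozen spec.

```python
import random
from statistics import mean
from typing import Dict, List, Tuple

Moves = Dict[str, List[float]]  # ticker -> signed moves of its episodes


def clustered_bootstrap_ci(moves: Moves, n_boot: int = 2000,
                           seed: int = 0) -> Tuple[float, float]:
    """Percentile 95% CI for the mean move, resampling whole tickers."""
    rng = random.Random(seed)
    tickers = sorted(moves)
    stats = []
    for _ in range(n_boot):
        # Draw tickers with replacement; each draw brings all its episodes.
        sample = rng.choices(tickers, k=len(tickers))
        stats.append(mean([m for t in sample for m in moves[t]]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


def sign_flip_p(moves: Moves, n_perm: int = 2000, seed: int = 0) -> float:
    """One-sided p-value (assumed): how often a sign-flipped mean >= observed."""
    rng = random.Random(seed)
    tickers = sorted(moves)
    observed = mean([m for t in tickers for m in moves[t]])
    hits = 0
    for _ in range(n_perm):
        flipped = []
        for t in tickers:
            sign = rng.choice((-1.0, 1.0))  # one coin flip per market
            flipped.extend(sign * m for m in moves[t])
        if mean(flipped) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Resampling by ticker rather than by episode keeps correlated episodes from the same market together, which is the reason for clustering in the first place.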

horizon | episodes | markets | mean move | 95% CI | perm-p | share > 0

Pre-specified secondary cuts

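Listed before the first run; reported uncorrected; never promoted to a claim. With ~11 cuts, a nominal p<0.05 somewhere is unsurprising by chance alone: about 0.55 false positives are expected under the null, and the chance of at least one would be roughly 43% if the cuts were independent.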

cut | n | mean move | 95% CI | perm-p

Spec v2 — the literature-grade upgrade (frozen 2026-06-10, runs alongside v1)

After a methodology review of the event-study literature, v2 adds four things: matched-control abnormal convergence (subtracting the drift of comparable non-episode snapshots, the standard control for deadline grind and favorite-longshot effects), a persistence rule at onset, a 14-day refractory period, and a mechanical strawman, which is the same pipeline run with a Rothschild-debiased lagged market price as the “forecaster.” A claim requires beating both zero and the strawman. v2 carries a hard power gate: under 150 episodes it reports descriptive statistics only, and its thresholds cannot be loosened post hoc.
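Two pieces of v2 are simple enough to sketch: the abnormal move (raw move minus the matched control's drift) and the power gate. The function names and status strings below are hypothetical, and the control matching itself is assumed to have been done upstream. The inferential test against zero and the strawman is left out because its thresholds are not stated in this section.

```python
from typing import List

MIN_EPISODES = 150  # power gate from the text


def abnormal_moves(episode_moves: List[float],
                   control_drifts: List[float]) -> List[float]:
    """Subtract each episode's matched-control drift from its raw signed move.

    Assumes control_drifts[i] is the drift of the non-episode snapshots
    matched to episode i.
    """
    if len(episode_moves) != len(control_drifts):
        raise ValueError("each episode needs exactly one matched control drift")
    return [m - c for m, c in zip(episode_moves, control_drifts)]


def v2_status(n_episodes: int) -> str:
    """Apply the hard power gate: below 150 episodes, no inferential claim."""
    if n_episodes < MIN_EPISODES:
        return "descriptive only"
    return "eligible for inferential test"
```

Keeping the gate as a constant checked in one place makes it harder to loosen quietly after the results are in.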

Technique backtest — would the literature’s fixes help? (resolution-free)

Candidate forecast transforms scored against the debiased market price 7 days later (a proxy scoring rule from the forecasting literature, usable while resolutions are scarce). Caveat by construction: market-anchored candidates are mechanically favored by a market-derived proxy — those rows are shown for transparency, not as findings. The clean test is hazard-decay vs holding a stale forecast (both equally market-blind); it activates automatically once deadline-stamped rows age past 7 days (~2026-06-14).
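One plausible reading of the proxy-Brier score is the mean squared distance between a candidate's forecast and the debiased market price 7 days later. The sketch below assumes that reading and assumes the debiased prices are already computed; the function name is illustrative.

```python
from statistics import mean
from typing import Sequence


def proxy_brier(forecasts: Sequence[float],
                future_prices: Sequence[float]) -> float:
    """Mean squared gap between forecasts and the later debiased market price.

    Lower = closer to the future market. This is a proxy: the target is a
    market-derived probability, not a resolved 0/1 outcome.
    """
    if len(forecasts) != len(future_prices):
        raise ValueError("forecasts and future prices must align row by row")
    if not forecasts:
        raise ValueError("need at least one row")
    return mean([(f - q) ** 2 for f, q in zip(forecasts, future_prices)])
```

Because the target is itself a market price, any candidate that blends in the current market price starts closer to it by construction, which is the caveat the paragraph above describes.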

candidate | mean proxy-Brier (lower = closer to the future market)

How to read this honestly

Frozen methodology (v1)