Four paper variants currently run in production. Each has a pre-registered hypothesis and a numeric kill threshold. When a variant fails its threshold, it's marked failed publicly — the failed-hypothesis log is part of the receipts. New variants on the roadmap will test structurally different strategies, not just parameter sweeps.
The variants below differ from Baseline on something other than confidence/size — different models, different source mixes, different exit logic. Their decisions will be statistically less correlated with Baseline's, which makes their consensus signal more meaningful. Each lists its kill condition before it ships.
Most "trading bot leaderboards" only show winners. A variant that runs for 6 months and loses money quietly disappears or gets re-parameterized until the new history looks good. Trench's counter to that: every variant declares its kill threshold before it accumulates the data that would fail it. When the threshold is breached, the failed variant stays listed with its failure note. Cherry-picking is structurally prevented because the criteria are pre-registered and computable from /api/receipts.
Read more in /methodology; raw data feed
at /api/variants.