Trench Monthly · Issue #1 · April 21 – May 12

The Iran Tape: 28 closed losses, two patterns, one config change.

Three weeks of paper trading, four variants, 46 closed trades. The bot read Iranian state media in Farsi every ten minutes and came to a conclusion the Manifold crowd has so far disagreed with by 34 percentage points.

2026-05-12 · ~7 min read · 4 paper variants compared · ~$67 net P&L across $4,000 paper bankroll

What Trench did

Between April 21 and May 12, four paper-trading variants of Trench (baseline, high-conviction, wide-net, and the new TrenchV2) each ran the bot's full intelligence pipeline every ten minutes and made their own decisions on the same signals. Same news, same Claude analysis, different gates.

46 closed trades · 15 wins · 28 losses · 35% win rate (decisive)

Across the four variants combined: +$285 in wins, −$218 in losses, net ~$67 on $4,000 of paper bankroll. That's barely above zero. We're not pretending otherwise.
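The headline numbers are simple arithmetic. A minimal sketch, reading "decisive" as wins plus losses only (our assumption here; the remaining closed trades are counted as neither):

```python
# Figures taken from the summary above.
closed_trades = 46
wins, losses = 15, 28
gross_wins_usd, gross_losses_usd = 285.0, 218.0
bankroll_usd = 4_000.0

# Assumption: "decisive" excludes closed trades that were neither a win nor a loss.
decisive = wins + losses
non_decisive = closed_trades - decisive
win_rate = wins / decisive

net_usd = gross_wins_usd - gross_losses_usd
return_on_bankroll = net_usd / bankroll_usd

print(f"decisive trades: {decisive} ({non_decisive} non-decisive)")
print(f"win rate: {win_rate:.1%}")
print(f"net P&L: ${net_usd:.0f} ({return_on_bankroll:.2%} of bankroll)")
```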

The interesting part isn't the P&L. The interesting part is which kinds of trades worked, which kinds didn't, and what that tells us about how to size and time the next 28.

The single biggest pattern: the bot bet against an Iran deal

By a wide margin, the most-traded thesis this period was NO on a US-Iran nuclear deal within the year. Three of the four variants concentrated heavily here. As of today, the bot's read on Iran escalation is 0.45 on a 0-1 scale; the volume-weighted Manifold consensus is 0.79. A 34-point gap.

"Iran is executing a successful low-cost Hormuz sovereignty play that raises oil prices without triggering military response, while nuclear talks drift past near-term deadlines. The scenario is managed tension, not escalation." — Trench signal 466201ee, 2026-05-12 12:25 UTC

The crowd thinks something dramatic happens. Trench thinks nothing dramatic happens. One of us is wrong. We'll know which over the next 90 days as the June, July, August, and September contracts settle. The bot's per-contract probabilities are all auditable at /audit?trade_id= for any individual trade in the tape.

What went wrong, specifically

Of the 28 closed losses, our lesson classifier (lexical patterns + Haiku for the long tail) labels them like this:

Three example loss cards from the tape, all linked to their full audit and chain anchor:

Thesis invalidated · −$10.53
Will the US agree to a new Iranian nuclear deal before August? (Aug-26)
"This longer-dated deal market (August) has moved +0.045 against the NO position, and the recent signal explicitly notes 'my prior was 32%, market is now 58% on full-year, I'm nudging up' — indicating a material revision upward on deal probability from the original thesis author."
High Conviction variant · NO at 0.36 → exit 0.41 · 23.5h held · 2026-05-09

Thesis invalidated · −$2.43
Will the US agree to a new Iranian nuclear deal this year? (Jan-29)
"Position is underwater (-2.43) after only 2.2h with YES price moving against the original thesis."
Baseline variant · NO at 0.80 → exit 0.81 · 2.2h held · 2026-05-09

Thesis invalidated · −$31.64
Will the US agree to a new Iranian nuclear deal before June?
"Original thesis was NO on near-term deal (June), but latest signal shows deal momentum building faster than priced. Position is -53.7% underwater after 14.9 hours; thesis has materially deteriorated. Closing for stop-loss before further drift."
Baseline variant · NO at 0.27 → exit 0.41 · 14.9h held · 2026-05-07

Every loss in the tape is one click away on the Wall of Transparency, filterable by lesson type and variant.
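For readers curious how a "lexical patterns first, model for the long tail" classifier can be wired up, here is a minimal sketch. The regexes, label names, and the `fallback` callable are hypothetical stand-ins; the production pattern list and the Haiku prompt are not reproduced in this post.

```python
import re
from typing import Callable, Optional

# Hypothetical lexical rules: (label, compiled pattern). Not the production list.
LEXICAL_RULES = [
    ("thesis_invalidated",
     re.compile(r"thesis has materially deteriorated|against the (original thesis|NO position)", re.I)),
    ("bracket_overshoot",
     re.compile(r"overshot|gapped through", re.I)),
]

def classify_loss(exit_note: str, fallback: Optional[Callable[[str], str]] = None) -> str:
    """Return a lesson label: cheap regex rules first, optional model fallback for the rest."""
    for label, pattern in LEXICAL_RULES:
        if pattern.search(exit_note):
            return label
    if fallback is not None:
        # In a real pipeline this would be the model call for the long tail.
        return fallback(exit_note)
    return "unclassified"

note = "Position is underwater (-2.43) after only 2.2h with YES price moving against the original thesis."
print(classify_loss(note))
```

Running the cheap rules first means the model is only consulted for exit notes that no pattern recognizes.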

What we changed because of this

Two structural changes shipped this period:

1. Backtest infrastructure with walk-forward validation

We built a replay engine that walks every historical signal forward under counterfactual parameter sets, then validates against a held-out time window. First run found a wide-bracket config that looked great on in-sample data (+42% ROI). Walk-forward killed it: zero of the top-10 in-sample winners produced any closed trades on the test fold. Open-at-end selection bias. The config we eventually deployed (TrenchV2) was picked by bootstrap-ranking under a close-rate constraint, not by raw ROI. Full write-up at /backtest.
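The walk-forward check can be sketched roughly as follows. This is an illustration under assumptions, not the replay engine itself: `replay` is a hypothetical callable that runs one config over a time-ordered slice of signals, and `FoldResult`, `train_frac`, and `top_k` are names invented for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple

@dataclass(frozen=True)
class FoldResult:
    closed_pnls: Tuple[float, ...]  # P&L of trades that actually closed within the fold
    staked: float                   # total capital committed within the fold

    @property
    def roi(self) -> float:
        return sum(self.closed_pnls) / self.staked if self.staked else 0.0

def walk_forward(
    configs: Sequence[Any],
    signals: Sequence[Any],
    replay: Callable[[Any, Sequence[Any]], FoldResult],
    train_frac: float = 0.7,
    top_k: int = 10,
) -> List[Tuple[Any, float]]:
    """Rank configs in-sample, then keep only those that still close trades out-of-sample."""
    cut = int(len(signals) * train_frac)
    train, test = signals[:cut], signals[cut:]  # signals must already be in time order
    ranked = sorted(configs, key=lambda c: replay(c, train).roi, reverse=True)[:top_k]

    survivors = []
    for config in ranked:
        result = replay(config, test)
        if result.closed_pnls:  # zero closed trades on the test fold: reject the config
            survivors.append((config, result.roi))
    return survivors
```

The key design choice is that a config with no closed trades on the held-out fold is rejected outright, however good its in-sample ROI looked.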

2. TrenchV2 — first config picked from data, not intuition

Result: TP=30%, SL=30%, confidence=0.70, edge=0.03, size=$30. Symmetric brackets (the asymmetric 20%/10% bracket the live bot was using was responsible for most of the bracket-overshoot losses above). Looser confidence floor (the sharpest band of confidence in the data was 0.70-0.74, not the 0.74+ we'd been using). Bootstrap P(profitable) on this config: 57.3%. The same metric on the existing baseline: 0.2%. A 290× lift in our self-assessed probability of breaking even. Falsifiable in 4 weeks per tasks/trenchv2-hypothesis.md.
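One plain way to compute a bootstrap P(profitable) is to resample closed-trade P&Ls with replacement and count how often the resampled total is positive. The sketch below assumes that reading; the exact resampling scheme and the close-rate constraint used in the backtest are not spelled out here, and the function name and defaults are illustrative.

```python
import random
from typing import Sequence

def bootstrap_p_profitable(
    trade_pnls: Sequence[float],
    n_resamples: int = 10_000,
    seed: int = 0,
) -> float:
    """Fraction of bootstrap resamples (drawn with replacement) whose total P&L is positive."""
    if not trade_pnls:
        raise ValueError("need at least one closed trade")
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    n = len(trade_pnls)
    profitable = 0
    for _ in range(n_resamples):
        total = sum(rng.choices(trade_pnls, k=n))
        if total > 0:
            profitable += 1
    return profitable / n_resamples
```

A config with few closed trades gives a noisy estimate, which is one reason to pair the metric with a minimum close-rate requirement.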

What's on the table for next month

The June Iran contracts settle June 30. Five of Trench's current open positions resolve on that date. Brier scores light up at that point — the prediction-market call becomes an actual scored prediction, not a paper position. The Arena leaderboard populates for the first time.
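For reference, the Brier score is the mean squared difference between forecast probabilities and 0/1 outcomes, so lower is better. A minimal sketch follows; the 0.45 and 0.79 inputs reuse the bot and crowd reads quoted earlier purely as illustrative forecasts, not as the per-contract probabilities that will actually be scored.

```python
from typing import Sequence

def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and binary outcomes (lower is better)."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and the same length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Illustrative only: score a single event under both possible resolutions.
for outcome in (0, 1):
    bot = brier_score([0.45], [outcome])
    crowd = brier_score([0.79], [outcome])
    print(f"outcome={outcome}: bot={bot:.4f} crowd={crowd:.4f}")
```

If the event does not happen, a 0.45 forecast scores about 0.2025 against about 0.6241 for 0.79; if it does happen, the scores are about 0.3025 and 0.0441.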

The analyzer/decider architecture split. Four variants currently run the entire intelligence pipeline independently. That's 4× the Anthropic spend for the same analysis. Phase 1 (publisher) and Phase 2 (observer) both shipped May 12; Phase 3 (consume mode on one variant) is queued.
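The idea behind the split can be sketched as one published analysis fanned out to several cheap deciders. Everything named below (`Signal`, `Gate`, `fan_out`, the market id, and the baseline edge threshold) is hypothetical; only the 0.70 and 0.74 confidence floors and the 0.03 edge come from this post.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

@dataclass(frozen=True)
class Signal:
    market_id: str
    model_prob: float   # analyzer's probability for YES
    market_prob: float  # current market price for YES
    confidence: float   # analyzer's self-reported confidence

@dataclass(frozen=True)
class Gate:
    name: str
    min_confidence: float
    min_edge: float

    def decide(self, s: Signal) -> str:
        edge = s.model_prob - s.market_prob
        if s.confidence < self.min_confidence or abs(edge) < self.min_edge:
            return "PASS"
        return "YES" if edge > 0 else "NO"

def fan_out(signal: Signal, gates: Iterable[Gate]) -> Dict[str, str]:
    """One published analysis, one decision per variant; no variant re-runs the analyzer."""
    return {g.name: g.decide(signal) for g in gates}

# Illustrative numbers; the baseline edge threshold is an assumption.
signal = Signal("iran-deal-example", model_prob=0.45, market_prob=0.79, confidence=0.72)
gates = [Gate("trenchv2", 0.70, 0.03), Gate("baseline", 0.74, 0.03)]
decisions = fan_out(signal, gates)
print(decisions)
```

With this shape the expensive model call happens once per cycle, and each variant only pays for a threshold comparison.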

Ontology expansion. We added 9 entities and 32 aliases this period (now at 231 entities, 280+ aliases). The candidates panel at /graph still has ~12 frequently-seen but unresolved phrases. Next pass: seed the remaining real entities, prune the noise (news outlet names, anniversary day labels) into a stoplist.
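The "seed real entities, stoplist the noise" pass could look something like the sketch below. The alias map, stoplist entries, and `min_count` threshold are invented for the example and are not taken from the real ontology.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

def triage_phrases(
    phrases: Iterable[str],
    aliases: Dict[str, str],
    stoplist: Set[str],
    min_count: int = 3,
) -> Tuple[List[str], List[str]]:
    """Resolve known aliases, drop stoplisted noise, and surface frequent unresolved phrases."""
    resolved: List[str] = []
    unresolved: Counter = Counter()
    for phrase in phrases:
        key = phrase.strip().lower()
        if key in stoplist:
            continue  # noise such as outlet names never reaches the candidates panel
        if key in aliases:
            resolved.append(aliases[key])
        else:
            unresolved[key] += 1
    candidates = [p for p, n in unresolved.most_common() if n >= min_count]
    return resolved, candidates

# Hypothetical inputs for illustration.
aliases = {"irgc": "Islamic Revolutionary Guard Corps", "hormuz": "Strait of Hormuz"}
stoplist = {"reuters"}
seen = ["IRGC", "Reuters", "Hormuz", "snapback", "snapback", "snapback", "one-off phrase"]
resolved, candidates = triage_phrases(seen, aliases, stoplist)
print(resolved, candidates)
```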

Where to verify any of this

Every claim in this post traces to a public surface:

If anything in this post doesn't match what those pages say, the pages are the source of truth. Tell us at hello@trenchsignals.io and we'll fix the post.

Trench Monthly is a recurring write-up of what the bot did in the last 30 days. Issue #2 lands the second week of June, after the first batch of June Iran contracts settle. To get it in your inbox, subscribe at trenchsignals.io.