Trench Monthly · The Iran Tape

What Trench did

Between April 21 and May 12, four paper-trading variants of Trench (baseline, high-conviction, wide-net, and the new TrenchV2) each ran the bot's full intelligence pipeline every ten minutes and made their own decisions on the same signals. Same news, same Claude analysis, different gates.

Closed trades: 46 (43 decisive + 3 scratch)

Wins: 15

Losses: 28

Win rate: 35% (15 of 43 decisive)

Across the four variants combined: +$285 in wins, −$218 in losses, net ~$67 on $4,000 of paper bankroll. That's barely above zero. We're not pretending otherwise.
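For readers who want to check the headline arithmetic, here is a minimal sketch using only the figures quoted in this post. The variable names are ours, chosen for illustration.

```python
# Headline figures quoted in this post.
wins, losses, scratches = 15, 28, 3
gross_wins, gross_losses = 285.0, 218.0   # dollars
bankroll = 4_000.0                        # paper bankroll across the four variants

decisive = wins + losses                  # scratch trades are excluded from win rate
win_rate = wins / decisive
net_pnl = gross_wins - gross_losses
return_on_bankroll = net_pnl / bankroll

print(f"closed trades: {decisive + scratches}")
print(f"win rate: {win_rate:.1%}")
print(f"net P&L: ${net_pnl:.0f} ({return_on_bankroll:.2%} of bankroll)")
```

The net comes to well under 2% of bankroll, which is what "barely above zero" means here.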

The interesting part isn't the P&L. The interesting part is which kinds of trades worked, which kinds didn't, and what that tells us about how to size and time the next 28.

The single biggest pattern: the bot was right on the Iran deal

By a wide margin, the most-traded thesis this period was NO on a US-Iran nuclear deal within the year. Three of the four variants concentrated heavily here. As of today, the bot's read on Iran escalation is 0.45 on a 0-1 scale (45%); the volume-weighted Manifold consensus is 0.79 (79%). That is a 34-percentage-point gap.

"Iran is executing a successful low-cost Hormuz sovereignty play that raises oil prices without triggering military response, while nuclear talks drift past near-term deadlines. The scenario is managed tension, not escalation." — Trench signal 466201ee, 2026-05-12 12:25 UTC

The crowd thinks something dramatic happens. Trench thinks nothing dramatic happens. One of us is wrong. We'll know which over the next 90 days as the June, July, August, and September contracts settle. The bot's per-contract probabilities are all auditable at /audit?trade_id= for any individual trade in the tape.

What went wrong, specifically

Of the 28 closed losses, our lesson classifier (lexical patterns + Haiku for the long tail) labels them like this:

Thesis invalidated (19 losses, −$151): the market moved against the bot after a new headline contradicted the entry thesis. The bot recognized this and exited, often cleanly. These are the cheapest losses to take — small, fast, well-reasoned exits.
Bracket overshoot (4 losses, −$58): the stop-loss fired on routine volatility, not on a thesis change. These are the worst losses: the bot exited a position that was probably going to recover. They cluster on the tightest-bracket variant (Baseline, SL=10%).
Mispriced entry (4 losses, −$7): the bot admitted in the exit reasoning that it shouldn't have entered. Lottery-ticket markets, expiry-too-near positions. The amount is small because these were caught fast.
Wrong side of consensus (1 loss, −$2): whale flow pointed one way, Trench went the other, the whales were right. We weight whale signals as one input among many; this loss says we should weight them slightly higher.
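The breakdown above can be tallied in a few lines. This is a sketch over the four bucket totals quoted in this post, not the classifier itself; the `LossBucket` type is a hypothetical name introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossBucket:
    label: str
    count: int
    pnl: float  # dollars, negative for losses


buckets = [
    LossBucket("Thesis invalidated", 19, -151.0),
    LossBucket("Bracket overshoot", 4, -58.0),
    LossBucket("Mispriced entry", 4, -7.0),
    LossBucket("Wrong side of consensus", 1, -2.0),
]

total_count = sum(b.count for b in buckets)
total_pnl = sum(b.pnl for b in buckets)

# Sort by average loss per trade, most expensive first.
for b in sorted(buckets, key=lambda b: b.pnl / b.count):
    avg = b.pnl / b.count
    share = b.pnl / total_pnl
    print(f"{b.label:<26}{b.count:>3} losses  avg {avg:7.2f}  share {share:5.1%}")

print(f"total: {total_count} losses, {total_pnl:.0f} dollars")
```

On these numbers the average bracket-overshoot loss is $14.50, against about $7.95 for a thesis-invalidated loss. That is the sense in which overshoots are the worst losses per trade, even though they account for a smaller share of total loss dollars.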

Three example loss cards from the tape, all linked to their full audit and chain anchor:

Thesis invalidated · −$10.53

Will the US agree to a new Iranian nuclear deal before August? (Aug-26)

"This longer-dated deal market (August) has moved +0.045 against the NO position, and the recent signal explicitly notes 'my prior was 32%, market is now 58% on full-year, I'm nudging up' — indicating a material revision upward on deal probability from the original thesis author."

High Conviction variant · NO at 0.36 → exit 0.41 · 23.5h held · 2026-05-09

Thesis invalidated · −$2.43

Will the US agree to a new Iranian nuclear deal this year? (Jan-29)

"Position is underwater (-2.43) after only 2.2h with YES price moving against the original thesis."

Baseline variant · NO at 0.80 → exit 0.81 · 2.2h held · 2026-05-09

Thesis invalidated · −$31.64

Will the US agree to a new Iranian nuclear deal before June?

"Original thesis was NO on near-term deal (June), but latest signal shows deal momentum building faster than priced. Position is -53.7% underwater after 14.9 hours; thesis has materially deteriorated. Closing for stop-loss before further drift."

Baseline variant · NO at 0.27 → exit 0.41 · 14.9h held · 2026-05-07

Every loss in the tape is one click away on the Wall of Transparency, filterable by lesson type and variant.

What we changed because of this

Two structural changes shipped this period:

1. Backtest infrastructure with walk-forward validation

We built a replay engine that walks every historical signal forward under counterfactual parameter sets, then validates against a held-out time window. First run found a wide-bracket config that looked great on in-sample data (+42% ROI). Walk-forward killed it: zero of the top-10 in-sample winners produced any closed trades on the test fold. Open-at-end selection bias. The config we eventually deployed (TrenchV2) was picked by bootstrap-ranking under a close-rate constraint, not by raw ROI. Full write-up at /backtest.
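To make the failure mode concrete, here is a toy sketch of a walk-forward check. It is not the replay engine described above: `Signal`, `Config`, `replay`, and `walk_forward` are hypothetical names, the price paths are invented, and the fill logic (exit at the first price that crosses TP or SL, otherwise mark the open position to the last price) is an assumption. The point is only that a bracket wide enough to never close can top the in-sample ROI ranking and still fail a "must produce closed trades on the test fold" rule.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Signal:
    ts: int                   # entry time; units are arbitrary in this toy
    entry: float              # entry price
    path: tuple[float, ...]   # prices observed after entry, in order


@dataclass(frozen=True)
class Config:
    tp: float  # take-profit, as a fraction of entry price
    sl: float  # stop-loss, as a fraction of entry price


@dataclass(frozen=True)
class FoldResult:
    roi: float          # mean return per position, open ones marked to last price
    closed_trades: int  # positions that actually hit TP or SL


def replay(cfg: Config, signals: Sequence[Signal]) -> FoldResult:
    returns, closed = [], 0
    for s in signals:
        ret = s.path[-1] / s.entry - 1.0   # default: still open, mark to market
        for price in s.path:
            move = price / s.entry - 1.0
            if move >= cfg.tp:
                ret, closed = cfg.tp, closed + 1
                break
            if move <= -cfg.sl:
                ret, closed = -cfg.sl, closed + 1
                break
        returns.append(ret)
    roi = sum(returns) / len(returns) if returns else 0.0
    return FoldResult(roi, closed)


def walk_forward(configs, signals, split_ts, top_k=10, min_closed=1):
    train = [s for s in signals if s.ts < split_ts]
    test = [s for s in signals if s.ts >= split_ts]
    # Rank on in-sample ROI only, then validate the top k on the held-out fold.
    ranked = sorted(configs, key=lambda c: replay(c, train).roi, reverse=True)
    kept = []
    for cfg in ranked[:top_k]:
        out = replay(cfg, test)
        if out.closed_trades >= min_closed:   # reject configs that never close
            kept.append((cfg, out))
    return ranked, kept


signals = [
    Signal(ts=1, entry=0.40, path=(0.45, 0.50, 0.56)),
    Signal(ts=2, entry=0.50, path=(0.47, 0.56, 0.68)),
    Signal(ts=3, entry=0.30, path=(0.32, 0.34, 0.35)),  # held-out fold
    Signal(ts=4, entry=0.60, path=(0.58, 0.63, 0.68)),  # held-out fold
]
configs = [Config(tp=0.90, sl=0.90), Config(tp=0.10, sl=0.10)]
ranked, kept = walk_forward(configs, signals, split_ts=3)
print("best in-sample:", ranked[0])
print("survived walk-forward:", [cfg for cfg, _ in kept])
```

In this toy the wide bracket ranks first in-sample purely on unrealized gains, then closes nothing on the held-out fold and is dropped, while the tight bracket survives.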

2. TrenchV2 — first config picked from data, not intuition

Result: TP=30%, SL=30%, confidence=0.70, edge=0.03, size=$30. Symmetric brackets (the asymmetric 20%/10% bracket the live bot was using was responsible for most of the bracket-overshoot losses above). Looser confidence floor (the sharpest band of confidence in the data was 0.70-0.74, not the 0.74+ we'd been using). Bootstrap P(profitable) on this config: 57.3%. The same metric on the existing baseline: 0.2%. That is roughly a 290× lift in our self-assessed probability of turning a profit. Falsifiable in 4 weeks per tasks/trenchv2-hypothesis.md.
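This post does not spell out how P(profitable) is computed. One common reading, offered here as an assumption rather than as the method behind the figures above, is to resample the closed trades with replacement many times and report the share of resamples whose total P&L is positive. The function name and the example P&L values are hypothetical; the ±$9 figures are simply what a 30% bracket on a $30 position would pay or cost, ignoring fees.

```python
import random
from typing import Sequence


def bootstrap_p_profitable(
    trade_pnls: Sequence[float], n_resamples: int = 10_000, seed: int = 0
) -> float:
    """Share of bootstrap resamples whose total P&L is strictly positive."""
    if not trade_pnls:
        raise ValueError("need at least one closed trade")
    rng = random.Random(seed)   # seeded so reruns give the same estimate
    n = len(trade_pnls)
    profitable = 0
    for _ in range(n_resamples):
        resample = rng.choices(trade_pnls, k=n)   # draw n trades with replacement
        if sum(resample) > 0:
            profitable += 1
    return profitable / n_resamples


# Hypothetical per-trade P&L in dollars, not taken from the tape.
example_pnls = [9.0, -9.0, 9.0, -9.0, 9.0, 9.0, -9.0, 9.0]
p = bootstrap_p_profitable(example_pnls)
print(f"P(profitable) ~ {p:.1%}")
```

A metric like this is only as good as the trades fed into it, which is why a config with too few closed trades needs a close-rate constraint before its bootstrap number means anything.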

What's on the table for next month

Coming next

The June Iran contracts settle June 30. Five of Trench's current open positions resolve on that date. Brier scores light up at that point — the prediction-market call becomes an actual scored prediction, not a paper position. The Arena leaderboard populates for the first time.
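For reference, the Brier score is the mean squared difference between a forecast probability and the 0/1 outcome; lower is better. The sketch below is illustrative only: it scores the 0.45 and 0.79 reads quoted earlier against both possible resolutions of a single hypothetical contract, which is not the same as the per-contract probabilities the Arena will actually score.

```python
from typing import Sequence


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Illustration only: one hypothetical contract, both possible resolutions.
for outcome in (0, 1):
    trench = brier_score([0.45], [outcome])
    crowd = brier_score([0.79], [outcome])
    print(f"outcome={outcome}: Trench {trench:.4f}, crowd {crowd:.4f}")
```

If the event does not happen, 0.45 scores 0.2025 against 0.6241 for 0.79. If it does happen, the scores are 0.3025 and 0.0441. Neither side can claim the better score until the contract resolves.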

The analyzer/decider architecture split. Four variants currently run the entire intelligence pipeline independently. That's 4× the Anthropic spend for the same analysis. Phase 1 (publisher) and Phase 2 (observer) both shipped May 12. Phase 3 (consume mode on one variant) is queued.
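The idea behind the split can be sketched as "analyze once, decide per variant". Everything below is hypothetical: the `Analysis` and `Gate` types, the market ids, the canned analysis values, the definition of edge as estimated probability minus market price, and the baseline edge floor are all assumptions for illustration. Only the TrenchV2 thresholds (0.70 confidence, 0.03 edge) and the baseline 0.74 confidence floor come from this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Analysis:
    market_id: str
    confidence: float   # model confidence in the thesis, 0-1
    edge: float         # assumed: estimated probability minus market price


@dataclass(frozen=True)
class Gate:
    name: str
    min_confidence: float
    min_edge: float

    def accepts(self, a: Analysis) -> bool:
        return a.confidence >= self.min_confidence and a.edge >= self.min_edge


analysis_calls = 0


def analyze(market_id: str) -> Analysis:
    """Stand-in for the expensive model call; it only counts invocations."""
    global analysis_calls
    analysis_calls += 1
    canned = {"iran-deal-aug": (0.72, 0.05), "iran-deal-jun": (0.80, 0.02)}
    confidence, edge = canned[market_id]
    return Analysis(market_id, confidence, edge)


def run_cycle(market_ids, gates):
    published = [analyze(m) for m in market_ids]   # analyzer runs once per market
    # Each variant applies its own gate to the same published analyses.
    return {g.name: [a.market_id for a in published if g.accepts(a)] for g in gates}


gates = [
    Gate("trenchv2", min_confidence=0.70, min_edge=0.03),  # values quoted in this post
    Gate("baseline", min_confidence=0.74, min_edge=0.03),  # edge floor is a guess
]
decisions = run_cycle(["iran-deal-aug", "iran-deal-jun"], gates)
print(decisions, "analysis calls:", analysis_calls)
```

With this shape the number of analysis calls scales with the number of markets, not with markets times variants.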

Ontology expansion. We added 9 entities and 32 aliases this period (now at 231 entities, 280+ aliases). The candidates panel at /graph still has ~12 frequently-seen but unresolved phrases. Next pass: seed the remaining real entities, prune the noise (news outlet names, anniversary day labels) into a stoplist.

Where to verify any of this

Every claim in this post traces to a public surface:

The 46 closed trades, win rate, P&L: /api/tournament + /v1/lessons/stats
The 34pp Trench-vs-crowd gap: /shadow-api/api/forecast-comparison
Every individual loss with reasoning: /lessons
Per-trade audit with inputs/processing/hash/result: /audit
Backtest methodology and live runs: /backtest
Hash chain pinning every signal's timestamp: /registry

If anything in this post doesn't match what those pages say, the pages are the source of truth. Tell us at hello@trenchsignals.io and we'll fix the post.

Trench Monthly is a recurring write-up of what the bot did in the last 30 days. Issue #2 lands the second week of June, after the first batch of June Iran contracts settles. To get it in your inbox, subscribe at trenchsignals.io.