trenchsignals.io (beta) · 📄 PAPER TRADING · ● live strategy
Autonomous conflict intelligence
📄 Paper Trading Demo — simulated positions, real signal data
The live tape
Trench is paper-trading geopolitical prediction markets in public.
Every position below has a public entry timestamp, exit reason, and paper P&L computed by open-source code (trench-core).
Four strategy variants run in parallel against the same intel pipeline. Real-money trading is currently paused; see about or methodology.
Refreshes every 10 minutes.
Strategy tournament · paper variants
Same intelligence, different decision policies. Each variant runs in isolation;
P&L is compared head-to-head, and the current leader is highlighted.
AI vs Crowd · per-theater
Trench’s theater probability vs the volume-weighted Manifold Markets crowd consensus on the same near-term questions.
Positive Δ means Trench is more bullish on escalation than the crowd; negative Δ means more dovish.
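A minimal sketch of how the Δ above could be computed, assuming each crowd market contributes its probability weighted by its volume. The `CrowdMarket` type and the `crowd_consensus` / `theater_delta` helpers are hypothetical names for illustration, not trench-core's API, and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class CrowdMarket:
    probability: float  # crowd-implied probability of the event, 0..1
    volume: float       # trading volume, used as the weight


def crowd_consensus(markets):
    """Volume-weighted average probability across the crowd markets."""
    total_volume = sum(m.volume for m in markets)
    if total_volume <= 0:
        raise ValueError("need positive total volume to weight by")
    return sum(m.probability * m.volume for m in markets) / total_volume


def theater_delta(trench_prob, markets):
    """Positive: Trench is more bullish on escalation than the crowd."""
    return trench_prob - crowd_consensus(markets)


# Made-up numbers for illustration only.
markets = [CrowdMarket(0.30, 1000.0), CrowdMarket(0.50, 3000.0)]
delta = theater_delta(0.55, markets)
print(f"crowd={crowd_consensus(markets):.2f} delta={delta:+.2f}")
```

Weighting by volume means a thinly traded question cannot drag the consensus far, which is presumably why the comparison uses it rather than a plain average.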
Live whale flow · Polymarket
Every $5K+ trade on the markets we're tracking, captured via WebSocket.
Watching for the first $5K+ trade…
What every cycle did · last 24h
One log line per entry decision. Only the traded outcome means a position was placed.
Total P&L: - | Win Rate: - | Open Positions: - | Best / Worst: - | Avg Return: - | Last Signal: -
📊 Open Positions (0)
⏳ Exits triggered — awaiting Claude review
Market | Side | Entry | Now | P&L | Held
No open positions
🗂 Recent Trades (0)
Market | Side | Price | P&L | Hold | Status | Conf
No trade activity yet
🎯 Market Assessments
Claude vs market - latest signal
Market | Theater | Claude ¢ | Market ¢ (live) | Spread | Edge ↓ | Δ Cycle
No assessment data yet - waiting for next signal cycle
⚡ Live Assessment
Waiting for first signal…
📜 Signal History (0)
Time | Signal | Esc. Prob | Conf
No signals yet
Claude's escalation probability across all signal cycles. Dot size = confidence; color = direction (amber = ESCALATE, blue = DEESCALATE, grey = SKIP); ring = trade placed. Hover dots to inspect.
📈 Escalation Probability History
last 20 signal cycles — hover to inspect
⚡ Recent Whale Trades · Polymarket - large orders in last 24h
Tracked Markets
All geo markets currently assessed by Claude — live probability estimate vs market price, sorted by absolute edge. Updates with each signal cycle.
Waiting for signal data…
Market | Claude | Market | Edge ↓ | Signal | Claude History
Strategy Calibration
Probability accuracy, edge capture, P&L attribution, and threshold sensitivity
Cycle Outcomes (last 24h)
Every entry-decision tick exits via exactly one path. Where are cycles dying?
Only the traded outcome means a position was placed.
Tuning Recommendations · auto-triggered
Strategy changes the data has earned. Each recommendation specifies a fire condition (n≥30,
ROI gap, etc.) — they appear automatically as the data crosses the threshold. No manual revisit.
Strategy Decisions Log
Append-only record of changes driven by what the calibration data is telling us. Honest about
findings, including the ones that say "your strategy is leaking money here."
2026-05-05
Raised confidence threshold 0.72 → 0.74
Backtest showed the 0.76–0.80 bucket winning 75% with +42% ROI vs the 0.72–0.76 bucket bleeding −7%.
Sample is small (n=13 signal trades) so bumped to 0.74 rather than 0.76 — provisional change,
revisit at n=30+. Both real bot and shadow updated.
2026-05-05
Fixed orphan flow leak (−$2.86 across 7 trades)
Orphan assessor (runs 2 min after restart on positions with no thesis) was biased toward EXIT when uncertain —
Claude was being told "if there's no clear reason to hold, exit", which on a freshly-restored
position with no original context almost always meant dumping. Rewrote system prompt: default HOLD with a
fresh thesis, only EXIT if the position is >15% past stop-loss with no recovery, the market has resolved,
or price is at terminal value. Also now captures Claude's actual reasoning in exit_reason
instead of a bare orphan_assessment_exit label, so the diary can render real lessons.
2026-05-05
Honesty rails on every public surface
Loss cards in the diary, weekly digest, and tweet templates now lead with
"Why I was wrong" — Claude's own exit reasoning, plus a structural failure-mode label
(high-confidence miss / thin signal / late entry / thesis invalidated). Weekly digest puts
"What I got wrong this week" above "What worked." The point: anyone can claim wins; the moat is
owning specific, classifiable losses.
2026-05-05
Risk circuit-breaker: --max-session-loss
New flag halts new entries (existing position management continues) when cumulative session
P&L drops below a configurable loss. Latched until restart. Real bot: −$5 cap.
Shadow: −$100. Defensive control before any future move from paper to real-money sizing.
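A minimal sketch of the latching behavior described in this entry, assuming the cap is checked against cumulative realized session P&L. `SessionLossBreaker` and its methods are hypothetical names for illustration; only the `--max-session-loss` flag itself comes from the log entry.

```python
class SessionLossBreaker:
    """Latching breaker: once tripped it stays tripped until a new instance
    is created, which stands in for a bot restart."""

    def __init__(self, max_session_loss):
        # Positive dollar amount, e.g. 5.0 for a -$5 cap.
        self.max_session_loss = max_session_loss
        self.session_pnl = 0.0
        self.tripped = False

    def record_pnl(self, pnl):
        """Add P&L from a closed trade and trip if the session is below the cap."""
        self.session_pnl += pnl
        if self.session_pnl < -self.max_session_loss:
            self.tripped = True

    def allow_new_entry(self):
        """Only new entries are gated; position management is not affected."""
        return not self.tripped


breaker = SessionLossBreaker(max_session_loss=5.0)
breaker.record_pnl(-3.0)
print(breaker.allow_new_entry())  # session P&L is -3.0, still above the cap
breaker.record_pnl(-2.5)          # session P&L is now -5.5, below the cap
breaker.record_pnl(+10.0)         # a later recovery does not reset the latch
print(breaker.allow_new_entry())
```

Keeping the latch separate from the running P&L is what makes the halt survive a recovery, which matches "Latched until restart."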
2026-05-05
Cycle-outcome instrumentation
Every entry-decision tick now emits exactly one structured line: Cycle outcome:
<category>. Categories cover every early-return path
(too_few_new_items, source_diversity_gate, analyzer_timeout,
signal_skip, session_loss_cap, confidence_too_low,
cap_reached, tier_gate_75/80, no_candidates_above_min_edge,
traded). Tomorrow we'll have a real histogram of which gate is killing
the 83% of cycles that don't produce trades.
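A minimal sketch of how that histogram could be built from the structured lines, assuming each one contains the literal text `Cycle outcome: <category>` somewhere in the line. The timestamp prefix in the sample lines and the helper names are assumptions for illustration, not the bot's actual log format.

```python
import re
from collections import Counter

# Matches the structured marker and captures the category token after it.
OUTCOME_RE = re.compile(r"Cycle outcome:\s*(\w+)")


def outcome_histogram(lines):
    """Count cycle outcomes, ignoring lines without the marker."""
    counts = Counter()
    for line in lines:
        match = OUTCOME_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def untraded_share(counts):
    """Fraction of cycles that ended in anything other than 'traded'."""
    total = sum(counts.values())
    return 0.0 if total == 0 else 1 - counts["traded"] / total


# Made-up sample lines; the timestamp prefix is an assumption.
sample = [
    "2026-05-05T10:00:00Z Cycle outcome: confidence_too_low",
    "2026-05-05T10:10:00Z Cycle outcome: traded",
    "2026-05-05T10:20:00Z Cycle outcome: signal_skip",
    "2026-05-05T10:30:00Z Cycle outcome: confidence_too_low",
    "2026-05-05T10:30:30Z position manager tick",
]
hist = outcome_histogram(sample)
for category, n in hist.most_common():
    print(f"{category:24s} {n}")
```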
2026-05-05
Theater concentration audit
11 of 13 signal trades have been on Iran. Audited the market universe — Kalshi
currently exposes 81 Taiwan markets, 7 Iran, 3 Ukraine/Russia.
Trade-side filters and theater tagging both pass through Taiwan correctly,
so the gap is upstream: the news + signal pipeline is Iran-heavy, so Claude
rarely generates directional ESCALATE/DEESCALATE signals for Taiwan. Strategy
work to fix this needs a deliberate prompt change + data, not an ad-hoc tune. Logging
the gap and watching whether Claude-driven theater signals surface organically as
Taiwan news evolves.
Still on the radar:
TP/SL bracket sizes (deferred: brackets fire too rarely; Claude exits dominate, need ~30 more trades),
deliberate Taiwan expansion (open question: tune analyzer prompt to surface non-Iran theaters more aggressively).
Equity Curve
Cumulative P&L per closed trade. Signal-only trades (confidence > 0) shown in green; orphan/restored trades in grey.
Escalation Probability Timeline
Claude's escalation probability (amber) vs baseline (dashed blue) across all signal cycles. Dot size = confidence; dot color = direction (amber = ESCALATE, blue = DEESCALATE, grey = SKIP). Ring = trade placed.
Market Probability View
Scatter: Claude's estimated YES probability vs market price for all evaluations.
Points above the diagonal = Claude sees positive edge. Color = exchange; ring = trade placed.
Select a market below to see its probability history over time.
Confidence Threshold Sensitivity
How P&L changes if the confidence cutoff were raised or lowered — using only trades we actually took.
Min Confidence | Trades | Win Rate | Total P&L | ROI
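A minimal sketch of the sensitivity table above, assuming each closed trade carries a confidence, a realized P&L, and an entry cost, and that ROI means total P&L divided by total entry cost. The dictionary keys and the `threshold_sensitivity` helper are hypothetical, and the trades are made up.

```python
def threshold_sensitivity(trades, cutoffs):
    """For each cutoff, re-score only the trades that would still pass it."""
    rows = []
    for cutoff in cutoffs:
        kept = [t for t in trades if t["confidence"] >= cutoff]
        n = len(kept)
        total_pnl = sum(t["pnl"] for t in kept)
        total_cost = sum(t["cost"] for t in kept)
        wins = sum(1 for t in kept if t["pnl"] > 0)
        rows.append({
            "min_confidence": cutoff,
            "trades": n,
            "win_rate": wins / n if n else None,
            "total_pnl": total_pnl,
            "roi": total_pnl / total_cost if total_cost else None,
        })
    return rows


# Made-up trades for illustration only.
trades = [
    {"confidence": 0.73, "pnl": -1.0, "cost": 10.0},
    {"confidence": 0.78, "pnl": 4.0, "cost": 10.0},
    {"confidence": 0.81, "pnl": 2.0, "cost": 10.0},
]
table = threshold_sensitivity(trades, cutoffs=[0.72, 0.76, 0.90])
for row in table:
    print(row)
```

Because it only filters trades that were actually taken, raising the cutoff can be evaluated this way, but lowering it below the live threshold cannot reveal trades that were never placed.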
P&L by Theater
Theater | Trades | Win Rate | Total P&L | ROI
P&L by Signal Direction
Direction | Trades | Win Rate | Total P&L | ROI
Exit Reason Breakdown
Exit Type | Count | Avg P&L | Total P&L
P&L by Confidence Band
Confidence Range | Trades | Win Rate | Total P&L | ROI
P&L by Confluence Band · graph-grounded
Sliced by the number of distinct intel source-types touching the trade's primary entities at trade time. Tests whether multi-source corroboration actually predicts outcomes.
Confluence Band | Trades | Win Rate | Total P&L | ROI
P&L by Primary Entity · graph-grounded
Replaces theater string-matching with the ontology-tagged primary entity for each trade.
Primary Entity | Trades | Win Rate | Total P&L | ROI
Probability Calibration Curve
Predicted probability vs actual resolution rate. A perfectly calibrated model sits on the diagonal.
Prob Bucket | N Evaluated | Predicted Mid | Actual Win Rate | Calibration Error
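A minimal sketch of how the rows above could be produced, assuming equal-width probability buckets and a calibration error defined as actual rate minus predicted midpoint. The bucket count, the sign convention, and the `calibration_table` helper are assumptions for illustration, and the evaluations are made up.

```python
def calibration_table(evaluations, n_buckets=5):
    """evaluations: iterable of (predicted_probability, resolved_yes) pairs."""
    buckets = [[] for _ in range(n_buckets)]
    for prob, resolved_yes in evaluations:
        # Clamp so that a prediction of exactly 1.0 lands in the top bucket.
        idx = min(int(prob * n_buckets), n_buckets - 1)
        buckets[idx].append(1 if resolved_yes else 0)

    width = 1.0 / n_buckets
    rows = []
    for i, outcomes in enumerate(buckets):
        if not outcomes:
            continue  # skip empty buckets rather than report a 0% rate
        predicted_mid = (i + 0.5) * width
        actual_rate = sum(outcomes) / len(outcomes)
        rows.append({
            "bucket": (i * width, (i + 1) * width),
            "n": len(outcomes),
            "predicted_mid": predicted_mid,
            "actual_rate": actual_rate,
            "calibration_error": actual_rate - predicted_mid,
        })
    return rows


# Made-up evaluations for illustration only.
evals = [(0.65, True), (0.70, True), (0.75, False), (0.10, False)]
rows = calibration_table(evals)
for row in rows:
    print(row)
```

A perfectly calibrated model would show a calibration error near zero in every bucket, which is the "sits on the diagonal" statement in table form.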
Live Configuration
Current bot settings
Strategy: Global conflict — Middle East · Ukraine · Taiwan · N.Korea
Exchange: Kalshi (US-regulated)
Signal model: claude-sonnet-4-6
Review model: claude-haiku-4-5
Poll interval: 10 minutes
Position management: Continuous (every 30s)
Trades per cycle: One — highest-edge candidate only
Min confidence: 0.72
Take-profit: +20¢ price move OR 50% return on entry cost (dynamic — Claude can extend)
Stop-loss: −10¢ price move (dynamic — Claude can override)
Min entry price: 5¢ per contract
Escalation decay: Half-life ≈ 6.4 h, mean-reverts to 25% between signals
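A minimal sketch of the escalation decay setting, assuming plain exponential decay toward the 25% baseline with the stated 6.4 h half-life. The page gives only the half-life and the baseline, so the functional form and the `decayed_probability` helper are assumptions.

```python
def decayed_probability(p0, hours_elapsed, half_life_h=6.4, baseline=0.25):
    """Decay the gap between the last signal and the baseline by half
    every half_life_h hours."""
    remaining = 0.5 ** (hours_elapsed / half_life_h)
    return baseline + (p0 - baseline) * remaining


# A 65% escalation signal fading with no new signals in between.
for hours in (0.0, 6.4, 12.8, 24.0):
    print(f"{hours:5.1f} h -> {decayed_probability(0.65, hours):.3f}")
```

Under this form the gap to baseline halves every 6.4 hours, so a signal below 25% drifts upward toward the baseline in the same way a signal above it drifts down.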
Market Regime
Auto-classified from rolling escalation probability, urgency distribution, and vol regime across recent signals
Avg Escalation: - | Esc Trend: - | High Urgency: - | Vol Regime: -
Edge Distribution
Claude probability − market price across all scored markets in signal history
Mean Edge: - | Median Edge: - | +Edge Markets: - | Samples: -
Equity Curve & Drawdown
Cumulative P&L across closed trades — red shading = drawdown from peak
Total P&L: - | Max Drawdown: - | Win Rate: - | Closed Trades: -
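A minimal sketch of the equity curve and drawdown figures above, assuming equity starts at zero and drawdown is measured in dollars from the running peak. The `equity_and_drawdown` helper is a hypothetical name and the P&L values are made up.

```python
def equity_and_drawdown(trade_pnls):
    """Return the cumulative P&L curve and the largest peak-to-trough drop."""
    equity = 0.0
    peak = 0.0  # starting equity counts as the first peak
    max_drawdown = 0.0
    curve = []
    for pnl in trade_pnls:
        equity += pnl
        peak = max(peak, equity)
        max_drawdown = max(max_drawdown, peak - equity)
        curve.append(equity)
    return curve, max_drawdown


# Made-up closed-trade P&L in chronological order.
curve, max_dd = equity_and_drawdown([2.0, -1.0, -3.0, 4.0])
print(curve, max_dd)
```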
Kelly Position Sizing
Optimal bet per market based on edge vs market price. Showing ½ Kelly (conservative). $1,000 paper bankroll.
Market | Side | Claude / Mkt | Edge | Full Kelly | ½ Kelly $ | vs $50 base
No markets with meaningful positive edge in current signal
YES: Kelly = (claude − market) ÷ (1 − market) · NO: Kelly = (market − claude) ÷ market · Capped at 25% · ½ Kelly applied for safety
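A minimal sketch of the sizing formulas above, assuming the 25% cap is applied to the full Kelly fraction before halving. The page does not say which comes first, so that ordering is an assumption, as are the helper names and the example prices.

```python
def kelly_fraction(claude, market, side):
    """Full Kelly fraction for a binary contract, floored at 0 and capped at 25%."""
    if not 0.0 < market < 1.0:
        raise ValueError("market price must be strictly between 0 and 1")
    if side == "YES":
        fraction = (claude - market) / (1.0 - market)
    elif side == "NO":
        fraction = (market - claude) / market
    else:
        raise ValueError("side must be 'YES' or 'NO'")
    return min(max(fraction, 0.0), 0.25)


def half_kelly_dollars(claude, market, side, bankroll=1000.0):
    """Half of the capped Kelly fraction, in dollars of the paper bankroll."""
    return 0.5 * kelly_fraction(claude, market, side) * bankroll


# Made-up prices: Claude at 60% vs a 50-cent market, buying YES.
print(round(half_kelly_dollars(0.60, 0.50, "YES"), 2))
# Claude at 20% vs a 50-cent market, buying NO (full Kelly hits the cap).
print(round(half_kelly_dollars(0.20, 0.50, "NO"), 2))
```

Flooring at zero means a market with no positive edge on the chosen side gets no allocation, consistent with the empty-state message in the table above.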
Knowledge graph
The graph now has its own home.
We promoted the interactive ontology explorer to a standalone page so it can use the full viewport and be linked / shared on its own URL. Same Cytoscape view, same data, same click-to-explore, just bigger.
We promoted the platform write-up to its own URL so it has proper marketing-page typography,
social-share metadata, and a more readable layout for prose. Same content, more depth.