From Forecasts to a Portfolio
In our companion blog post, we built an AI forecasting pipeline that researches Kalshi prediction markets and produces probability estimates for each one. The natural follow-up question: are the forecasts actually accurate?
Prediction markets give us a hard benchmark. If the AI's probability estimates are better than the crowd's, a portfolio that trades on them should make money — buying YES when the forecast is above the market price, and NO when it's below. If the portfolio loses money, the forecaster isn't adding value. If it makes money, that's strong evidence the AI is producing genuinely useful probability estimates.
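The trading rule follows from a short expected-value calculation. As a sketch in our own notation (not taken from the pipeline), let q be the forecast probability of YES and p the market price of a YES contract in dollars. Assume each contract pays $1 if it resolves in your favor, that NO can be bought at 1 − p, that fees and spreads are ignored, and that q is the true probability:

```latex
\begin{aligned}
\mathbb{E}[\text{profit per YES contract}] &= q\,(1 - p) - (1 - q)\,p = q - p,\\
\mathbb{E}[\text{profit per NO contract}]  &= (1 - q)\,p - q\,(1 - p) = p - q.
\end{aligned}
```

Under those assumptions, buying YES has positive expected profit exactly when q > p, and buying NO exactly when q < p. The quantity |q − p| is what the rest of this post calls the edge.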
This case study describes how we construct and track that portfolio.
Disclaimer: This is not investment advice. The portfolio described here uses simulated allocations for the purpose of benchmarking forecast accuracy. No real trades are placed.
How the Trader Works
The trader notebook takes the forecaster's output CSV and builds a simulated portfolio in four steps:
1. Fetch order books
For each forecasted market, we pull the live Kalshi order book — not just the midpoint price, but the full set of resting limit orders on both sides. This tells us what prices are actually available and how many shares we could buy at each price level.
2. Filter for edge
Not every disagreement between our forecast and the market is worth trading. We filter for markets where:
- The edge (difference between our forecast and the market price) is at least 2 percentage points
- The expected return, annualized, exceeds our threshold
- The market resolves soon enough that the return on capital is attractive — shorter-duration markets mean faster compounding
This keeps us out of marginal bets where the forecaster barely disagrees with the market.
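The filter above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the `Candidate` fields, the `evaluate` function, the simple (non-compounded) annualization, and the `min_annual_return` default are all assumptions, since the post does not state the real threshold or formula.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    forecast: float            # forecaster's probability of YES, between 0 and 1
    yes_price: float           # market price of YES in dollars, between 0 and 1
    days_to_resolution: float  # days until the market resolves


def evaluate(c, min_edge=0.02, min_annual_return=0.30):
    """Return (side, edge, annualized_return) if the market passes, else None.

    min_annual_return is a hypothetical placeholder value.
    """
    if c.forecast > c.yes_price:
        side, price, win_prob = "YES", c.yes_price, c.forecast
    else:
        # Assumes NO trades at one minus the YES price (no spread).
        side, price, win_prob = "NO", 1.0 - c.yes_price, 1.0 - c.forecast
    edge = win_prob - price
    if edge < min_edge or price <= 0 or c.days_to_resolution <= 0:
        return None
    expected_return = edge / price  # expected profit per dollar invested
    annualized = expected_return * 365.0 / c.days_to_resolution
    if annualized < min_annual_return:
        return None
    return side, edge, annualized


# A 47% forecast against a 30% market price, resolving in 90 days.
print(evaluate(Candidate(forecast=0.47, yes_price=0.30, days_to_resolution=90)))
```

Annualizing the expected return is one way to fold the second and third criteria together: the same edge is worth more when the market resolves sooner.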
3. Set target positions
We divide the $100,000 portfolio equally among all qualifying markets. For each position, we walk the order book, accumulating shares only at prices within our edge filters. If we're buying YES, we take asks up to our maximum price; if we're buying NO, we take asks on the other side.
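Walking the book might look something like the sketch below. It assumes prices in whole cents, whole-share purchases, and asks given as `(price_cents, shares_available)` pairs. The function name and the example book are hypothetical, and the 45-cent limit is simply a 47% forecast minus the 2-point minimum edge.

```python
def walk_asks(asks, budget_cents, max_price_cents):
    """Buy from the cheapest ask upward until the budget or price limit is hit.

    asks: list of (price_cents, shares_available) tuples, in any order.
    Returns a list of (price_cents, shares_bought) fills.
    """
    fills = []
    remaining = budget_cents
    for price, available in sorted(asks):
        # Stop at the first level that is too expensive or unaffordable.
        if price > max_price_cents or remaining < price:
            break
        shares = min(available, remaining // price)
        if shares > 0:
            fills.append((price, shares))
            remaining -= shares * price
    return fills


# Equal allocation: $100,000 across 24 positions is about $4,167 each.
target_cents = 100_000 * 100 // 24

# Hypothetical YES asks for a market we forecast at 47%.
book = [(31, 2000), (30, 5000), (35, 10000), (48, 500)]
print(walk_asks(book, target_cents, max_price_cents=45))
```

Working in integer cents avoids floating-point surprises when dividing the remaining budget by a price.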
4. Record what fills
Not all target positions fill completely. Thin order books, wide spreads, and prices outside our filters all reduce fill rates. We record the shares bought, average price paid, and fill percentage for each position. This is a realistic simulation — we're limited by actual available liquidity, not by wishful thinking about what we could buy.
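The per-position record could be computed along these lines. This is a sketch under assumptions: fills are `(price_cents, shares)` pairs, and fill percentage is dollars invested divided by the dollar target, which is our reading of how invested amounts and fill percentages relate in this post. The function and key names are illustrative.

```python
def summarize_fills(fills, target_cents):
    """Summarize one position from its list of (price_cents, shares) fills."""
    shares = sum(n for _, n in fills)
    spent = sum(p * n for p, n in fills)
    return {
        "shares": shares,
        # Average price is undefined when nothing filled.
        "avg_price_cents": spent / shares if shares else None,
        "invested_cents": spent,
        "fill_pct": 100.0 * spent / target_cents,
    }


# A thin book: only 150 shares were available at acceptable prices.
print(summarize_fills([(30, 100), (32, 50)], target_cents=416_667))
```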
The Portfolio: February 26, 2026
From 153 forecasted markets, the trader identified 24 positions with sufficient edge. Here's what the simulated portfolio looks like:
| Metric | Value |
|---|---|
| Markets analyzed | 153 |
| Positions taken | 24 |
| Target portfolio | $100,000 |
| Capital deployed | ~$50,000 |
| Fully filled positions | 8 of 24 |
| Average fill rate | ~50% |
The portfolio is only about half deployed — a consequence of taking real order book liquidity seriously. Many markets have thin books where you simply can't get the size you want at acceptable prices. We'd rather be half-invested at good prices than fully invested at bad ones.
Here are all 24 positions, sorted by amount invested:
| Question | Pos | Mkt | Fcst | Edge | Invested | Fill |
|---|---|---|---|---|---|---|
| Texas Democratic Senate nominee? [Jasmine Crockett] | YES | 30% | 47% | +17 | $4,167 | 100% |
| Texas Democratic Senate nominee? [James Talarico] | NO | 69% | 63% | +6 | $4,167 | 100% |
| 2026 Texas Senate matchup? [Talarico vs. Paxton] | NO | 62% | 45% | +17 | $4,167 | 100% |
| Which companies will have a top-ranked AI model this year? [OpenAI] | YES | 59% | 85% | +26 | $4,167 | 100% |
| World leaders out in 2026? [Ali Khamenei] | NO | 56% | 42% | +14 | $4,167 | 100% |
| Florida Republican Governor nominee? [James Fishback] | NO | 16% | 6% | +10 | $4,167 | 100% |
| Will the U.S. confirm that aliens exist before 2027? | NO | 23% | 8% | +15 | $4,167 | 100% |
| Will the US take control of any part of Greenland? [Before January 2027] | NO | 41% | 10% | +31 | $4,167 | 100% |
| 2026 Texas Senate matchup? [Crockett vs. Paxton] | YES | 25% | 32% | +7 | $3,153 | 76% |
| Ali Khamenei out as Supreme Leader? [Before July 1, 2026] | NO | 39% | 32% | +7 | $3,148 | 76% |
| When will DHS receive full-year funding? [Before Mar 20, 2026] | NO | 32% | 23% | +9 | $3,148 | 76% |
| California Governor winner? [Eric Swalwell] | NO | 50% | 42% | +8 | $1,680 | 40% |
| When will Warsh's Fed Chair nomination be received by the Senate? | NO | 62% | 57% | +5 | $1,248 | 30% |
| Keir Starmer Out? [Before Jul 1, 2026] | NO | 50% | 47% | +3 | $1,103 | 26% |
| Number of rate cuts in 2026? [Exactly 0 cuts] | YES | 13% | 19% | +6 | $819 | 20% |
| Will marijuana be rescheduled? [Before 2027] | NO | 56% | 45% | +11 | $696 | 17% |
| Which companies will have a top-ranked AI model this year? [xAI] | YES | 50% | 78% | +28 | $576 | 14% |
| Who will run for the 2028 Democratic presidential nomination? [Kamala Harris] | YES | 60% | 78% | +18 | $415 | 10% |
| Will marijuana be rescheduled? [Before July 2026] | NO | 25% | 20% | +5 | $261 | 6% |
| Gas prices in the US in Feb 2026? [Above 3.00] | YES | 9% | 15% | +6 | $165 | 4% |
| Who will leave the Trump administration in 2026? [Kristi Noem] | YES | 50% | 62% | +12 | $160 | 4% |
| CPI year-over-year in May 2026? [Exactly 2.8%] | NO | 26% | 16% | +10 | $90 | 2% |
| How many launches will SpaceX have in February 2026? [Above 12] | YES | 6% | 45% | +39 | $60 | 1% |
| Kristi Noem out as DHS Secretary? [Before Jul 1, 2026] | YES | 29% | 33% | +4 | $1 | 0% |
The portfolio spans politics (Texas Senate, Florida Governor, California Governor), geopolitics (Greenland, Khamenei, Starmer), economics (rate cuts, CPI, gas prices, marijuana rescheduling), AI (top-ranked models), and policy (DHS funding, Warsh nomination). This diversity is a feature — it means our accuracy isn't dependent on getting one domain right.
The eight fully filled positions ($4,167 each) are the backbone of the portfolio, accounting for two-thirds of deployed capital. These are markets with enough liquidity to absorb the full allocation. The remaining sixteen positions fill partially, from 76% down to essentially zero — the order books just didn't have enough shares at acceptable prices.
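The two-thirds figure can be checked directly from the Invested column of the table above. A quick sketch, with the amounts copied from the table (variable names are ours):

```python
# Invested amounts in dollars, in table order: 8 full fills, then 16 partial fills.
invested = [4167] * 8 + [
    3153, 3148, 3148, 1680, 1248, 1103, 819, 696,
    576, 415, 261, 165, 160, 90, 60, 1,
]

deployed = sum(invested)
fully_filled_share = sum(invested[:8]) / deployed

print(f"positions: {len(invested)}")
print(f"capital deployed: ${deployed:,}")
print(f"share from fully filled positions: {fully_filled_share:.1%}")
```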
Some notable positions:
- Greenland NO at $4,167: The AI thinks there's only a 10% chance the US takes control of any part of Greenland before January 2027, versus the market's 41%. At +31 points, that's the largest edge among the fully filled positions.
- Aliens NO at $4,167: Market says 23% chance the US confirms aliens exist before 2027; our forecast says 8%. Fifteen points of edge on a question where the base rate for government alien confirmations is, historically, zero.
- OpenAI top-ranked model YES at $4,167: The AI thinks there's an 85% chance OpenAI will have a #1 ranked AI model this year, versus the market's 59%. The research cites OpenAI's track record and upcoming model releases.
- SpaceX >12 launches YES at $60: The forecast is 45% versus the market's 6%, the largest edge in the portfolio at +39 points. The order book was thin at acceptable prices, so only $60 was deployed despite a massive forecast disagreement.
What Success Looks Like
A 30% annualized return would be a remarkable achievement — that's about 2.2% per month. We'll run the forecaster weekly, update positions, and track the portfolio's mark-to-market value over time.
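The monthly figure follows from compounding: a monthly return r that compounds to 30% over twelve months satisfies (1 + r)^12 = 1.30. A quick check, assuming monthly compounding (the function name is illustrative):

```python
def monthly_equivalent(annual_return):
    """Monthly return that compounds to the given annual return over 12 months."""
    return (1.0 + annual_return) ** (1.0 / 12.0) - 1.0


monthly = monthly_equivalent(0.30)
print(f"{monthly:.2%}")  # 2.21%
```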
This is an empirical question. We don't know yet whether the AI forecaster adds enough accuracy over the crowd to generate returns. But the structure is set up to tell us clearly: the portfolio goes up if our forecasts are better than the market, and down if they're worse. No ambiguity, no hand-waving — just a P&L that reflects forecasting accuracy.
We hope the signal is clear within a month or two. If the portfolio appreciates, that's evidence the forecaster is doing something useful. If it doesn't, we'll know we need to improve the pipeline.
Follow Along
- Forecaster blog post — How we produce the forecasts
- Full forecast data — Research, rationales, and forecasts for all 153 markets
- Kalshi Forecaster notebook — Run the forecaster yourself
- Kalshi Trader notebook — Run the trader yourself
We'll update this page as we track the portfolio over the coming weeks. We hope you enjoy following along!