Forecast
forecast takes a DataFrame of questions and produces calibrated forecasts for each row. It supports two modes:
- Binary: Forecasts the probability (0–100) that a YES/NO question such as "Will X happen?" resolves YES
- Numeric: Forecasts percentile estimates (p10–p90) for continuous quantities like "What will the price/value/count be?"
The approach is validated against FutureSearch's past-casting environment of 1500 hard forecasting questions and 15M research documents. See more at Automating Forecasting Questions and arXiv:2506.21558.
Examples
Binary forecast
from pandas import DataFrame
from futuresearch.ops import forecast
questions = DataFrame([
    {
        "question": "Will the US Federal Reserve cut rates by at least 25bp before July 1, 2027?",
        "resolution_criteria": "Resolves YES if the Fed announces at least one rate cut of 25bp or more at any FOMC meeting between now and June 30, 2027.",
    },
])
result = await forecast(input=questions, forecast_type="binary")
print(result.data[["question", "probability", "rationale"]])
The output DataFrame contains the original columns plus probability (int, 0–100) and rationale (str).
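The example above uses a bare await, which works in a notebook or any already-running event loop. In a plain script the call has to live inside a coroutine driven by asyncio.run. A minimal sketch of that pattern follows; forecast_stub is a hypothetical stand-in (not the real futuresearch.ops.forecast) so the snippet runs offline, and its return shape is simplified to a list of dicts.

```python
import asyncio


async def forecast_stub(rows, forecast_type):
    """Hypothetical stand-in for futuresearch.ops.forecast (an assumption)."""
    await asyncio.sleep(0)  # yield to the event loop, as a real network call would
    # Placeholder values only; a real call returns researched forecasts.
    return [{**row, "probability": 50, "rationale": "placeholder"} for row in rows]


async def main():
    rows = [{"question": "Will X happen?"}]
    return await forecast_stub(rows, forecast_type="binary")


if __name__ == "__main__":
    # asyncio.run creates the event loop, runs main() to completion, and closes the loop.
    print(asyncio.run(main()))
```

Swapping forecast_stub for the real forecast call should leave the asyncio.run wrapper unchanged.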
Numeric forecast
from pandas import DataFrame
from futuresearch.ops import forecast
questions = DataFrame([
    {
        "question": "What will the price of Brent crude oil be on December 31, 2026?",
        "resolution_criteria": "The closing spot price of Brent crude oil (ICE) on Dec 31, 2026, in USD/barrel.",
        "resolution_date": "2026-12-31",
    },
])
result = await forecast(
    input=questions,
    forecast_type="numeric",
    output_field="price",
    units="USD per barrel",
)
print(result.data[["question", "price_p10", "price_p25", "price_p50", "price_p75", "price_p90"]])
The output DataFrame contains the original columns plus {output_field}_p10 through {output_field}_p90 (float), units (str), and rationale (str). Percentiles are monotonically non-decreasing.
Batch context
When all rows share common framing, pass it via context instead of repeating it in every row:
result = await forecast(
    input=geopolitics_questions,
    forecast_type="binary",
    context="Focus on EU regulatory and diplomatic sources. Assume all questions resolve by end of 2027.",
)
Leave context empty when rows are self-contained—a well-specified question with resolution criteria needs no additional instruction.
Input columns
The input DataFrame should contain at minimum a question column. All columns are passed to the research agents and forecasters.
| Column | Required | Purpose |
|---|---|---|
| question | Yes | The question to forecast |
| resolution_criteria | Recommended | Exactly how the outcome is determined |
| resolution_date | Optional | When the question closes |
| background | Optional | Additional context the forecasters should know |
Column names are not enforced: research agents infer meaning from content, so a column named scenario instead of question works fine.
Parameters
| Name | Type | Description |
|---|---|---|
| input | DataFrame | Rows to forecast, one question per row |
| forecast_type | "binary" \| "numeric" | Type of forecast to produce |
| context | str \| None | Optional batch-level instructions that apply to every row |
| output_field | str \| None | Name of the quantity being forecast (required for numeric, e.g. "price", "valuation") |
| units | str \| None | Units for the forecast (required for numeric, e.g. "USD per barrel", "billions USD") |
| session | Session | Optional, auto-created if omitted |
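The table marks output_field and units as required only for numeric forecasts. A minimal pre-flight check can catch a missing argument before any research runs. This is a sketch: check_forecast_args is a hypothetical helper, not part of futuresearch, and the library may perform its own validation.

```python
from typing import Optional

VALID_TYPES = ("binary", "numeric")


def check_forecast_args(
    forecast_type: str,
    output_field: Optional[str] = None,
    units: Optional[str] = None,
) -> None:
    """Raise ValueError if the arguments contradict the parameter table."""
    if forecast_type not in VALID_TYPES:
        raise ValueError(
            f"forecast_type must be one of {VALID_TYPES}, got {forecast_type!r}"
        )
    if forecast_type == "numeric":
        # Collect the names of required numeric arguments that are empty or None.
        required = (("output_field", output_field), ("units", units))
        missing = [name for name, value in required if not value]
        if missing:
            raise ValueError(f"numeric forecasts require: {', '.join(missing)}")
```

Calling check_forecast_args("numeric", output_field="price") raises a ValueError naming units, while a binary call with no extra arguments passes silently.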
Output
Binary (forecast_type="binary")
Two columns are added to each input row:
| Column | Type | Description |
|---|---|---|
| probability | int | 0–100, calibrated probability of YES resolution |
| rationale | str | Detailed reasoning with citations from web research |
Probabilities are clamped to [3, 97]—even near-certain outcomes retain residual uncertainty.
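The clamping rule can be illustrated with a short sketch. clamp_probability and as_fraction are hypothetical helpers that only mirror the stated [3, 97] bounds; how the library applies the clamp internally is not described here.

```python
def clamp_probability(raw: int, low: int = 3, high: int = 97) -> int:
    """Clamp a 0-100 probability into the closed interval [low, high]."""
    return max(low, min(high, raw))


def as_fraction(probability: int) -> float:
    """Convert a 0-100 probability into a 0-1 fraction for downstream math."""
    return probability / 100
```

Under this rule a raw value of 100 becomes 97 and a raw value of 0 becomes 3, while anything already inside the bounds is unchanged.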
Numeric (forecast_type="numeric")
Seven columns are added to each input row:
| Column | Type | Description |
|---|---|---|
| {output_field}_p10 | float | 10th percentile estimate |
| {output_field}_p25 | float | 25th percentile estimate |
| {output_field}_p50 | float | 50th percentile (median) estimate |
| {output_field}_p75 | float | 75th percentile estimate |
| {output_field}_p90 | float | 90th percentile estimate |
| units | str | The units passed in the units parameter |
| rationale | str | Detailed reasoning with citations from web research |
Percentiles are monotonically non-decreasing: p10 ≤ p25 ≤ p50 ≤ p75 ≤ p90.
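The ordering guarantee is easy to check on the consumer side. The sketch below works on plain row dicts (for example, rows obtained from result.data.to_dict("records")); percentiles_are_monotonic and interquartile_range are hypothetical helpers, and the sample row uses made-up numbers purely for illustration.

```python
PERCENTILES = ("p10", "p25", "p50", "p75", "p90")


def percentile_values(row: dict, output_field: str) -> list:
    """Read the five percentile columns for output_field, in ascending order."""
    return [row[f"{output_field}_{p}"] for p in PERCENTILES]


def percentiles_are_monotonic(row: dict, output_field: str) -> bool:
    """Return True when p10 <= p25 <= p50 <= p75 <= p90 holds for the row."""
    values = percentile_values(row, output_field)
    return all(a <= b for a, b in zip(values, values[1:]))


def interquartile_range(row: dict, output_field: str) -> float:
    """Width of the middle 50% of the forecast distribution (p75 - p25)."""
    return row[f"{output_field}_p75"] - row[f"{output_field}_p25"]


# Illustrative numbers only, not a real forecast.
sample_row = {
    "price_p10": 60.0,
    "price_p25": 68.0,
    "price_p50": 75.0,
    "price_p75": 83.0,
    "price_p90": 95.0,
}
```

The interquartile range gives a quick, unit-preserving measure of how uncertain a numeric forecast is.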
Performance
| Rows | Time | Cost |
|---|---|---|
| 1 | ~5 min | ~$0.60 |
| 5 | ~6 min | ~$3 |
| 20 | ~10 min | ~$12 |
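The three data points above are consistent with a cost of roughly $0.60 per row, while time grows much more slowly than row count. A rough budgeting sketch under that assumption follows; estimate_cost_usd is a hypothetical helper, and extrapolating the per-row rate beyond the 20 rows shown in the table is an untested assumption.

```python
# Per-row rate implied by the table: ~$0.60, ~$3, ~$12 for 1, 5, and 20 rows.
COST_PER_ROW_USD = 0.60


def estimate_cost_usd(rows: int, cost_per_row: float = COST_PER_ROW_USD) -> float:
    """Linear cost estimate in USD; a rough guide, not a quote."""
    if rows < 0:
        raise ValueError("rows must be non-negative")
    return round(rows * cost_per_row, 2)
```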
Via MCP
MCP tool: futuresearch_forecast
| Parameter | Type | Description |
|---|---|---|
| data | list[object] | Inline data as a list of row objects |
| artifact_id | string | Alternatively, an artifact ID from a previous upload |
| context | string | Optional batch-level context for all questions |
| forecast_type | "binary" \| "numeric" | Type of forecast to produce |
| output_field | string | Name of the quantity (required for numeric) |
| units | string | Units for the forecast (required for numeric) |
Provide either data or artifact_id, not both.
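The either/or rule for data and artifact_id can be enforced when assembling the tool arguments. The sketch below builds a plain dict using the parameter names from the table; build_forecast_arguments is a hypothetical helper, and how the dict is actually sent to futuresearch_forecast depends on the MCP client in use.

```python
from typing import Any, Optional


def build_forecast_arguments(
    forecast_type: str,
    data: Optional[list] = None,
    artifact_id: Optional[str] = None,
    context: Optional[str] = None,
    output_field: Optional[str] = None,
    units: Optional[str] = None,
) -> dict:
    """Assemble tool arguments, enforcing exactly one of data / artifact_id."""
    if (data is None) == (artifact_id is None):
        raise ValueError("provide exactly one of data or artifact_id")
    if forecast_type == "numeric" and not (output_field and units):
        raise ValueError("numeric forecasts require output_field and units")

    candidates: dict = {
        "data": data,
        "artifact_id": artifact_id,
        "context": context,
        "forecast_type": forecast_type,
        "output_field": output_field,
        "units": units,
    }
    # Drop unset optional parameters so only meaningful keys are sent.
    return {key: value for key, value in candidates.items() if value is not None}
```

Passing both data and artifact_id, or neither, raises a ValueError before any request is made.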
Related docs
Guides
- Turn Claude into an Accurate Forecaster — Binary and numeric forecasting for any question about the future
- Find Profitable Polymarket Trades — Screen prediction markets for mispriced contracts
- Forecast Outcomes for a List of Entities — Forecast an outcome for every person, company, country, or product in a list
- Value a Private Company — Sum-of-the-parts valuation before IPO