Forecast

forecast takes a DataFrame of questions and produces calibrated forecasts for each row. It supports two modes:

  • Binary: Forecasts the probability (0–100) of YES/NO questions like "Will X happen?"
  • Numeric: Forecasts percentile estimates (p10–p90) for continuous quantities like "What will the price/value/count be?"

The approach is validated against FutureSearch's past-casting environment of 1500 hard forecasting questions and 15M research documents. See more at Automating Forecasting Questions and arXiv:2506.21558.

Examples

Binary forecast

from pandas import DataFrame
from futuresearch.ops import forecast

questions = DataFrame([
    {
        "question": "Will the US Federal Reserve cut rates by at least 25bp before July 1, 2027?",
        "resolution_criteria": "Resolves YES if the Fed announces at least one rate cut of 25bp or more at any FOMC meeting between now and June 30, 2027.",
    },
])

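# `await` needs an async context: run this in a notebook cell, or inside an
# async function started with asyncio.run().
result = await forecast(input=questions, forecast_type="binary")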
print(result.data[["question", "probability", "rationale"]])

The output DataFrame contains the original columns plus probability (int, 0–100) and rationale (str).

Numeric forecast

from pandas import DataFrame
from futuresearch.ops import forecast

questions = DataFrame([
    {
        "question": "What will the price of Brent crude oil be on December 31, 2026?",
        "resolution_criteria": "The closing spot price of Brent crude oil (ICE) on Dec 31, 2026, in USD/barrel.",
        "resolution_date": "2026-12-31",
    },
])

result = await forecast(
    input=questions,
    forecast_type="numeric",
    output_field="price",
    units="USD per barrel",
)
print(result.data[["question", "price_p10", "price_p25", "price_p50", "price_p75", "price_p90"]])

The output DataFrame contains the original columns plus {output_field}_p10 through {output_field}_p90 (float), units (str), and rationale (str). Percentiles are monotonically non-decreasing.
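As a quick illustration of the naming rule, the sketch below lists the columns that forecast is described as adding in each mode. The helper name expected_output_columns is hypothetical and not part of the futuresearch SDK; it only encodes the column names stated on this page.

```python
# Percentile levels named on this page (p10 through p90).
PERCENTILES = (10, 25, 50, 75, 90)


def expected_output_columns(forecast_type, output_field=None):
    """Return the column names this page says forecast adds to each row.

    Hypothetical helper for illustration; not part of the futuresearch SDK.
    """
    if forecast_type == "binary":
        return ["probability", "rationale"]
    if forecast_type == "numeric":
        if not output_field:
            raise ValueError("output_field is required for numeric forecasts")
        percentile_columns = [f"{output_field}_p{p}" for p in PERCENTILES]
        return percentile_columns + ["units", "rationale"]
    raise ValueError(f"unknown forecast_type: {forecast_type!r}")


print(expected_output_columns("numeric", "price"))
```

A list like this can be used to select only the forecast columns from result.data, or to check that a downstream pipeline expects the right names when output_field changes.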

Batch context

When all rows share common framing, pass it via context instead of repeating it in every row:

result = await forecast(
    input=geopolitics_questions,
    forecast_type="binary",
    context="Focus on EU regulatory and diplomatic sources. Assume all questions resolve by end of 2027.",
)

Leave context empty when rows are self-contained—a well-specified question with resolution criteria needs no additional instruction.

Input columns

The input DataFrame needs at least one column holding the question to forecast, conventionally named question. All columns are passed to the research agents and forecasters.

Column               Required     Purpose
question             Yes          The question to forecast
resolution_criteria  Recommended  Exactly how the outcome is determined
resolution_date      Optional     When the question closes
background           Optional     Additional context the forecasters should know

Column names are not enforced—research agents infer meaning from content. A column named scenario instead of question works fine.
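Because nothing is enforced, it can help to check rows yourself before spending money on a run. The following is a minimal sketch of such a pre-flight check over plain row dicts (the same shape passed to DataFrame in the examples). The function lint_question_rows and its question_column parameter are hypothetical, not part of the SDK.

```python
def _is_blank(value):
    """True for None or for values that are empty once stripped."""
    return value is None or not str(value).strip()


def lint_question_rows(rows, question_column="question"):
    """Return warnings for rows that lack question text or resolution criteria.

    Hypothetical pre-flight check; forecast itself does not enforce column
    names, so question_column can be set to whatever name the data uses.
    """
    warnings = []
    for index, row in enumerate(rows):
        if _is_blank(row.get(question_column)):
            warnings.append(f"row {index}: no text in '{question_column}'")
        if _is_blank(row.get("resolution_criteria")):
            warnings.append(f"row {index}: 'resolution_criteria' is recommended")
    return warnings


rows = [
    {
        "question": "What will the price of Brent crude oil be on December 31, 2026?",
        "resolution_criteria": "The closing spot price of Brent crude oil (ICE) on Dec 31, 2026, in USD/barrel.",
    },
    {"question": "Will X happen?"},  # missing the recommended criteria
]
for warning in lint_question_rows(rows):
    print(warning)
```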

Parameters

Name           Type                  Description
input          DataFrame             Rows to forecast, one question per row
forecast_type  "binary" | "numeric"  Type of forecast to produce
context        str | None            Optional batch-level instructions that apply to every row
output_field   str | None            Name of the quantity being forecast (required for numeric, e.g. "price", "valuation")
units          str | None            Units for the forecast (required for numeric, e.g. "USD per barrel", "billions USD")
session        Session               Optional, auto-created if omitted

Output

Binary (forecast_type="binary")

Two columns are added to each input row:

Column       Type  Description
probability  int   0–100, calibrated probability of YES resolution
rationale    str   Detailed reasoning with citations from web research

Probabilities are clamped to [3, 97]—even near-certain outcomes retain residual uncertainty.
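One practical consequence is that a returned probability never equals 0 or 100, so downstream filters should not test for those values. The sketch below mimics the documented clamp and converts the integer scale to a fraction. It is illustrative only: the function names are made up here, and how the service applies the clamp internally is not described on this page.

```python
def clamp_probability(raw, low=3, high=97):
    """Clamp a 0-100 probability into [low, high] (bounds from this page)."""
    return max(low, min(high, raw))


def as_fraction(probability):
    """Convert the 0-100 integer scale to a 0-1 fraction."""
    return probability / 100


for raw in (0, 42, 100):
    clamped = clamp_probability(raw)
    print(raw, "->", clamped, "->", as_fraction(clamped))
```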

Numeric (forecast_type="numeric")

Seven columns are added to each input row:

Column              Type   Description
{output_field}_p10  float  10th percentile estimate
{output_field}_p25  float  25th percentile estimate
{output_field}_p50  float  50th percentile (median) estimate
{output_field}_p75  float  75th percentile estimate
{output_field}_p90  float  90th percentile estimate
units               str    The units provided as parameter
rationale           str    Detailed reasoning with citations from web research

Percentiles are monotonically non-decreasing: p10 ≤ p25 ≤ p50 ≤ p75 ≤ p90.
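The ordering guarantee makes it possible to read rough threshold probabilities off a row. The sketch below checks the ordering and linearly interpolates between neighbouring percentiles to approximate P(value ≤ threshold). Linear interpolation is an assumption made here for simplicity, not something the service specifies, and the function names and the sample row values are invented for illustration (they are not real forecast output).

```python
PERCENTILES = (10, 25, 50, 75, 90)


def percentile_values(row, output_field):
    """Pull the five percentile estimates out of one output row (a dict)."""
    return [row[f"{output_field}_p{p}"] for p in PERCENTILES]


def is_monotonic(values):
    """True when each value is <= the next one."""
    return all(a <= b for a, b in zip(values, values[1:]))


def prob_at_or_below(row, output_field, threshold):
    """Approximate P(value <= threshold), in percent, by linear interpolation.

    Returns None outside [p10, p90], where the five percentiles say nothing
    about the shape of the tails.
    """
    values = percentile_values(row, output_field)
    if not is_monotonic(values):
        raise ValueError("percentiles must be non-decreasing")
    if threshold < values[0] or threshold > values[-1]:
        return None
    for i in range(len(values) - 1):
        v_lo, v_hi = values[i], values[i + 1]
        p_lo, p_hi = PERCENTILES[i], PERCENTILES[i + 1]
        if v_lo <= threshold <= v_hi:
            if v_hi == v_lo:  # flat segment: report the upper percentile
                return float(p_hi)
            return p_lo + (p_hi - p_lo) * (threshold - v_lo) / (v_hi - v_lo)
    return None


# Made-up row, for illustration only.
row = {"price_p10": 60.0, "price_p25": 68.0, "price_p50": 75.0,
       "price_p75": 83.0, "price_p90": 95.0}
print(prob_at_or_below(row, "price", 79.0))  # 62.5
```

Here 79.0 sits halfway between the p50 and p75 values, so the estimate lands halfway between 50 and 75.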

Performance

Rows  Time     Cost
1     ~5 min   ~$0.60
5     ~6 min   ~$3
20    ~10 min  ~$12
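The figures above work out to roughly $0.60 per row, with wall-clock time growing much more slowly than row count. For budgeting, a rough estimate can be interpolated from the three data points, as in the sketch below. The estimate function is hypothetical, the inputs are the approximate values from the table, and extending the last segment beyond 20 rows is an assumption that this page does not confirm.

```python
# (rows, minutes, dollars) copied from the approximate figures in the table.
BENCHMARKS = [(1, 5.0, 0.60), (5, 6.0, 3.0), (20, 10.0, 12.0)]


def _interp(x, x0, y0, x1, y1):
    """Linear interpolation (or extrapolation) through two points."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def estimate(rows):
    """Return (minutes, dollars) for a batch, interpolated from BENCHMARKS.

    Beyond the largest benchmark, the last segment is extended linearly,
    which is an assumption rather than a documented scaling rule.
    """
    if rows < 1:
        raise ValueError("rows must be at least 1")
    # Find the segment that contains `rows`; keep the last one if none does.
    for (r0, t0, c0), (r1, t1, c1) in zip(BENCHMARKS, BENCHMARKS[1:]):
        if rows <= r1:
            break
    minutes = _interp(rows, r0, t0, r1, t1)
    dollars = _interp(rows, r0, c0, r1, c1)
    return round(minutes, 1), round(dollars, 2)


print(estimate(10))  # (7.3, 6.0)
```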

Via MCP

MCP tool: futuresearch_forecast

Parameter      Type                  Description
data           list[object]          Inline data as a list of row objects
artifact_id    string                Alternatively, an artifact ID from a previous upload
context        string                Optional batch-level context for all questions
forecast_type  "binary" | "numeric"  Type of forecast to produce
output_field   string                Name of the quantity (required for numeric)
units          string                Units for the forecast (required for numeric)
Provide either data or artifact_id, not both.
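A sketch of how a client might assemble and validate the arguments for futuresearch_forecast before making the tool call is shown below. The parameter names come from the table above; the helper build_forecast_arguments is hypothetical, and it reads "either data or artifact_id, not both" as requiring exactly one of the two. How the arguments are wrapped into an actual MCP request depends on the client and is not shown.

```python
import json


def build_forecast_arguments(forecast_type, data=None, artifact_id=None,
                             context=None, output_field=None, units=None):
    """Assemble the argument object for the futuresearch_forecast tool.

    Hypothetical client-side helper; it only applies the rules stated in
    the parameter table.
    """
    if forecast_type not in ("binary", "numeric"):
        raise ValueError(f"unknown forecast_type: {forecast_type!r}")
    # Assumption: exactly one input source must be given.
    if (data is None) == (artifact_id is None):
        raise ValueError("provide exactly one of data or artifact_id")
    if forecast_type == "numeric" and not (output_field and units):
        raise ValueError("numeric forecasts require output_field and units")

    arguments = {"forecast_type": forecast_type}
    optional = {
        "data": data,
        "artifact_id": artifact_id,
        "context": context,
        "output_field": output_field,
        "units": units,
    }
    # Only send the parameters that were actually provided.
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return arguments


arguments = build_forecast_arguments(
    "numeric",
    data=[{"question": "What will the price of Brent crude oil be on December 31, 2026?"}],
    output_field="price",
    units="USD per barrel",
)
print(json.dumps(arguments, indent=2))
```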

Related docs

Guides

  • Turn Claude into an Accurate Forecaster — Binary and numeric forecasting for any question about the future
  • Find Profitable Polymarket Trades — Screen prediction markets for mispriced contracts
  • Forecast Outcomes for a List of Entities — Forecast an outcome for every person, company, country, or product in a list
  • Value a Private Company — Sum-of-the-parts valuation before IPO

Blog posts

  • Automating Forecasting Questions
  • Anthropic and OpenAI IPO Timelines and Valuations
  • Forecasting Polymarket Questions with AI
  • Which AI Researchers Have the Most Valuable Skills?
  • A $1.75 Trillion IPO Would Be Overpaying 30% for SpaceX
  • arXiv paper: Automated Forecasting