Forecast

forecast takes a DataFrame of questions and produces calibrated forecasts for each row. It supports three modes:

  • Binary: probability (0 to 100) of YES/NO questions like "Will X happen?"
  • Numeric: percentile estimates (p10 through p90) for continuous quantities like "What will the price/value/count be?"
  • Date: percentile date estimates (p10 through p90, as YYYY-MM-DD) for timing questions like "When will X happen?"

Accuracy is measured on the public BTF-2 leaderboard and described in the Strategic Reasoning paper. The benchmark questions, ground-truth resolutions, and SOTA agent rationales are released as a Hugging Face dataset.

Forecast types

Binary

```python
from pandas import DataFrame
from futuresearch.ops import forecast

questions = DataFrame([
    {
        "question": "Will the US Federal Reserve cut rates by at least 25bp before July 1, 2027?",
        "resolution_criteria": "Resolves YES if the Fed announces at least one rate cut of 25bp or more at any FOMC meeting between now and June 30, 2027.",
    },
])

# Top-level await runs in a notebook or async REPL; in a script, call this
# from an async function started with asyncio.run().
result = await forecast(input=questions, forecast_type="binary")
print(result.data[["question", "probability", "rationale"]])
```

| Column | Type | Description |
| --- | --- | --- |
| probability | int | Calibrated probability of YES resolution on a 0 to 100 scale. Clamped to [3, 97], so even near-certain outcomes retain residual uncertainty. |
| rationale | str | Detailed reasoning with citations from web research |
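Because probability is an integer on a 0 to 100 scale, downstream code usually converts it to a fraction first. The sketch below shows one way to do that and to compare the forecast against a market price; `to_fraction`, `at_clamp_bound`, and `edge_vs_market` are hypothetical helpers and not part of the SDK.

```python
def to_fraction(probability: int) -> float:
    """Convert a 0-100 integer probability to a 0-1 float."""
    if not 0 <= probability <= 100:
        raise ValueError(f"probability out of range: {probability}")
    return probability / 100


def at_clamp_bound(probability: int) -> bool:
    """True when the value sits on a documented clamp bound (3 or 97)."""
    return probability in (3, 97)


def edge_vs_market(probability: int, market_price: float) -> float:
    """Forecast minus market price, both expressed as 0-1 fractions.

    A positive result means the forecast rates YES as more likely
    than the market price implies.
    """
    return to_fraction(probability) - market_price
```

A value flagged by `at_clamp_bound` may reflect the clamp rather than the forecaster's raw estimate, so it is worth treating as "at least this extreme".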

Numeric

```python
from pandas import DataFrame
from futuresearch.ops import forecast

result = await forecast(
    input=DataFrame([
        {
            "question": "What will the price of Brent crude oil be on December 31, 2026?",
            "resolution_criteria": "Closing spot price of Brent crude oil (ICE) on Dec 31, 2026.",
        },
    ]),
    forecast_type="numeric",
    output_field="price",
    units="USD per barrel",
)
print(result.data[["price_p10", "price_p25", "price_p50", "price_p75", "price_p90"]])
```

| Column | Type | Description |
| --- | --- | --- |
| {output_field}_p10 … {output_field}_p90 | float | 10th, 25th, 50th, 75th, and 90th percentile estimates. Monotonically non-decreasing: p10 ≤ p25 ≤ p50 ≤ p75 ≤ p90. |
| units | str | The units passed in the units parameter |
| rationale | str | Detailed reasoning with citations |

Schema: engine/services/forecast/data_types.py:83-105.
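The five percentiles can be turned into a rough probability for a threshold question such as "will the value be at or below x?". The sketch below assumes linear interpolation between adjacent percentiles, which is a simplification chosen for illustration and not something the SDK does; `percentile_values` and `cdf_at` are hypothetical helpers that operate on one result row converted to a dict.

```python
LEVELS = (10, 25, 50, 75, 90)


def percentile_values(row: dict, field: str) -> list:
    """Collect {field}_p10 ... {field}_p90 and check they are non-decreasing."""
    values = [row[f"{field}_p{level}"] for level in LEVELS]
    if any(a > b for a, b in zip(values, values[1:])):
        raise ValueError(f"percentiles for {field!r} are not monotonic: {values}")
    return values


def cdf_at(row: dict, field: str, x: float):
    """Approximate P(value <= x) by linear interpolation (an assumption).

    Returns None when x falls outside the p10-p90 range, because the
    percentiles say nothing about the shape of the tails.
    """
    values = percentile_values(row, field)
    if x < values[0] or x > values[-1]:
        return None
    points = list(zip(values, LEVELS))
    for (lo_v, lo_p), (hi_v, hi_p) in zip(points, points[1:]):
        if lo_v <= x <= hi_v:
            if hi_v == lo_v:
                # Flat segment: report the upper percentile level.
                return hi_p / 100
            share = (x - lo_v) / (hi_v - lo_v)
            return (lo_p + (hi_p - lo_p) * share) / 100
    return None
```

With a pandas result, a row dict can be obtained with `result.data.iloc[0].to_dict()`.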

Date

```python
from pandas import DataFrame
from futuresearch.ops import forecast

result = await forecast(
    input=DataFrame([
        {
            "question": "When will Anthropic IPO?",
            "resolution_criteria": "Date Anthropic common shares first trade on a public exchange.",
        },
    ]),
    forecast_type="date",
    output_field="ipo_date",
)
print(result.data[["ipo_date_p10", "ipo_date_p50", "ipo_date_p90", "rationale"]])
```

| Column | Type | Description |
| --- | --- | --- |
| {output_field}_p10 … {output_field}_p90 | str | YYYY-MM-DD percentile estimates, or the literal "never" for percentiles in the indefinite future |
| rationale | str | Detailed reasoning with citations |

Schema: engine/services/forecast/data_types.py:63-80.
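Date percentiles arrive as strings, and any of them may be the literal "never", so code that does date arithmetic needs to handle both cases. A minimal sketch, assuming a result row converted to a dict; `parse_percentile` and `interval_days` are hypothetical helpers, not SDK functions.

```python
from datetime import date
from typing import Optional


def parse_percentile(value: str) -> Optional[date]:
    """Parse a YYYY-MM-DD percentile; map the literal "never" to None."""
    if value == "never":
        return None
    return date.fromisoformat(value)


def interval_days(row: dict, field: str, lo: int = 10, hi: int = 90) -> Optional[int]:
    """Days between two percentile dates, or None if either is "never"."""
    start = parse_percentile(row[f"{field}_p{lo}"])
    end = parse_percentile(row[f"{field}_p{hi}"])
    if start is None or end is None:
        return None
    return (end - start).days
```

Returning None for an open-ended interval keeps "never" from being silently coerced into a far-future date.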

Batch context

When all rows share common framing, pass it via context instead of repeating it in every row:

```python
# geopolitics_questions is a DataFrame of binary questions defined elsewhere.
result = await forecast(
    input=geopolitics_questions,
    forecast_type="binary",
    context="Focus on EU regulatory and diplomatic sources. Assume all questions resolve by end of 2027.",
)
```

Leave context empty when rows are self-contained. A well-specified question with resolution criteria needs no additional instruction.

Input columns

The input DataFrame should contain at minimum a question column. All columns are passed to the research agents and forecasters.

| Column | Required | Purpose |
| --- | --- | --- |
| question | Yes | The question to forecast |
| resolution_criteria | Recommended | Exactly how the outcome is determined |
| resolution_date | Optional | When the question closes |
| background | Optional | Additional context the forecasters should know |

Column names are not enforced. Research agents infer meaning from content, so a column named scenario instead of question works fine.
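When the same question applies to many entities, the rows can be generated from templates and then wrapped in a DataFrame. The sketch below is one possible approach; `build_rows` and its parameters are hypothetical, and the column names simply follow the table above.

```python
from typing import Optional


def build_rows(
    entities: list,
    question_template: str,
    criteria_template: str,
    resolution_date: Optional[str] = None,
) -> list:
    """Build one input row per entity from templates containing {entity}."""
    rows = []
    for entity in entities:
        row = {
            "question": question_template.format(entity=entity),
            "resolution_criteria": criteria_template.format(entity=entity),
        }
        # resolution_date is optional, so only add it when provided.
        if resolution_date is not None:
            row["resolution_date"] = resolution_date
        rows.append(row)
    return rows
```

The returned list of dicts can be passed to `DataFrame(rows)` to produce the input for forecast.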

Parameters

| Name | Type | Description |
| --- | --- | --- |
| input | DataFrame | Rows to forecast, one question per row |
| forecast_type | "binary", "numeric", or "date" | Type of forecast to produce |
| effort_level | "LOW", "HIGH", or None | See Effort and cost below. Defaults to None (auto-resolved by row count). |
| context | str or None | Optional batch-level instructions that apply to every row |
| output_field | str or None | Name of the quantity being forecast (required for numeric and date, e.g. "price", "launch_date") |
| units | str or None | Units for the forecast (required for numeric, e.g. "USD per barrel", "billions USD") |
| session | Session | Optional, auto-created if omitted |

The forecast_type enum is defined in engine/services/forecast/data_types.py (ForecastType.BINARY | NUMERIC | DATE); effort_level in the same file (ForecastEffortLevel.LOW | HIGH).
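The requirements in the parameter table (output_field for numeric and date, units for numeric) can be checked locally before submitting a batch. This is a sketch of such a pre-flight check; `check_forecast_args` is a hypothetical helper, and how the SDK itself reports invalid arguments is not covered here.

```python
VALID_TYPES = {"binary", "numeric", "date"}


def check_forecast_args(forecast_type, output_field=None, units=None) -> list:
    """Return a list of problems; an empty list means the arguments are consistent."""
    problems = []
    if forecast_type not in VALID_TYPES:
        problems.append(f"unknown forecast_type: {forecast_type!r}")
        return problems
    # output_field names the quantity for both percentile-based modes.
    if forecast_type in ("numeric", "date") and not output_field:
        problems.append(f"output_field is required for {forecast_type} forecasts")
    # units only applies to numeric forecasts.
    if forecast_type == "numeric" and not units:
        problems.append("units is required for numeric forecasts")
    return problems
```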

Effort and cost

effort_level trades cost for accuracy:

| Effort | Per-row time | Per-row cost |
| --- | --- | --- |
| LOW (default for batches) | ~3 to 5 min | $0.09 to $0.20 |
| HIGH (default for single) | ~5 to 10 min | ~$1.20 |

When effort_level is None, the default resolves from the row count: the engine uses HIGH for row_count <= 1 and LOW otherwise (engine/services/forecast/effort.py:18-27). One-off questions get the most accurate forecast, while large batches stay affordable. See /forecast for worked examples.
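The default rule and the per-row figures above are enough for a rough budget estimate before running a batch. The sketch below mirrors the documented rule (HIGH for one row or fewer, LOW otherwise) and multiplies by the table's per-row costs; `resolve_effort` and `estimate_cost` are hypothetical helpers, and the approximate HIGH cost is treated as a point value.

```python
# Per-row cost ranges in USD, taken from the table above.
# HIGH is documented as approximately $1.20, so both bounds are the same.
COST_PER_ROW = {"LOW": (0.09, 0.20), "HIGH": (1.20, 1.20)}


def resolve_effort(row_count: int, effort_level=None) -> str:
    """Apply the documented default: HIGH for row_count <= 1, LOW otherwise."""
    if effort_level is not None:
        return effort_level
    return "HIGH" if row_count <= 1 else "LOW"


def estimate_cost(row_count: int, effort_level=None) -> tuple:
    """Return (effort, low_estimate, high_estimate) in USD for a batch."""
    effort = resolve_effort(row_count, effort_level)
    low, high = COST_PER_ROW[effort]
    return effort, round(low * row_count, 2), round(high * row_count, 2)
```

Passing effort_level="HIGH" explicitly for a large batch shows how quickly the cost grows relative to the LOW default.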

Via MCP

MCP tool: futuresearch_forecast

| Parameter | Type | Description |
| --- | --- | --- |
| data | list[object] | Inline data as a list of row objects |
| artifact_id | string | Alternatively, an artifact ID from a previous upload |
| forecast_type | "binary", "numeric", or "date" | Type of forecast to produce |
| effort_level | "LOW" or "HIGH" | Optional. Defaults: HIGH for a single question, LOW for multiple. |
| context | string | Optional batch-level context for all questions |
| output_field | string | Name of the quantity (required for numeric and date) |
| units | string | Units (required for numeric) |

Provide either data or artifact_id, not both. See the MCP server reference for the rest of the lifecycle (progress, results, status).
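The data and artifact_id rule can be enforced when assembling the tool arguments. The sketch below builds a plain dict using the parameter names from the table; `build_forecast_arguments` is a hypothetical helper, it assumes exactly one of the two sources must be supplied, and how the dict is sent depends on the MCP client in use.

```python
OPTIONAL_KEYS = ("effort_level", "context", "output_field", "units")


def build_forecast_arguments(forecast_type, data=None, artifact_id=None, **optional) -> dict:
    """Assemble arguments for the futuresearch_forecast tool as a dict."""
    # Assumption: exactly one of data or artifact_id is required.
    if (data is None) == (artifact_id is None):
        raise ValueError("provide exactly one of data or artifact_id")
    unknown = set(optional) - set(OPTIONAL_KEYS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    arguments = {"forecast_type": forecast_type}
    if data is not None:
        arguments["data"] = data
    else:
        arguments["artifact_id"] = artifact_id
    # Drop optional values left as None so the tool's defaults apply.
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return arguments
```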

Related docs

Guides

  • Turn Claude into an Accurate Forecaster: binary, numeric, and date forecasting for any question.
  • Find Profitable Prediction Market Trades: Polymarket and Kalshi screening.
  • Forecast Outcomes for a List of Entities: one outcome per row across a list.
  • Value a Private Company: sum-of-the-parts forecasting.

Case studies

  • Forecast When Anthropic and OpenAI Will IPO: date mode.
  • Forecast Anthropic and OpenAI IPO Valuations: numeric, high effort.
  • Forecast a SpaceX Sum-of-the-Parts Valuation: numeric, multi-segment.
  • Forecast AI Researcher Seed Valuations: numeric across 116 entities.

Long-form research

  • Anthropic and OpenAI IPO Timelines and Valuations
  • A $1.75 Trillion IPO Would Be Overpaying 30% for SpaceX
  • Which AI Researchers Have the Most Valuable Skills?
  • Forecasting Polymarket Questions with AI
  • Strategic Reasoning paper (arXiv:2604.26106)