FutureSearch docs

API Reference

Six operations for processing data with LLM-powered web research agents. Most take a DataFrame and a natural-language instruction; forecast takes a DataFrame of questions, and single_agent runs on a single input or none.

rank

result = await rank(task=..., input=df, field_name="score")

rank takes a DataFrame and a natural-language scoring criterion, dispatches web research agents to compute a score for each row, and returns the DataFrame sorted by that score. The sort key does not need to exist in your data. Agents derive it at runtime by searching the web, reading pages, and reasoning over what they find.

Full reference →
Guides: Sort a Dataset Using Web Data
Case Studies: Score Leads from Fragmented Data, Score Leads Without CRM History
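To make the contract described for rank concrete, here is a minimal local sketch of the output shape: every row gains a score under field_name, and the rows come back sorted by it. It does not call the FutureSearch SDK. `rank_locally`, `score_row`, and `FAKE_RESEARCH` are hypothetical stand-ins, rows are plain dicts rather than a DataFrame, the hard-coded scores replace what a web research agent would derive at runtime, and the descending sort order is an assumption.

```python
import asyncio

# Hypothetical stand-in for web research: the real operation derives each
# score at runtime; here it is a fixed lookup table.
FAKE_RESEARCH = {"Acme": 12.0, "Globex": 87.5, "Initech": 40.0}


async def score_row(row: dict) -> float:
    """Pretend to research one row and return its score."""
    await asyncio.sleep(0)  # placeholder for network-bound agent work
    return FAKE_RESEARCH[row["company"]]


async def rank_locally(rows: list, field_name: str) -> list:
    """Attach a score to every row, then sort by it (descending is assumed)."""
    scores = await asyncio.gather(*(score_row(r) for r in rows))
    scored = [{**row, field_name: s} for row, s in zip(rows, scores)]
    return sorted(scored, key=lambda r: r[field_name], reverse=True)


rows = [{"company": "Acme"}, {"company": "Globex"}, {"company": "Initech"}]
ranked = asyncio.run(rank_locally(rows, field_name="score"))
for r in ranked:
    print(r)
```

Note that the input rows never contained a "score" key; the sort key only exists after the (here simulated) research step.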

dedupe

result = await dedupe(input=df, equivalence_relation="...")

dedupe groups duplicate rows in a DataFrame based on a natural-language equivalence relation, assigns cluster IDs, and selects a canonical row per cluster. The duplicate criterion is semantic and LLM-powered: agents reason over the data and, when needed, search the web for external information to establish equivalence. This handles abbreviations, name variations, job changes, and entity relationships that no string similarity threshold can capture.

Full reference →
Guides: Remove Duplicates from ML Training Data, Resolve Duplicate Entities
Case Studies: Dedupe CRM Company Records
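The output shape described for dedupe (a cluster ID per row plus one canonical row per cluster) can be sketched locally as follows. This is an illustration under stated assumptions, not the SDK's implementation: `same_entity` is a hypothetical stand-in for the LLM judgement, the `cluster_id` and `is_canonical` column names are invented for the example, and picking the first row of each cluster as canonical is an arbitrary choice.

```python
from itertools import combinations

rows = [
    {"name": "International Business Machines"},
    {"name": "IBM"},
    {"name": "Globex Corp"},
    {"name": "I.B.M."},
]

# Hypothetical stand-in for the semantic check an agent would perform.
ALIASES = {"international business machines": "ibm", "i.b.m.": "ibm"}


def same_entity(a: dict, b: dict) -> bool:
    def norm(row):
        key = row["name"].lower()
        return ALIASES.get(key, key)
    return norm(a) == norm(b)


def dedupe_locally(rows, equivalent):
    """Group rows with union-find; the lowest row index is each cluster's root."""
    parent = list(range(len(rows)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(rows)), 2):
        if equivalent(rows[i], rows[j]):
            ri, rj = find(i), find(j)
            parent[max(ri, rj)] = min(ri, rj)

    ids = {}  # root index -> compact cluster id, in order of first appearance
    out = []
    for i, row in enumerate(rows):
        root = find(i)
        cid = ids.setdefault(root, len(ids))
        # Assumed column names; the first row of a cluster is taken as canonical.
        out.append({**row, "cluster_id": cid, "is_canonical": root == i})
    return out


deduped = dedupe_locally(rows, same_entity)
for r in deduped:
    print(r)
```

A string-similarity threshold would struggle to link "IBM" to "International Business Machines"; in this sketch the alias table plays the role of the knowledge an agent would bring.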

merge

result = await merge(task=..., left_table=df1, right_table=df2)

merge left-joins two DataFrames using LLM-powered agents to resolve the key mapping instead of requiring exact or fuzzy key matches. Agents resolve semantic relationships by reasoning over the data and, when needed, searching the web for external information to establish matches: subsidiaries, regional names, abbreviations, and product-to-parent-company mappings.

Full reference →
Guides: Fuzzy Join Without Matching Keys
Case Studies: LLM Merging at Scale, Match Software Vendors to Requirements
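The left-join semantics described for merge can be illustrated with a small local sketch in which a resolver function supplies the key mapping. Nothing here calls the SDK: `semantic_left_join` and `resolve` are hypothetical, tables are lists of dicts rather than DataFrames, and the `RESOLVED` table stands in for matches an agent would establish by reasoning or web search.

```python
left = [
    {"product": "Instagram"},
    {"product": "YouTube"},
    {"product": "Obscure Widget"},
]
right = [
    {"parent": "Meta Platforms", "ticker": "META"},
    {"parent": "Alphabet", "ticker": "GOOGL"},
]

# Hypothetical stand-in for agent output: product -> parent company.
RESOLVED = {"Instagram": "Meta Platforms", "YouTube": "Alphabet"}


def resolve(left_value):
    """Return the matching right-hand key, or None when no match is found."""
    return RESOLVED.get(left_value)


def semantic_left_join(left, right, left_key, right_key, resolve):
    """Keep every left row; fill right-hand columns with None when unmatched."""
    by_key = {row[right_key]: row for row in right}
    right_columns = list(right[0].keys()) if right else []
    joined = []
    for row in left:
        match = by_key.get(resolve(row[left_key]))
        filler = match if match is not None else dict.fromkeys(right_columns)
        joined.append({**row, **filler})
    return joined


merged = semantic_left_join(left, right, "product", "parent", resolve)
for r in merged:
    print(r)
```

Because it is a left join, the unmatched "Obscure Widget" row is kept with empty right-hand columns rather than dropped.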

classify

result = await classify(
    task="Classify each company by its primary industry sector",
    categories=["Technology", "Finance", "Healthcare", "Energy"],
    input=companies_df,
)

classify assigns each row in a DataFrame to one of the provided categories, using web research that scales with the difficulty of the classification. It supports binary (yes/no) and multi-category classification, with optional reasoning output. Screening (pass/fail filtering) is a special case: use categories=["yes", "no"].

Full reference →
Guides: Filter a DataFrame with LLMs
Case Studies: Screen 10,000 Rows, Screen Stocks by Investment Thesis
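The point that screening is just classification with categories=["yes", "no"] can be sketched locally: label every row, then keep the rows labeled "yes". This is a hedged illustration that does not call the SDK; `classify_locally` and `is_remote` are hypothetical, the `label` column name is an assumption, and the keyword rule replaces the research an agent would do.

```python
listings = [
    {"title": "Data Engineer", "location": "Remote (EU)"},
    {"title": "ML Researcher", "location": "London office"},
    {"title": "Backend Developer", "location": "Fully remote"},
]


def is_remote(row: dict) -> str:
    """Hypothetical stand-in for an agent's yes/no judgement."""
    return "yes" if "remote" in row["location"].lower() else "no"


def classify_locally(rows, categories, label_row):
    """Assign exactly one of the allowed categories to each row."""
    labeled = []
    for row in rows:
        label = label_row(row)
        if label not in categories:
            raise ValueError(f"label {label!r} is not one of {categories}")
        labeled.append({**row, "label": label})  # 'label' is an assumed column name
    return labeled


labeled = classify_locally(listings, ["yes", "no"], is_remote)
# Screening: keep only the rows that passed.
passed = [row for row in labeled if row["label"] == "yes"]
for r in passed:
    print(r["title"])
```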

forecast

result = await forecast(input=questions_df)

forecast takes a DataFrame of questions and produces a calibrated forecast and rationale for each row. It supports three question types: binary (a probability), numeric (a percentile distribution), and date (percentile dates). Accuracy is measured on the public BTF-2 leaderboard; the dataset is on Hugging Face.

Full reference →
Blog posts: Automating Forecasting Questions, arXiv paper
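For numeric questions, a percentile distribution can be turned into a probability for a threshold of interest. The sketch below shows one simple way to do that, linear interpolation between the given percentile points. It is an assumption-laden illustration: the `{percentile: value}` dict is a hypothetical representation (the actual output columns are not specified here), `prob_at_or_below` is an invented helper, and thresholds outside the covered range are clamped to the nearest known percentile rather than extrapolated.

```python
from bisect import bisect_right


def prob_at_or_below(percentiles: dict, threshold: float) -> float:
    """Estimate P(value <= threshold) by interpolating between percentile points."""
    points = sorted((value, pct / 100) for pct, value in percentiles.items())
    values = [v for v, _ in points]
    if threshold <= values[0]:
        return points[0][1]   # clamp: no information below the lowest percentile
    if threshold >= values[-1]:
        return points[-1][1]  # clamp: no information above the highest percentile
    i = bisect_right(values, threshold)
    (v0, p0), (v1, p1) = points[i - 1], points[i]
    return p0 + (p1 - p0) * (threshold - v0) / (v1 - v0)


# Hypothetical forecast: 10th, 50th, and 90th percentiles of some quantity.
forecast_percentiles = {10: 100.0, 50: 200.0, 90: 400.0}
p = prob_at_or_below(forecast_percentiles, 300.0)
print(round(p, 3))
```

Here 300 lies halfway between the 50th and 90th percentile values, so the estimate lands halfway between 0.5 and 0.9.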

agent_map / single_agent

result = await agent_map(task=..., input=df)

single_agent runs one web research agent on a single input (or no input). agent_map runs an agent on every row of a DataFrame in parallel. Both dispatch agents that search the web, read pages, and return structured results. The transform is live web research: agents fetch and synthesize external information to populate new columns.

Full reference →
Guides: Add a Column with Web Lookup, Classify and Label Data with an LLM
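The difference between the two calls (one agent on a single optional input versus one agent per row, run in parallel) can be sketched with asyncio. This is a local illustration only: `fake_agent` and `map_rows` are hypothetical stand-ins, the `answer` column and the `max_concurrency` parameter are assumptions made for the example, and the sleep replaces real web research.

```python
import asyncio
from typing import Optional


async def fake_agent(task: str, row: Optional[dict] = None) -> dict:
    """Hypothetical agent: returns a structured result for one input (or none)."""
    await asyncio.sleep(0.01)  # placeholder for searching and reading pages
    subject = row["company"] if row else "(no input)"
    return {"answer": f"{task} -> {subject}"}


async def map_rows(task: str, rows: list, max_concurrency: int = 2) -> list:
    """Run one agent per row concurrently and merge results into new columns."""
    gate = asyncio.Semaphore(max_concurrency)  # cap on simultaneous agents (assumed)

    async def run(row: dict) -> dict:
        async with gate:
            result = await fake_agent(task, row)
        return {**row, **result}

    # gather preserves input order, so output rows line up with input rows.
    return await asyncio.gather(*(run(r) for r in rows))


async def main():
    one = await fake_agent("Find the CEO")  # single-agent style call, no input
    many = await map_rows("Find the CEO", [{"company": "Acme"}, {"company": "Globex"}])
    return one, many


one, many = asyncio.run(main())
print(one)
for r in many:
    print(r)
```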