by futuresearch

Find Startups Selling to Frontier AI Labs

A single multi_agent run with return_list turns one prompt into a table. Here it builds a market map of the companies selling data, benchmarks, evaluation sets, and RL environments to frontier AI labs. Six research agents each take a different slice of the market; their findings are then merged and de-duplicated into one list of 52 companies across 8 categories.

| Metric | Value |
| --- | --- |
| Companies found | 52 |
| Categories | 8 |
| Effort level | high (6 agents) |
| Total cost | $1.07 |
| Time | ~5.5 minutes |

Add FutureSearch to Claude Code if you haven't already:

claude mcp add futuresearch --scope project --transport http https://mcp.futuresearch.ai/mcp

Then ask Claude:

Find startups that sell training data, benchmarks, evaluation sets, or RL
environments to frontier AI labs like OpenAI, Anthropic, and Google DeepMind.
For each company, capture the category, a product description, known customers,
funding stage, and founding year.

A single agent would miss most of the market. Claude calls FutureSearch's multi_agent tool with return_table set, and points six agents at six different slices so coverage is wide before the results are merged:

Tool: futuresearch_multi_agent
├─ task: "Find startups selling data, benchmarks, eval sets, or RL
│         environments to frontier AI labs..."
├─ effort_level: "high"
├─ return_table: true
├─ directions: [
│    "Human-labeled training data, RLHF preference data, annotation",
│    "AI benchmarks, evaluation frameworks, and eval datasets",
│    "Synthetic data and domain datasets (math, code, science, law)",
│    "Reinforcement learning environments and agent training sims",
│    "Newer niche startups (founded 2021-2025) in the data supply chain",
│    "Well-known, recently funded startups in training data and evals" ]
└─ response_schema: {company_name, category, product_description,
                     known_customers, funding_stage, founded_year}

→ Submitted: multi-agent research starting.
  Session: https://futuresearch.ai/sessions/8ef8345a-3918-4b58-beaf-16e73291e864
  Task ID: 831c...

Tool: futuresearch_progress
→ Running: 6 agents researching (60s elapsed)

...

Tool: futuresearch_progress
→ Completed in 338s. Synthesized 52 rows.

Tool: futuresearch_results
├─ task_id: "831c..."
├─ output_path: "/Users/you/ai_data_startups.csv"
→ Saved 52 rows to /Users/you/ai_data_startups.csv

Add the FutureSearch connector if you haven't already. Then ask Claude:

Find startups that sell training data, benchmarks, evaluation sets, or RL environments to frontier AI labs like OpenAI, Anthropic, and Google DeepMind. Return a table with the category, a product description, known customers, funding stage, and founding year for each company.

Results take about 5 to 6 minutes.

Go to futuresearch.ai/app and enter:

Find startups that sell training data, benchmarks, evaluation sets, or RL environments to frontier AI labs like OpenAI, Anthropic, and Google DeepMind. Return a table with the category, a product description, known customers, funding stage, and founding year for each company.

return_list=True makes the run emit one row per company found. The response_schema describes a single company; the worker wraps it in a list automatically.

pip install futuresearch
export FUTURESEARCH_API_KEY=your_key_here  # Get one at futuresearch.ai/app/api-key

import asyncio
import pandas as pd
from futuresearch import create_session
from futuresearch.ops import multi_agent

SCHEMA = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "category": {"type": "string", "description": "e.g. Training Data, "
                     "Benchmarks/Evals, RL Environments, Red Teaming"},
        "product_description": {"type": "string"},
        "known_customers": {"type": "string"},
        "funding_stage": {"type": "string"},
        "founded_year": {"type": "string"},
    },
    "required": ["company_name", "category", "product_description",
                 "known_customers", "funding_stage", "founded_year"],
}

async def main():
    async with create_session(name="AI data-supply startups") as session:
        result = await multi_agent(
            session=session,
            task=(
                "Find startups that sell training data, benchmarks, evaluation "
                "sets, or RL environments to frontier AI labs such as OpenAI, "
                "Anthropic, Google DeepMind, Meta AI, xAI, and Mistral."
            ),
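            input=pd.DataFrame(),    # empty input: the task alone defines the search
            effort_level="high",     # the run described here used 6 agents at this level
            return_list=True,        # one row per company found
            response_schema=SCHEMA,  # shape of a single row
        )
        return result.data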

results = asyncio.run(main())
print(f"{len(results)} companies")
print(results["category"].value_counts())
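
The note before the SDK example says the response_schema describes one company and the worker wraps it in a list. In JSON Schema terms, that wrapping could look like the sketch below. It is an illustration only: the helper name wrap_in_list and the exact envelope are assumptions, not FutureSearch's internal code, and the schema is a trimmed copy of the one above.

```python
import json

# Trimmed copy of the single-company schema from the SDK example.
ITEM_SCHEMA = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "category": {"type": "string"},
    },
    "required": ["company_name", "category"],
}


def wrap_in_list(item_schema: dict) -> dict:
    """Hypothetical helper: turn a one-item schema into a list-of-items schema."""
    return {"type": "array", "items": item_schema}


LIST_SCHEMA = wrap_in_list(ITEM_SCHEMA)
print(json.dumps(LIST_SCHEMA, indent=2))
```

The practical point is that you only ever write the schema for one row; the list shape comes from return_list=True.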

Results

The run returned 52 companies, spread across 8 categories of the AI data supply chain:

| Category | Companies |
| --- | --- |
| Training Data | 13 |
| Benchmarks/Evals | 13 |
| RL Environments | 8 |
| Red Teaming | 5 |
| RLHF/Preference Data | 4 |
| Annotation/Labeling | 4 |
| Domain Data | 3 |
| Synthetic Data | 2 |
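
As a quick arithmetic check, the category counts above can be tallied against the headline figures (52 companies, 8 categories). The sketch below hard-codes the counts from the table rather than reading the real output file, so it only confirms that the published numbers are consistent with each other.

```python
# Category counts copied from the table above.
category_counts = {
    "Training Data": 13,
    "Benchmarks/Evals": 13,
    "RL Environments": 8,
    "Red Teaming": 5,
    "RLHF/Preference Data": 4,
    "Annotation/Labeling": 4,
    "Domain Data": 3,
    "Synthetic Data": 2,
}

total = sum(category_counts.values())
print(f"{len(category_counts)} categories, {total} companies")

# Share of the list held by each category, largest first.
for name, count in sorted(category_counts.items(), key=lambda kv: -kv[1]):
    print(f"{name:<22} {count:>2}  {count / total:.0%}")
```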

A sample of the rows:

| Company | Category | Funding stage | Founded |
| --- | --- | --- | --- |
| Scale AI | RLHF/Preference Data | Valued ~$14B (2025) | 2016 |
| Surge AI | RLHF/Preference Data | Bootstrapped to ~$1.2B revenue (2024) | 2020 |
| Mercor | RLHF/Preference Data | Series C, $350M at $10B valuation | 2023 |
| Snorkel AI | Training Data | Series D, $100M at $1.3B valuation | 2019 |
| Turing | Training Data | Series E, $111M at $2.2B valuation | 2018 |
| LMArena | Benchmarks/Evals | $150M at $1.7B valuation | 2025 |
| Patronus AI | Benchmarks/Evals | Series A, $17M | 2023 |
| METR | Benchmarks/Evals | Nonprofit | 2023 |
| Prime Intellect | RL Environments | $15M, led by Founders Fund | 2023 |
| Mechanize | RL Environments | $9.1M at $500-750M valuation | 2025 |
| Haize Labs | Red Teaming | Seed, $12.5M | 2023 |
| Gretel.ai | Synthetic Data | $68M+ raised | 2017 |
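
Because the schema in the SDK example declares founded_year as a string, sorting by year needs a conversion step first. The sketch below shows one way to do that on a few rows copied from the sample table. The parse_year helper is hypothetical, and it assumes the field holds a plain four-digit year; real output may contain other text, which this sketch simply skips.

```python
from typing import Optional

# A few rows copied from the sample table above.
rows = [
    {"company_name": "Scale AI", "founded_year": "2016"},
    {"company_name": "Mercor", "founded_year": "2023"},
    {"company_name": "LMArena", "founded_year": "2025"},
    {"company_name": "Gretel.ai", "founded_year": "2017"},
    {"company_name": "Mechanize", "founded_year": "2025"},
]


def parse_year(value: str) -> Optional[int]:
    """Return the year as an int, or None when the field is not a plain year."""
    text = value.strip()
    return int(text) if len(text) == 4 and text.isdigit() else None


# Keep only rows with a parseable year, newest first.
newest_first = sorted(
    (r for r in rows if parse_year(r["founded_year"]) is not None),
    key=lambda r: parse_year(r["founded_year"]),
    reverse=True,
)
for r in newest_first:
    print(r["founded_year"], r["company_name"])
```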

What makes the output usable is the spread. The list runs from Scale AI at a ~$14B valuation down to YC-batch seed companies founded in 2025 and 2026, and the founding years span 2008 to 2026. That breadth is the direct result of pointing six agents at six different slices: the agent on "newer niche startups" surfaces YC-stage companies that the agent on "well-known, recently funded startups" would never reach, and the synthesis step merges and de-duplicates the two lists. Every row also carries a known_customers field. Scale AI's, for instance, lists OpenAI, Anthropic, Google DeepMind, Meta AI, Microsoft, and the U.S. Department of Defense, each traceable to the sources the agents read.
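
To make the merge-and-de-duplicate step concrete, here is a minimal sketch of combining two agents' lists by a normalized company name. It is not FutureSearch's synthesis step, which this page does not describe in detail; the function names, the matching rule, and the per-agent rows are all assumptions for illustration. Only the company names and categories are taken from the sample table.

```python
def normalize(name: str) -> str:
    """Crude matching key: lowercase, keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def merge_agent_lists(*agent_lists):
    """Merge per-agent rows, keeping the first row seen for each company."""
    merged = {}
    for rows in agent_lists:
        for row in rows:
            merged.setdefault(normalize(row["company_name"]), row)
    return list(merged.values())


# Hypothetical per-agent output, with one overlap spelled differently.
niche_agent = [
    {"company_name": "Haize Labs", "category": "Red Teaming"},
    {"company_name": "Mechanize", "category": "RL Environments"},
]
well_known_agent = [
    {"company_name": "Scale AI", "category": "RLHF/Preference Data"},
    {"company_name": "mechanize", "category": "RL Environments"},
]

combined = merge_agent_lists(niche_agent, well_known_agent)
print([row["company_name"] for row in combined])
```

Exact-key matching like this only catches trivial spelling differences. Names that differ more substantially (for example a legal name versus a brand name) would need fuzzier matching.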

Going deeper

  • Reference: multi_agent, including return_list and directions
  • Guide: Research a question with a team of agents