Find Startups Selling to Frontier AI Labs

A single multi_agent run with return_list turns one prompt into a table. Here it builds a market map of the companies selling data, benchmarks, evaluation sets, and RL environments to frontier AI labs. Six research agents each take a different slice of the market; their findings are then merged and de-duplicated into one list of 52 companies across 8 categories.

Metric	Value
Companies found	52
Categories	8
Effort level	high (6 agents)
Total cost	$1.07
Time	~5.5 minutes

Add FutureSearch to Claude Code if you haven't already:

claude mcp add futuresearch --scope project --transport http https://mcp.futuresearch.ai/mcp

Then ask Claude:

Find startups that sell training data, benchmarks, evaluation sets, or RL
environments to frontier AI labs like OpenAI, Anthropic, and Google DeepMind.
For each company, capture the category, a product description, known customers,
funding stage, and founding year.

A single agent would miss most of the market. Claude calls FutureSearch's multi_agent tool with return_table set, and points six agents at six different slices so coverage is wide before the results are merged:

Tool: futuresearch_multi_agent
├─ task: "Find startups selling data, benchmarks, eval sets, or RL
│         environments to frontier AI labs..."
├─ effort_level: "high"
├─ return_table: true
├─ directions: [
│    "Human-labeled training data, RLHF preference data, annotation",
│    "AI benchmarks, evaluation frameworks, and eval datasets",
│    "Synthetic data and domain datasets (math, code, science, law)",
│    "Reinforcement learning environments and agent training sims",
│    "Newer niche startups (founded 2021-2025) in the data supply chain",
│    "Well-known, recently funded startups in training data and evals" ]
└─ response_schema: {company_name, category, product_description,
                     known_customers, funding_stage, founded_year}

→ Submitted: multi-agent research starting.
  Session: https://futuresearch.ai/sessions/8ef8345a-3918-4b58-beaf-16e73291e864
  Task ID: 831c...

Tool: futuresearch_progress
→ Running: 6 agents researching (60s elapsed)

...

Tool: futuresearch_progress
→ Completed in 338s. Synthesized 52 rows.

Tool: futuresearch_results
├─ task_id: "831c..."
├─ output_path: "/Users/you/ai_data_startups.csv"
→ Saved 52 rows to /Users/you/ai_data_startups.csv

Add the FutureSearch connector if you haven't already. Then ask Claude:

Find startups that sell training data, benchmarks, evaluation sets, or RL environments to frontier AI labs like OpenAI, Anthropic, and Google DeepMind. Return a table with the category, a product description, known customers, funding stage, and founding year for each company.

Results take about 5 to 6 minutes.

Go to futuresearch.ai/app and enter:

Find startups that sell training data, benchmarks, evaluation sets, or RL environments to frontier AI labs like OpenAI, Anthropic, and Google DeepMind. Return a table with the category, a product description, known customers, funding stage, and founding year for each company.

From Python, return_list=True makes the run emit one row per company found. The response_schema describes a single company; the worker wraps it in a list automatically.

pip install futuresearch
export FUTURESEARCH_API_KEY=your_key_here  # Get one at futuresearch.ai/app/api-key

import asyncio
import pandas as pd
from futuresearch import create_session
from futuresearch.ops import multi_agent

SCHEMA = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "category": {"type": "string", "description": "e.g. Training Data, "
                     "Benchmarks/Evals, RL Environments, Red Teaming"},
        "product_description": {"type": "string"},
        "known_customers": {"type": "string"},
        "funding_stage": {"type": "string"},
        "founded_year": {"type": "string"},
    },
    "required": ["company_name", "category", "product_description",
                 "known_customers", "funding_stage", "founded_year"],
}

async def main():
    async with create_session(name="AI data-supply startups") as session:
        result = await multi_agent(
            session=session,
            task=(
                "Find startups that sell training data, benchmarks, evaluation "
                "sets, or RL environments to frontier AI labs such as OpenAI, "
                "Anthropic, Google DeepMind, Meta AI, xAI, and Mistral."
            ),
            input=pd.DataFrame(),
            effort_level="high",
            return_list=True,
            response_schema=SCHEMA,
        )
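        # One row per company; treated as a pandas DataFrame below.
        return result.data

results = asyncio.run(main())
print(f"{len(results)} companies")
# Count how many companies landed in each category.
print(results["category"].value_counts())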
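Because the schema types every field as a string, including founded_year, it can help to sanity-check rows before analyzing them. The sketch below is a minimal standard-library illustration. The helper names missing_fields and parse_year are hypothetical and not part of the futuresearch package, and the example row is made up for illustration rather than taken from the run.

```python
import re
from typing import Optional

# Field names copied from the "required" list in SCHEMA above.
REQUIRED = [
    "company_name", "category", "product_description",
    "known_customers", "funding_stage", "founded_year",
]

def missing_fields(row: dict) -> list:
    """Return the required fields that are absent or blank in one row."""
    return [f for f in REQUIRED if not str(row.get(f) or "").strip()]

def parse_year(value: str) -> Optional[int]:
    """Pull the first four-digit year (19xx or 20xx) out of free text."""
    match = re.search(r"\b(?:19|20)\d{2}\b", value)
    return int(match.group(0)) if match else None

# Hypothetical row, for illustration only.
row = {
    "company_name": "Example Co",
    "category": "Training Data",
    "product_description": "",
    "known_customers": "Unknown",
    "funding_stage": "Seed",
    "founded_year": "Founded in 2023",
}

print(missing_fields(row))              # fields that came back blank
print(parse_year(row["founded_year"]))  # integer year, or None if absent
```

Keeping founded_year as a string in the schema lets agents return hedged values, and a helper like parse_year recovers a sortable integer afterwards.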

Results

The run returned 52 companies, spread across 8 categories of the AI data supply chain:

Category	Companies
Training Data	13
Benchmarks/Evals	13
RL Environments	8
Red Teaming	5
RLHF/Preference Data	4
Annotation/Labeling	4
Domain Data	3
Synthetic Data	2
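A table like the one above can be rebuilt from the saved CSV by counting the category column. The sketch below assumes the CSV header uses the schema's field names (company_name, category, and so on), which this page does not show explicitly. The inline sample reuses a few company and category pairs from the sample rows on this page, not the full 52-row file, and category_counts is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

def category_counts(csv_text: str) -> list:
    """Count rows per category, most frequent first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row["category"].strip() for row in reader)
    return counts.most_common()

# Assumed header; rows taken from this page's sample table.
SAMPLE_CSV = """company_name,category
Scale AI,RLHF/Preference Data
Surge AI,RLHF/Preference Data
Snorkel AI,Training Data
Patronus AI,Benchmarks/Evals
Prime Intellect,RL Environments
Mechanize,RL Environments
Haize Labs,Red Teaming
"""

for category, count in category_counts(SAMPLE_CSV):
    print(f"{category}\t{count}")
```

To run it on the real file, pass in the contents of the CSV written by the results step, for example open(path, newline="", encoding="utf-8").read().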

A sample of the rows:

Company	Category	Funding stage	Founded
Scale AI	RLHF/Preference Data	Valued ~$14B (2025)	2016
Surge AI	RLHF/Preference Data	Bootstrapped to ~$1.2B revenue (2024)	2020
Mercor	RLHF/Preference Data	Series C, $350M at $10B valuation	2023
Snorkel AI	Training Data	Series D, $100M at $1.3B valuation	2019
Turing	Training Data	Series E, $111M at $2.2B valuation	2018
LMArena	Benchmarks/Evals	$150M at $1.7B valuation	2025
Patronus AI	Benchmarks/Evals	Series A, $17M	2023
METR	Benchmarks/Evals	Nonprofit	2023
Prime Intellect	RL Environments	$15M, led by Founders Fund	2023
Mechanize	RL Environments	$9.1M at $500-750M valuation	2025
Haize Labs	Red Teaming	Seed, $12.5M	2023
Gretel.ai	Synthetic Data	$68M+ raised	2017

What makes the output usable is the spread. The list runs from Scale AI at a ~$14B valuation down to YC-batch seed companies founded in 2025 and 2026, and the founding years span 2008 to 2026. That breadth is the direct result of pointing six agents at six different slices: the agent on "newer niche startups" surfaces YC-stage companies that the agent on "well-known, recently funded startups" would never reach, and the synthesis step merges and de-duplicates the two lists. Every row also carries a known_customers field. Scale AI's row, for instance, lists OpenAI, Anthropic, Google DeepMind, Meta AI, Microsoft, and the U.S. Department of Defense, each traceable to the sources the agents read.
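This page does not describe how the synthesis step works internally. As a rough mental model only, merging and de-duplicating per-agent lists can be sketched as keying rows on a normalized company name and filling blank fields from later duplicates. The functions normalize and merge_findings are hypothetical, and the rows are illustrative inputs with some fields deliberately blanked.

```python
import re

def normalize(name: str) -> str:
    """Lowercase a company name and drop everything but letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())

def merge_findings(agent_lists: list) -> list:
    """Merge per-agent row lists, keeping one row per normalized name.

    The first row seen for a company wins; later duplicates only fill in
    fields that are still missing or blank.
    """
    merged = {}
    for rows in agent_lists:
        for row in rows:
            key = normalize(row["company_name"])
            if key not in merged:
                merged[key] = dict(row)
                continue
            for field, value in row.items():
                if not merged[key].get(field):
                    merged[key][field] = value
    return list(merged.values())

# Illustrative inputs: two agents found overlapping companies.
well_known = [
    {"company_name": "Scale AI", "category": "RLHF/Preference Data",
     "founded_year": ""},
]
niche = [
    {"company_name": "scale ai", "category": "", "founded_year": "2016"},
    {"company_name": "Mechanize", "category": "RL Environments",
     "founded_year": "2025"},
]

for row in merge_findings([well_known, niche]):
    print(row)
```

Exact-match keys like this miss renamed or abbreviated companies, which is one reason a real synthesis step would likely need fuzzier matching than this sketch shows.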

Going deeper

Reference: multi_agent, including return_list and directions
Guide: Research a question with a team of agents
