Find Startups Selling to Frontier AI Labs
A single multi_agent run with return_list turns one prompt into a table. Here it builds a market map of the companies selling data, benchmarks, evaluation sets, and RL environments to frontier AI labs. Six research agents each take a different slice of the market, then their findings are merged and de-duplicated into one list: 52 companies across 8 categories.
| Metric | Value |
|---|---|
| Companies found | 52 |
| Categories | 8 |
| Effort level | high (6 agents) |
| Total cost | $1.07 |
| Time | ~5.5 minutes |
Add FutureSearch to Claude Code if you haven't already:
claude mcp add futuresearch --scope project --transport http https://mcp.futuresearch.ai/mcp
Then ask Claude:
Find startups that sell training data, benchmarks, evaluation sets, or RL
environments to frontier AI labs like OpenAI, Anthropic, and Google DeepMind.
For each company, capture the category, a product description, known customers,
funding stage, and founding year.
A single agent would miss most of the market. Claude calls FutureSearch's multi_agent tool with return_table set, and points six agents at six different slices so coverage is wide before the results are merged:
Tool: futuresearch_multi_agent
├─ task: "Find startups selling data, benchmarks, eval sets, or RL
│         environments to frontier AI labs..."
├─ effort_level: "high"
├─ return_table: true
├─ directions: [
│    "Human-labeled training data, RLHF preference data, annotation",
│    "AI benchmarks, evaluation frameworks, and eval datasets",
│    "Synthetic data and domain datasets (math, code, science, law)",
│    "Reinforcement learning environments and agent training sims",
│    "Newer niche startups (founded 2021-2025) in the data supply chain",
│    "Well-known, recently funded startups in training data and evals" ]
└─ response_schema: {company_name, category, product_description,
                     known_customers, funding_stage, founded_year}
→ Submitted: multi-agent research starting.
Session: https://futuresearch.ai/sessions/8ef8345a-3918-4b58-beaf-16e73291e864
Task ID: 831c...
Tool: futuresearch_progress
→ Running: 6 agents researching (60s elapsed)
...
Tool: futuresearch_progress
→ Completed in 338s. Synthesized 52 rows.
Tool: futuresearch_results
├─ task_id: "831c..."
├─ output_path: "/Users/you/ai_data_startups.csv"
→ Saved 52 rows to /Users/you/ai_data_startups.csv
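Once the CSV is on disk, it can be sliced with the standard library. The sketch below is a minimal example that assumes the CSV columns use the same names as the response_schema fields (for example `category` and `company_name`); the `rows_in_category` helper is hypothetical, and the demo runs against a tiny stand-in file rather than the real output.

```python
import csv
import os
import tempfile


def rows_in_category(path, category):
    """Return rows from a saved CSV whose category matches exactly.

    Assumes a header row containing a `category` column.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return [row for row in csv.DictReader(f) if row["category"] == category]


# Stand-in file for the demo; point `path` at your own output_path instead.
sample = (
    "company_name,category,founded_year\n"
    "Haize Labs,Red Teaming,2023\n"
    "Mechanize,RL Environments,2025\n"
    "Prime Intellect,RL Environments,2023\n"
)
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(sample)
    path = tmp.name

rl_rows = rows_in_category(path, "RL Environments")
print([row["company_name"] for row in rl_rows])

os.remove(path)  # clean up the stand-in file
```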
Add the FutureSearch connector if you haven't already. Then ask Claude:
Find startups that sell training data, benchmarks, evaluation sets, or RL environments to frontier AI labs like OpenAI, Anthropic, and Google DeepMind. Return a table with the category, a product description, known customers, funding stage, and founding year for each company.
Results take about 5 to 6 minutes.
Go to futuresearch.ai/app and enter:
Find startups that sell training data, benchmarks, evaluation sets, or RL environments to frontier AI labs like OpenAI, Anthropic, and Google DeepMind. Return a table with the category, a product description, known customers, funding stage, and founding year for each company.
return_list=True makes the run emit one row per company found. The response_schema describes a single company; the worker wraps it in a list automatically.
pip install futuresearch
export FUTURESEARCH_API_KEY=your_key_here # Get one at futuresearch.ai/app/api-key
import asyncio

import pandas as pd
from futuresearch import create_session
from futuresearch.ops import multi_agent

SCHEMA = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "category": {"type": "string", "description": "e.g. Training Data, "
                     "Benchmarks/Evals, RL Environments, Red Teaming"},
        "product_description": {"type": "string"},
        "known_customers": {"type": "string"},
        "funding_stage": {"type": "string"},
        "founded_year": {"type": "string"},
    },
    "required": ["company_name", "category", "product_description",
                 "known_customers", "funding_stage", "founded_year"],
}


async def main():
    async with create_session(name="AI data-supply startups") as session:
        result = await multi_agent(
            session=session,
            task=(
                "Find startups that sell training data, benchmarks, evaluation "
                "sets, or RL environments to frontier AI labs such as OpenAI, "
                "Anthropic, Google DeepMind, Meta AI, xAI, and Mistral."
            ),
            input=pd.DataFrame(),
            effort_level="high",
            return_list=True,
            response_schema=SCHEMA,
        )
        return result.data


results = asyncio.run(main())
print(f"{len(results)} companies")
print(results["category"].value_counts())
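In JSON Schema terms, "wrapping a single-company schema in a list" can be pictured as an array schema whose items are the per-company object. The sketch below is an illustration only: `wrap_in_list` and `missing_fields` are hypothetical helpers, not part of the futuresearch package, and the service's actual wrapping may differ.

```python
# Hypothetical helpers for illustration; not part of the futuresearch package.
REQUIRED = ["company_name", "category", "product_description",
            "known_customers", "funding_stage", "founded_year"]

# Single-company schema with the same field names as SCHEMA above.
ITEM_SCHEMA = {
    "type": "object",
    "properties": {name: {"type": "string"} for name in REQUIRED},
    "required": REQUIRED,
}


def wrap_in_list(item_schema):
    """Return a JSON Schema describing a list of `item_schema` objects."""
    return {"type": "array", "items": item_schema}


def missing_fields(row, item_schema):
    """List required fields that are absent or empty in one row."""
    return [f for f in item_schema["required"] if not row.get(f)]


list_schema = wrap_in_list(ITEM_SCHEMA)

# A deliberately incomplete row, to show what the check reports.
row = {"company_name": "Patronus AI", "category": "Benchmarks/Evals",
       "funding_stage": "Series A, $17M", "founded_year": "2023"}
print(missing_fields(row, ITEM_SCHEMA))
# ['product_description', 'known_customers']
```

A check like `missing_fields` is one way to spot rows that came back with gaps before using the table downstream.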
Results
The run returned 52 companies, spread across 8 categories of the AI data supply chain:
| Category | Companies |
|---|---|
| Training Data | 13 |
| Benchmarks/Evals | 13 |
| RL Environments | 8 |
| Red Teaming | 5 |
| RLHF/Preference Data | 4 |
| Annotation/Labeling | 4 |
| Domain Data | 3 |
| Synthetic Data | 2 |
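A quick sanity check on a breakdown like this is that the per-category counts add up to the reported total. The sketch below hard-codes the counts from the table above rather than reading them from a live result.

```python
# Counts copied from the category table above.
category_counts = {
    "Training Data": 13,
    "Benchmarks/Evals": 13,
    "RL Environments": 8,
    "Red Teaming": 5,
    "RLHF/Preference Data": 4,
    "Annotation/Labeling": 4,
    "Domain Data": 3,
    "Synthetic Data": 2,
}

total = sum(category_counts.values())
largest_share = max(category_counts.values()) / total

print(total)                   # 52
print(len(category_counts))    # 8
print(f"{largest_share:.0%}")  # 25%
```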
A sample of the rows:
| Company | Category | Funding stage | Founded |
|---|---|---|---|
| Scale AI | RLHF/Preference Data | Valued ~$14B (2025) | 2016 |
| Surge AI | RLHF/Preference Data | Bootstrapped to ~$1.2B revenue (2024) | 2020 |
| Mercor | RLHF/Preference Data | Series C, $350M at $10B valuation | 2023 |
| Snorkel AI | Training Data | Series D, $100M at $1.3B valuation | 2019 |
| Turing | Training Data | Series E, $111M at $2.2B valuation | 2018 |
| LMArena | Benchmarks/Evals | $150M at $1.7B valuation | 2025 |
| Patronus AI | Benchmarks/Evals | Series A, $17M | 2023 |
| METR | Benchmarks/Evals | Nonprofit | 2023 |
| Prime Intellect | RL Environments | $15M, led by Founders Fund | 2023 |
| Mechanize | RL Environments | $9.1M at $500-750M valuation | 2025 |
| Haize Labs | Red Teaming | Seed, $12.5M | 2023 |
| Gretel.ai | Synthetic Data | $68M+ raised | 2017 |
What makes the output usable is the spread. The list runs from Scale AI at a ~$14B valuation down to YC-batch seed companies founded in 2025 and 2026, and the founding years span 2008 to 2026. That breadth is the direct result of pointing six agents at six different slices: the agent on "newer niche startups" surfaces YC-stage companies that the agent on "well-known, recently funded startups" would never reach, and the synthesis step merges and de-duplicates the two lists. Every row also carries a known_customers field. Scale AI's entry, for instance, lists OpenAI, Anthropic, Google DeepMind, Meta AI, Microsoft, and the U.S. Department of Defense, each traceable to the sources the agents read.
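The merge-and-de-duplicate idea can be pictured with a small sketch. This is an illustration under assumptions, not FutureSearch's actual synthesis logic: it assumes each agent returns a list of row dicts, and that two rows describe the same company when their names match after lowercasing and stripping punctuation and a few common suffixes. The `normalize` and `merge_rows` helpers and the toy inputs are hypothetical.

```python
import re

# Trailing words ignored when comparing names (a crude heuristic that
# could over-merge distinct companies with similar names).
SUFFIXES = {"inc", "llc", "ltd", "ai"}


def normalize(name):
    """Lowercase, drop punctuation, and strip common trailing suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while len(words) > 1 and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)


def merge_rows(agent_results):
    """Merge per-agent row lists, keeping one row per normalized name.

    When several agents report the same company, fields that are empty
    in the first row seen are filled in from later rows.
    """
    merged = {}
    for rows in agent_results:
        for row in rows:
            key = normalize(row["company_name"])
            if key not in merged:
                merged[key] = dict(row)
                continue
            for field, value in row.items():
                if value and not merged[key].get(field):
                    merged[key][field] = value
    return list(merged.values())


# Toy inputs standing in for two agents' findings.
well_known = [
    {"company_name": "Scale AI", "founded_year": "2016", "category": ""},
]
labeling = [
    {"company_name": "Scale AI, Inc.", "founded_year": "",
     "category": "RLHF/Preference Data"},
    {"company_name": "Surge AI", "founded_year": "2020",
     "category": "RLHF/Preference Data"},
]

rows = merge_rows([well_known, labeling])
print(len(rows))  # 2
print(rows[0])
```

Matching on normalized names is the simplest possible key; a production merge would likely also compare websites or other identifiers to avoid collapsing distinct companies.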
Going deeper
- Reference: multi_agent, including return_list and directions
- Guide: Research a question with a team of agents