FutureSearch docs · Your research team
Merge Thousands of Records

In the web app: go to futuresearch.ai/app, upload both the people CSV and the websites CSV, and enter:

Merge the people CSV with the websites CSV. Match each person to their personal website(s).

2,243 of 2,246 matched (99.9%). Results take about 12.5 minutes.

In Claude.ai: add the everyrow connector if you haven't already, then upload both the people CSV and the websites CSV and ask Claude:

Merge the people CSV with the websites CSV. Match each person to their personal website(s).

2,243 of 2,246 matched (99.9%). Results take about 12.5 minutes.

Claude Code is great at matching a person to their website using web search. It can cross-reference names, email domains, and institutions. Doing that for 2,246 people, where each match requires understanding names, affiliations, and URL patterns, is more web research than a single session can support.
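To see the kind of signal that cross-referencing relies on, here is a toy sketch of one cheap heuristic: comparing a person's email domain against a candidate site's domain before falling back to web research. The function names and examples are illustrative only, not part of everyrow:

```python
from urllib.parse import urlparse

def domain_of_email(email: str) -> str:
    # "jane@cs.stanford.edu" -> "cs.stanford.edu"
    return email.split("@", 1)[1].lower()

def domain_of_url(url: str) -> str:
    # "https://janedoe.github.io/about" -> "janedoe.github.io"
    return urlparse(url).netloc.lower().removeprefix("www.")

def shares_domain(email: str, url: str) -> bool:
    # A person whose email domain overlaps the site's domain
    # is a strong match candidate; everything else needs deeper research.
    e, u = domain_of_email(email), domain_of_url(url)
    return e in u or u in e

print(shares_domain("jane@stanford.edu", "https://www.stanford.edu/~jane"))  # True
print(shares_domain("jane@stanford.edu", "https://janedoe.com"))             # False
```

Heuristics like this resolve the easy fraction; the hard remainder (personal domains, university pages, name collisions) is what makes 2,246 matches a web-research problem rather than a string-matching one.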

Here, we get Claude Code to match people to their personal websites at scale.

Metric            Value
Rows processed    2,246
Matched           2,243 (99.9%)
Total cost        $35.41
Time              12.5 minutes

Add everyrow to Claude Code if you haven't already:

claude mcp add futuresearch --scope project --transport http https://mcp.futuresearch.ai/mcp

With both CSVs in your working directory, tell Claude:

Merge the people CSV with the websites CSV. Match each person to their
personal website(s).

Claude calls everyrow's merge MCP tool:

Tool: everyrow_merge
├─ task: "Match each person to their website(s)."
├─ left_csv: "/Users/you/people.csv"
└─ right_csv: "/Users/you/websites.csv"

→ Submitted: 2,246 rows for merging.
  Session: https://futuresearch.ai/sessions/2a929529-2d92-4410-a6a7-ce8713c5d465
  Task ID: 2a92...

Tool: everyrow_progress
├─ task_id: "2a92..."
→ Running: 0/2246 complete (30s elapsed)

...

Tool: everyrow_progress
→ Completed: 2246/2246 (0 failed) in 747s.

Tool: everyrow_results
├─ task_id: "2a92..."
├─ output_path: "/Users/you/people_with_websites.csv"
→ Saved 2246 rows to /Users/you/people_with_websites.csv

2,243 of 2,246 matched (99.9%). You can review the run at the session URL above.

Most matches resolved via LLM reasoning on name/email/URL patterns. Harder cases triggered automatic web search to verify person-to-website relationships. At this scale, 54M tokens were consumed across 4,233 LLM requests.
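For a back-of-envelope sense of what that volume means per row, derived purely from the figures above:

```python
tokens, requests, rows, cost = 54_000_000, 4_233, 2_246, 35.41

print(f"{requests / rows:.1f} LLM requests per row")   # ~1.9
print(f"{tokens / requests:,.0f} tokens per request")  # ~12,757
print(f"${cost / rows:.4f} per row")                   # ~$0.0158
```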

The everyrow SDK's merge() scales to thousands of rows. This notebook demonstrates matching 2,246 people to personal websites, showing how cost grows with scale.

Metric            Value
Rows processed    2,246
Cost              $26.80
pip install everyrow
export EVERYROW_API_KEY=your_key_here  # Get one at futuresearch.ai/api-key
import asyncio
import pandas as pd
from everyrow import create_session
from everyrow.ops import merge

left_df = pd.read_csv("merge_websites_input_left_2246.csv")
right_df = pd.read_csv("merge_websites_input_right_2246.csv")

async def main():
    async with create_session(name="Website Matching") as session:
        result = await merge(
            session=session,
            task="Match each person to their website(s).",
            left_table=left_df,
            right_table=right_df,
        )
        return result.data

merged = asyncio.run(main())
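Once main() returns, it is worth checking match coverage before using the output. A minimal sketch, assuming the merged frame exposes a nullable website column (the real schema and column names may differ; a toy frame stands in for the actual result):

```python
import pandas as pd

# Toy stand-in for the merged output; real column names may differ.
merged = pd.DataFrame({
    "name": ["Ada", "Ben", "Cy"],
    "website": ["https://ada.dev", None, "https://cy.io"],
})

matched = merged["website"].notna().sum()
print(f"{matched}/{len(merged)} matched ({matched / len(merged):.1%})")
# → 2/3 matched (66.7%)
```

The unmatched rows are the ones to inspect by hand, or to re-run with a more specific task prompt.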

Cost grows super-linearly with row count because each additional row increases the candidate pool for every match:

Rows     Cost
100      $0.00
200      $0.14
400      $0.29
800      $2.32
1,600    $16.60
2,246    $26.80
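A quick way to see the super-linear growth is to compute cost per row from the table: if total cost grew linearly, the per-row figure would be flat, but it rises roughly 17x between 200 and 2,246 rows.

```python
costs = {100: 0.00, 200: 0.14, 400: 0.29, 800: 2.32, 1_600: 16.60, 2_246: 26.80}

# If cost were linear in rows, cost/rows would stay constant. It climbs instead.
for rows, cost in costs.items():
    print(f"{rows:>5} rows: ${cost / rows:.4f}/row")
```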

Most matches resolved via LLM reasoning over name, email, and URL patterns; harder cases triggered an automatic web-search fallback.