Merge Thousands of Records

Matching 2,246 people to their personal websites requires understanding names, affiliations, and URL patterns at a scale where each match may need web research to verify. This case study demonstrates semantic record matching at production scale.

Metric            Value
Rows processed    2,246
Matched           2,243 (99.9%)
Total cost        $35.41
Time              12.5 minutes
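As a quick sanity check, the headline numbers above imply useful per-row figures (a sketch using only the table values; no additional data assumed):

```python
rows = 2246
matched = 2243
total_cost = 35.41   # USD
minutes = 12.5

print(f"match rate:   {matched / rows:.1%}")                 # ~99.9%
print(f"cost per row: ${total_cost / rows:.4f}")             # ~$0.0158
print(f"throughput:   {rows / (minutes * 60):.1f} rows/s")   # ~3.0
```

At roughly 1.6 cents and a third of a second per row, per-match web research stays practical at this scale.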

Claude Code

Add FutureSearch to Claude Code if you haven't already:

claude mcp add futuresearch --scope project --transport http https://mcp.futuresearch.ai/mcp

With both CSVs in your working directory, tell Claude:

Merge the people CSV with the websites CSV. Match each person to their
personal website(s).

Claude calls FutureSearch's merge MCP tool:

Tool: futuresearch_merge
├─ task: "Match each person to their website(s)."
├─ left_csv: "/Users/you/people.csv"
└─ right_csv: "/Users/you/websites.csv"

→ Submitted: 2,246 rows for merging.
  Session: https://futuresearch.ai/sessions/2a929529-2d92-4410-a6a7-ce8713c5d465
  Task ID: 2a92...

Tool: futuresearch_progress
├─ task_id: "2a92..."
→ Running: 0/2246 complete (30s elapsed)

...

Tool: futuresearch_progress
→ Completed: 2246/2246 (0 failed) in 747s.

Tool: futuresearch_results
├─ task_id: "2a92..."
├─ output_path: "/Users/you/people_with_websites.csv"
→ Saved 2246 rows to /Users/you/people_with_websites.csv

2,243 of 2,246 rows matched (99.9%).

Claude.ai

Add the FutureSearch connector if you haven't already. Then upload both the people CSV and websites CSV and ask Claude:

Merge the people CSV with the websites CSV. Match each person to their personal website(s).

Web App

Go to futuresearch.ai/app, upload both the people CSV and websites CSV, and enter:

Merge the people CSV with the websites CSV. Match each person to their personal website(s).

Python SDK

pip install futuresearch
export FUTURESEARCH_API_KEY=your_key_here  # Get one at futuresearch.ai/app/api-key

import asyncio
import pandas as pd
from futuresearch import create_session
from futuresearch.ops import merge

left_df = pd.read_csv("merge_websites_input_left_2246.csv")
right_df = pd.read_csv("merge_websites_input_right_2246.csv")

async def main():
    async with create_session(name="Website Matching") as session:
        result = await merge(
            session=session,
            task="Match each person to their website(s).",
            left_table=left_df,
            right_table=right_df,
        )
        return result.data

merged = asyncio.run(main())
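Once `asyncio.run(main())` returns, the merged DataFrame can be persisted and checked like any pandas result. A minimal sketch, using a toy stand-in for the output (the `website` column name is an assumption; inspect `merged.columns` for the actual schema of your run):

```python
import pandas as pd

# Toy stand-in for the merged output (real runs produce 2,246 rows;
# the "website" column name is an assumed output column).
merged = pd.DataFrame({
    "name": ["Ada Lovelace", "Alan Turing", "Grace Hopper"],
    "website": ["https://ada.example", None, "https://grace.example"],
})

# Persist the result and report the match rate.
merged.to_csv("people_with_websites.csv", index=False)

matched = merged["website"].notna().sum()
print(f"{matched} of {len(merged)} rows matched ({matched / len(merged):.1%})")
# → 2 of 3 rows matched (66.7%)
```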

Results

Most matches resolved via LLM reasoning over name, email, and URL patterns. Harder cases triggered automatic web search to verify person-to-website relationships. In total, the run consumed 54M tokens across 4,233 LLM requests.

Cost grows super-linearly with row count because each additional row increases the candidate pool for every match:

Rows      Cost
100       $0.00
200       $0.14
400       $0.29
800       $2.32
1,600     $16.60
2,246     $26.80
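One way to see the super-linear trend is to compare how much cost grows each time the row count grows, using the figures from the table above (a sketch; only the tabulated values are assumed):

```python
rows  = [100, 200, 400, 800, 1600, 2246]
costs = [0.00, 0.14, 0.29, 2.32, 16.60, 26.80]

# For each step, compare the row-count multiplier to the cost multiplier.
for i in range(1, len(rows)):
    if costs[i - 1] > 0:  # skip the $0.00 baseline
        print(f"{rows[i-1]}→{rows[i]} rows: "
              f"{rows[i] / rows[i-1]:.2f}x rows, "
              f"{costs[i] / costs[i-1]:.2f}x cost")
# e.g. 400→800 rows: 2.00x rows, 8.00x cost
```

Between 400 and 1,600 rows, doubling the input multiplies cost by 7–8x, consistent with the candidate pool for each match growing with the table sizes.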