Fuzzy Match Across Tables

Go to futuresearch.ai/app, upload company_info.csv and valuations.csv, and enter:

Merge company_info.csv with valuations.csv. The first table has company names, the second has stock tickers. Match companies to their stock tickers.

438 rows matched with 100% accuracy using a cascade of exact, fuzzy, LLM, and web search strategies.

In Claude.ai, add the everyrow connector if you haven't already. Then upload company_info.csv and valuations.csv and ask Claude:

Merge company_info.csv with valuations.csv. The first table has company names, the second has stock tickers. Match companies to their stock tickers.

438 rows matched with 100% accuracy using a cascade of exact, fuzzy, LLM, and web search strategies.

Claude Code handles exact-key merges natively by writing pandas code. Scaling up to fuzzy matching, then semantic matching, then a web-search fallback calls for an approach where each strategy is tested independently and the cascade is evaluated empirically.
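The exact-key baseline that Claude Code writes itself is a one-line pandas merge; a minimal sketch with made-up rows shows where it stops working:

```python
import pandas as pd

# Hypothetical sample rows: the left table carries company names, the right
# carries tickers keyed by a slightly different spelling of one name.
companies = pd.DataFrame({"company": ["Apple Inc.", "Alphabet Inc."]})
valuations = pd.DataFrame(
    {"company": ["Apple", "Alphabet Inc."], "ticker": ["AAPL", "GOOGL"]}
)

# An exact-key merge only joins rows whose names agree character-for-character,
# so "Apple Inc." vs "Apple" goes unmatched and its ticker comes back NaN.
merged = companies.merge(valuations, on="company", how="left")
print(merged)
```

Everything past that first tier is where the cascade earns its keep.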

Here, we run 5 merge experiments on 438 S&P 500 companies, testing the cascade from exact matching to web search.

| Metric | Value |
| --- | --- |
| Total merges | 5 |
| Rows per merge | 438 |
| Total cost | $3.67 |
| Total time | 7.1 minutes |

Add everyrow to Claude Code if you haven't already:

claude mcp add futuresearch --scope project --transport http https://mcp.futuresearch.ai/mcp

With the company CSVs in your working directory, tell Claude to run each experiment. For the company-to-ticker merge:

Merge company_info.csv with valuations.csv. The first table has company names,
the second has stock tickers. Match companies to their stock tickers.

Claude calls everyrow's merge MCP tool:

Tool: everyrow_merge
├─ task: "Merge the tables based on company name and ticker"
├─ left_csv: "/Users/you/company_info.csv"
└─ right_csv: "/Users/you/valuations.csv"

→ Submitted: 438 rows for merging.
  Session: https://futuresearch.ai/sessions/d7819b7e-c48d-49e5-9f6e-55d972b85467

...

Tool: everyrow_results
→ Saved 438 rows to /Users/you/merged.csv

Results across all 5 experiments:

| Experiment | Accuracy | Cost | Time |
| --- | --- | --- | --- |
| 0% noise (baseline) | 100% | $0.00 | 6s |
| 5% character corruption | 100% | $0.10 | 23s |
| 10% character corruption | 100% | $0.34 | 43s |
| Company name to ticker (LLM) | 100% | $1.01 | 203s |
| CEO name to company (Web) | 96.3% | $2.22 | 151s |

The cascade escalates automatically: exact matches are free, fuzzy matches handle typos for free, LLM reasoning handles semantic matches at ~$0.002/row, and web search is used only for stale or obscure data at ~$0.01/row.
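Those per-row rates imply a simple cost model. A sketch for intuition: the rates come from the text above, but the tier counts below are hypothetical, not measured:

```python
# Per-row cost by cascade tier, from the rates quoted above: exact and fuzzy
# matching are free, LLM reasoning ~$0.002/row, web search ~$0.01/row.
TIER_COST_PER_ROW = {"exact": 0.0, "fuzzy": 0.0, "llm": 0.002, "web": 0.01}

def estimated_cost(rows_by_tier: dict[str, int]) -> float:
    """Sum per-row rates over the tier that resolved each row."""
    return sum(TIER_COST_PER_ROW[tier] * n for tier, n in rows_by_tier.items())

# Hypothetical split of 438 rows across tiers:
print(round(estimated_cost({"exact": 200, "fuzzy": 138, "llm": 80, "web": 20}), 2))
# → 0.36  (80 * $0.002 + 20 * $0.01; the free tiers contribute nothing)
```

The takeaway: total cost is driven almost entirely by how many rows fall through to the LLM and web tiers.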

The everyrow SDK implements a merge cascade (Exact, Fuzzy, LLM, Web) that automatically uses the simplest method that works for each row. This notebook tests the cascade across 5 experiments with increasing difficulty.

| Metric | Value |
| --- | --- |
| Total merges | 5 |
| Rows per merge | 438 |

Install the SDK and set your API key:

pip install everyrow
export EVERYROW_API_KEY=your_key_here  # Get one at futuresearch.ai/api-key

import asyncio
import pandas as pd
from everyrow import create_session
from everyrow.ops import merge

companies = pd.read_csv("company_info.csv")
valuations = pd.read_csv("valuations.csv")

async def main():
    # Experiment 1: Clean data (exact matches)
    async with create_session(name="Exact Match") as session:
        result = await merge(
            session=session,
            task="Merge the tables on company name",
            left_table=companies,
            right_table=valuations,
            merge_on_left="company",
            merge_on_right="company",
        )

    # Experiment 2: Company name to ticker (LLM match)
    async with create_session(name="LLM Match") as session:
        result = await merge(
            session=session,
            task="Merge the tables based on company name and ticker",
            left_table=companies,
            right_table=valuations,
        )

    # Experiments 3-5 (character corruption and CEO-to-company web matching)
    # follow the same pattern with different inputs.

asyncio.run(main())

| Experiment | Matched | Accuracy | Cost |
| --- | --- | --- | --- |
| 0% noise | 100% | 100% | $0.13 |
| 5% noise | 100% | 100% | $0.32 |
| 10% noise | 100% | 100% | $0.44 |
| LLM (company to ticker) | 100% | 100% | $1.00 |
| Web (CEO matching) | 95.7% | 96.7% | $3.69 |

The cascade optimizes cost automatically. For the 10% noise experiment, 26.5% of rows matched exactly, 30.8% via fuzzy matching (both free), and only 42.7% required LLM reasoning.