
Merge Costs and Speed

Run 5 merge experiments to empirically measure the cost cascade across increasing match difficulty. Exact and fuzzy matches are free; only semantic matches that require LLM reasoning incur costs.

| Metric | Value |
| --- | --- |
| Total merges | 5 |
| Total cost | $0.06 |
| Total time | 2.1 minutes |

Claude Code

Add FutureSearch to Claude Code if you haven't already:

claude mcp add futuresearch --scope project --transport http https://mcp.futuresearch.ai/mcp

Tell Claude to run each experiment with inline-generated data:

Create a test dataset of 10 companies with exact names, then merge them.
Then create a version with typos and merge again. Then test semantic
matching (Instagram to Meta, YouTube to Alphabet). Then test pharma
subsidiaries (Genentech to Roche, MSD to Merck). Show costs for each.

Claude runs five merge experiments:

Tool: futuresearch_merge (Experiment 1: Exact matches)
→ 10/10 matched, 6s, $0.00

Tool: futuresearch_merge (Experiment 2: Fuzzy/typo matches)
→ 10/10 matched, 13s, $0.00

Tool: futuresearch_merge (Experiment 3: Semantic matches)
→ 10/10 matched, 62s, $0.05

Tool: futuresearch_merge (Experiment 4: Pharma subsidiaries)
→ 13/13 matched, 38s, $0.01

Tool: futuresearch_merge (Experiment 5: Email domain matching)
→ 5/5 matched, 9s, $0.00
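As a sanity check, the per-experiment figures above add up to the totals reported in the summary table:

```python
# Cross-check the reported totals against the five experiments above.
costs = [0.00, 0.00, 0.05, 0.01, 0.00]   # dollars per experiment
times = [6, 13, 62, 38, 9]               # seconds per experiment

total_cost = sum(costs)
total_minutes = sum(times) / 60

print(f"Total cost: ${total_cost:.2f}")        # $0.06
print(f"Total time: {total_minutes:.1f} min")  # 2.1 min
```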

Claude.ai

Add the FutureSearch connector if you haven't already. Then upload your company tables and ask Claude:

Create a test dataset of 10 companies with exact names, then merge them. Then create a version with typos and merge again. Then test semantic matching (Instagram to Meta, YouTube to Alphabet). Show costs for each.

Exact and fuzzy matches are free. Only rows requiring LLM reasoning cost ~$0.002/row.

Web App

Go to futuresearch.ai/app, upload your company tables, and enter:

Merge the tables based on company name and ticker. Match companies to their stock tickers.

Exact and fuzzy matches are free. Only rows requiring LLM reasoning cost ~$0.002/row. Web search fallback costs ~$0.01/row.
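The per-row rates above support a back-of-envelope cost estimate before running a large merge. The helper below is hypothetical (not part of the SDK); only the rates come from this page.

```python
# Back-of-envelope merge cost estimate using the rates quoted above:
# exact and fuzzy matches are free, LLM reasoning is ~$0.002/row,
# and web search fallback is ~$0.01/row. estimate_merge_cost is an
# illustrative helper, not an SDK function.
def estimate_merge_cost(total_rows: int,
                        llm_fraction: float,
                        web_fraction: float = 0.0) -> float:
    LLM_RATE = 0.002   # dollars per row needing LLM reasoning
    WEB_RATE = 0.01    # dollars per row needing web search
    return (total_rows * llm_fraction * LLM_RATE
            + total_rows * web_fraction * WEB_RATE)

# e.g. 10,000 rows where 20% need LLM reasoning and 2% need web search:
print(f"${estimate_merge_cost(10_000, 0.20, 0.02):.2f}")  # $6.00
```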

Python SDK

The FutureSearch SDK implements a cost-optimized merge cascade. This example empirically measures the cost of each matching strategy across 5 experiments.

pip install futuresearch
export FUTURESEARCH_API_KEY=your_key_here  # Get one at futuresearch.ai/app/api-key
import asyncio
import pandas as pd
from futuresearch import create_session, get_billing_balance
from futuresearch.ops import merge

async def measure_merge(name, task, left_table, right_table, **kwargs):
    # Snapshot the billing balance before and after to isolate this merge's cost.
    balance_before = await get_billing_balance()
    async with create_session(name=name) as session:
        result = await merge(
            task=task,
            session=session,
            left_table=left_table,
            right_table=right_table,
            **kwargs,
        )
    balance_after = await get_billing_balance()
    cost = balance_before.current_balance_dollars - balance_after.current_balance_dollars
    return result.data, cost

async def main():
    # Small sample tables (placeholder revenue values) for two of the experiments.
    companies_exact = pd.DataFrame({"company": ["Apple Inc", "Microsoft", "Meta Platforms"]})
    revenue_exact = pd.DataFrame({
        "company_name": ["Apple Inc", "Microsoft", "Meta Platforms"],
        "revenue": [1.0, 2.0, 3.0],
    })
    companies_semantic = pd.DataFrame({"company": ["Apple Inc", "Microsoft", "Instagram"]})

    # Exact matches: $0.00
    result, cost = await measure_merge(
        "Exact matches only",
        "Match companies by name.",
        companies_exact, revenue_exact,
        merge_on_left="company", merge_on_right="company_name",
    )

    # Semantic matches: ~$0.03
    result, cost = await measure_merge(
        "Semantic matches",
        "Match companies. Instagram and WhatsApp are owned by Meta.",
        companies_semantic, revenue_exact,
        merge_on_left="company", merge_on_right="company_name",
    )

asyncio.run(main())

Results

| Experiment | Match Type | Cost | Accuracy |
| --- | --- | --- | --- |
| Exact strings | Exact only | $0.00 | 100% |
| Typos/case | Exact + Fuzzy | $0.00 | 100% |
| Semantic (Instagram→Meta) | Exact + LLM | $0.05 | 100% |
| Pharma (Genentech→Roche) | Exact + Fuzzy + LLM | $0.01 | 100% |
| Email domains | LLM (domain) | $0.00 | 100% |

The cascade strategy:

| Strategy | Cost | Example |
| --- | --- | --- |
| Exact match | Free | "Apple Inc" to "Apple Inc" |
| Fuzzy match | Free | "Microsft" to "Microsoft" |
| LLM reasoning | ~$0.002/row | "Instagram" to "Meta Platforms" |
| Web search | ~$0.01/row | Obscure or stale data |
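The cascade logic can be sketched in a few lines. This is an illustrative reconstruction, not the FutureSearch implementation: the threshold, helper names, and use of difflib are assumptions; in practice the third tier would call an LLM rather than return a sentinel.

```python
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    return name.lower().strip()

def fuzzy_ratio(a: str, b: str) -> float:
    # Cheap string similarity standing in for a real fuzzy matcher.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def cascade_match(left: str, candidates: list[str], fuzzy_threshold: float = 0.85):
    """Return (match, tier) where tier is 'exact', 'fuzzy', or 'llm'."""
    # Tier 1: exact match after normalization -- free.
    for c in candidates:
        if normalize(left) == normalize(c):
            return c, "exact"
    # Tier 2: fuzzy match for typos and casing -- free.
    best = max(candidates, key=lambda c: fuzzy_ratio(left, c))
    if fuzzy_ratio(left, best) >= fuzzy_threshold:
        return best, "fuzzy"
    # Tier 3: no cheap match; defer to LLM reasoning (~$0.002/row),
    # with web search (~$0.01/row) as the final fallback.
    return None, "llm"

candidates = ["Apple Inc", "Microsoft", "Meta Platforms"]
print(cascade_match("Apple Inc", candidates))   # exact tier
print(cascade_match("Microsft", candidates))    # fuzzy tier
print(cascade_match("Instagram", candidates))   # falls through to LLM tier
```

Only rows that fall through both free tiers ever touch the paid ones, which is why the experiments with clean or merely typo-ridden names cost $0.00.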

Key finding: exact and fuzzy matches are free; only rows that require LLM reasoning incur costs. Providing merge_on_left/merge_on_right hints reduces costs by letting the cascade skip LLM reasoning for more rows.