Merge Costs and Speed
Go to futuresearch.ai/app, upload your company tables, and enter:
Merge the tables based on company name and ticker. Match companies to their stock tickers.
Exact and fuzzy matches are free. Only rows requiring LLM reasoning cost ~$0.002/row. Web search fallback costs ~$0.01/row.
Add the everyrow connector if you haven't already. Then upload your company tables and ask Claude:
Create a test dataset of 10 companies with exact names, then merge them. Then create a version with typos and merge again. Then test semantic matching (Instagram to Meta, YouTube to Alphabet). Show costs for each.
Claude Code is great at merging two tables. But how much does it cost, and what determines the price? The answer depends on how hard each match is: exact and fuzzy matches are free, and only semantic matches that require LLM reasoning incur costs.
Here, we run 5 merge experiments of increasing match difficulty and measure what each one costs as rows move down the cascade from exact to fuzzy to LLM matching.
| Metric | Value |
|---|---|
| Total merges | 5 |
| Total cost | $0.06 |
| Total time | 2.1 minutes |
Add everyrow to Claude Code if you haven't already:
claude mcp add futuresearch --scope project --transport http https://mcp.futuresearch.ai/mcp
Tell Claude to run each experiment with inline-generated data:
Create a test dataset of 10 companies with exact names, then merge them.
Then create a version with typos and merge again. Then test semantic
matching (Instagram to Meta, YouTube to Alphabet). Then test pharma
subsidiaries (Genentech to Roche, MSD to Merck). Show costs for each.
Results across all 5 experiments:
Tool: everyrow_merge (Experiment 1: Exact matches)
→ 10/10 matched, 6s, $0.00
Tool: everyrow_merge (Experiment 2: Fuzzy/typo matches)
→ 10/10 matched, 13s, $0.00
Tool: everyrow_merge (Experiment 3: Semantic matches)
→ 10/10 matched, 62s, $0.05
Tool: everyrow_merge (Experiment 4: Pharma subsidiaries)
→ 13/13 matched, 38s, $0.01
Tool: everyrow_merge (Experiment 5: Email domain matching)
→ 5/5 matched, 9s, $0.00
| Experiment | Match Type | Cost | Accuracy |
|---|---|---|---|
| Exact strings | Exact only | $0.00 | 100% |
| Typos/case | Exact + Fuzzy | $0.00 | 100% |
| Semantic (Instagram→Meta) | Exact + LLM | $0.05 | 100% |
| Pharma (Genentech→Roche) | Exact + Fuzzy + LLM | $0.01 | 100% |
| Email domains | LLM (domain) | $0.00 | 100% |
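As a quick sanity check, the per-experiment figures reported above can be tallied to reproduce the summary totals. This is a minimal sketch that copies the numbers from the tool output by hand; the variable names are illustrative. Because the reported costs are rounded to whole cents, the derived per-row figures are coarse.

```python
# (experiment, rows matched, seconds, cost in dollars), copied from the tool output above
runs = [
    ("Exact matches", 10, 6, 0.00),
    ("Fuzzy/typo matches", 10, 13, 0.00),
    ("Semantic matches", 10, 62, 0.05),
    ("Pharma subsidiaries", 13, 38, 0.01),
    ("Email domain matching", 5, 9, 0.00),
]

total_seconds = sum(seconds for _, _, seconds, _ in runs)
total_minutes = round(total_seconds / 60, 1)
total_cost = round(sum(cost for _, _, _, cost in runs), 2)

print(f"Total time: {total_seconds}s ({total_minutes} minutes)")
print(f"Total cost: ${total_cost:.2f}")

# Observed cost per matched row, limited by the cent-level rounding of the inputs
for name, rows, _, cost in runs:
    print(f"{name}: ${cost / rows:.4f}/row")
```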
The cascade strategy:
| Strategy | Cost | Example |
|---|---|---|
| Exact match | Free | "Apple Inc" to "Apple Inc" |
| Fuzzy match | Free | "Microsft" to "Microsoft" |
| LLM reasoning | ~$0.002/row | "Instagram" to "Meta Platforms" |
| Web search | ~$0.01/row | Obscure or stale data |
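The free tiers of the cascade can be illustrated with a short sketch. This is not the everyrow implementation; it is an assumption-laden stand-in that uses Python's standard `difflib` for the fuzzy step, with a hypothetical `cascade_match` helper and an arbitrary similarity cutoff. Rows that fall through both free tiers are only flagged, since that is where a paid LLM or web search step would take over.

```python
import difflib


def cascade_match(name, candidates, fuzzy_cutoff=0.85):
    """Return (match, tier) for `name`, trying exact then fuzzy matching.

    Hypothetical helper for illustration only. Unmatched rows come back as
    (None, "needs_llm") to mark where a paid step would be required.
    """
    # Tier 1: exact match after trivial normalisation (case, whitespace)
    normalised = {c.strip().lower(): c for c in candidates}
    key = name.strip().lower()
    if key in normalised:
        return normalised[key], "exact"

    # Tier 2: fuzzy string similarity, which catches typos such as "Microsft"
    close = difflib.get_close_matches(key, list(normalised), n=1, cutoff=fuzzy_cutoff)
    if close:
        return normalised[close[0]], "fuzzy"

    # Tier 3 and beyond: no string-level match, so semantic reasoning is needed
    return None, "needs_llm"


candidates = ["Apple Inc", "Microsoft", "Meta Platforms"]
for query in ["Apple Inc", "Microsft", "Instagram"]:
    print(query, "->", cascade_match(query, candidates))
```

The cutoff controls how aggressive the free fuzzy tier is: a lower value resolves more rows without cost but risks false matches, which is why a semantic pair like "Instagram" and "Meta Platforms" should fall through rather than be forced.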
The everyrow SDK implements a cost-optimized merge cascade. This notebook empirically measures the cost of each matching strategy across 5 experiments.
| Metric | Value |
|---|---|
| Total merges | 5 |
| Total cost | $0.03 |
pip install everyrow
export EVERYROW_API_KEY=your_key_here # Get one at futuresearch.ai/api-key
import asyncio
import pandas as pd
from everyrow import create_session, get_billing_balance
from everyrow.ops import merge


async def measure_merge(name, task, left_table, right_table, **kwargs):
    """Run one merge and return (merged data, cost in dollars)."""
    # Read the balance before and after the merge; the difference is its cost.
    balance_before = await get_billing_balance()
    async with create_session(name=name) as session:
        result = await merge(
            task=task,
            session=session,
            left_table=left_table,
            right_table=right_table,
            **kwargs,
        )
    balance_after = await get_billing_balance()
    cost = balance_before.current_balance_dollars - balance_after.current_balance_dollars
    return result.data, cost
# Exact matches: $0.00
result, cost = await measure_merge(
    "Exact matches only",
    "Match companies by name.",
    companies_exact, revenue_exact,
    merge_on_left="company", merge_on_right="company_name",
)

# Semantic matches: ~$0.03
result, cost = await measure_merge(
    "Semantic matches",
    "Match companies. Instagram and WhatsApp are owned by Meta.",
    companies_semantic, revenue_exact,
    merge_on_left="company", merge_on_right="company_name",
)
| Experiment | Cost | Accuracy |
|---|---|---|
| Exact matches | $0.00 | 100% |
| Fuzzy (typos) | $0.00 | 100% |
| Semantic | $0.03 | 100% |
| Pharma | $0.00 | 61.5% |
Key finding: exact and fuzzy matches are free. Only rows that require LLM reasoning incur costs (~$0.002/row for semantic matching, ~$0.01/row for web search). Passing the merge_on_left and merge_on_right column hints reduces cost because it lets the cascade resolve more rows without LLM reasoning.
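To budget a larger merge, the approximate per-row rates quoted above can be turned into a rough estimate. This is a back-of-the-envelope sketch, not a billing formula: `estimate_merge_cost` is a hypothetical helper, the rates are the approximate figures from this page, and the row counts in the example are made up for illustration. Actual charges may differ, as the measured experiments show.

```python
# Approximate per-row rates quoted on this page (dollars per row)
RATE_LLM = 0.002   # rows resolved by LLM reasoning
RATE_WEB = 0.01    # rows that fall back to web search


def estimate_merge_cost(n_llm_rows, n_web_rows, n_free_rows=0):
    """Rough cost estimate in dollars; exact and fuzzy rows are free."""
    # n_free_rows is accepted only to make the row breakdown explicit
    return n_llm_rows * RATE_LLM + n_web_rows * RATE_WEB


# Illustrative split for a 1,000-row table: 900 free, 80 LLM, 20 web search
estimate = estimate_merge_cost(n_llm_rows=80, n_web_rows=20, n_free_rows=900)
print(f"Estimated cost: ${estimate:.2f}")
```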