by futuresearch

Merge Costs and Speed

Go to futuresearch.ai/app, upload your company tables, and enter:

Merge the tables based on company name and ticker. Match companies to their stock tickers.

Exact and fuzzy matches are free. Only rows requiring LLM reasoning cost ~$0.002/row. Web search fallback costs ~$0.01/row.

Add the everyrow connector if you haven't already. Then upload your company tables and ask Claude:

Create a test dataset of 10 companies with exact names, then merge them. Then create a version with typos and merge again. Then test semantic matching (Instagram to Meta, YouTube to Alphabet). Show costs for each.

Claude Code is great at merging two tables. But how much does it cost, and what determines the price? The answer depends on how hard each match is: exact and fuzzy matches are free, and only semantic matches that require LLM reasoning incur costs.

Here, we run 5 merge experiments to empirically measure the cost cascade across increasing match difficulty.

| Metric | Value |
| --- | --- |
| Total merges | 5 |
| Total cost | $0.06 |
| Total time | 2.1 minutes |

Add everyrow to Claude Code if you haven't already:

claude mcp add futuresearch --scope project --transport http https://mcp.futuresearch.ai/mcp

Tell Claude to run each experiment with inline-generated data:

Create a test dataset of 10 companies with exact names, then merge them.
Then create a version with typos and merge again. Then test semantic
matching (Instagram to Meta, YouTube to Alphabet). Then test pharma
subsidiaries (Genentech to Roche, MSD to Merck). Show costs for each.

Results across all 5 experiments:

Tool: everyrow_merge (Experiment 1: Exact matches)
→ 10/10 matched, 6s, $0.00

Tool: everyrow_merge (Experiment 2: Fuzzy/typo matches)
→ 10/10 matched, 13s, $0.00

Tool: everyrow_merge (Experiment 3: Semantic matches)
→ 10/10 matched, 62s, $0.05

Tool: everyrow_merge (Experiment 4: Pharma subsidiaries)
→ 13/13 matched, 38s, $0.01

Tool: everyrow_merge (Experiment 5: Email domain matching)
→ 5/5 matched, 9s, $0.00

| Experiment | Match Type | Cost | Accuracy |
| --- | --- | --- | --- |
| Exact strings | Exact only | $0.00 | 100% |
| Typos/case | Exact + Fuzzy | $0.00 | 100% |
| Semantic (Instagram→Meta) | Exact + LLM | $0.05 | 100% |
| Pharma (Genentech→Roche) | Exact + Fuzzy + LLM | $0.01 | 100% |
| Email domains | LLM (domain) | $0.00 | 100% |
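
As a quick check, the per-experiment figures in the tool output can be tallied against the summary metrics. The snippet below is only arithmetic on the numbers reported above (costs held as integer cents to avoid float rounding); since the reported times and costs are themselves rounded, the totals are approximate.

```python
# Per-experiment figures copied from the tool output above:
# (experiment, seconds, cost in cents).
experiments = [
    ("Exact matches", 6, 0),
    ("Fuzzy/typo matches", 13, 0),
    ("Semantic matches", 62, 5),
    ("Pharma subsidiaries", 38, 1),
    ("Email domain matching", 9, 0),
]

total_seconds = sum(seconds for _, seconds, _ in experiments)
total_cents = sum(cents for _, _, cents in experiments)

print(f"Total merges: {len(experiments)}")               # Total merges: 5
print(f"Total cost: ${total_cents / 100:.2f}")           # Total cost: $0.06
print(f"Total time: {total_seconds / 60:.1f} minutes")   # Total time: 2.1 minutes
```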

The cascade strategy:

| Strategy | Cost | Example |
| --- | --- | --- |
| Exact match | Free | "Apple Inc" to "Apple Inc" |
| Fuzzy match | Free | "Microsft" to "Microsoft" |
| LLM reasoning | ~$0.002/row | "Instagram" to "Meta Platforms" |
| Web search | ~$0.01/row | Obscure or stale data |
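
To make the idea of a cascade concrete, here is a minimal sketch of the first two free tiers using only the Python standard library. It is not the everyrow implementation: the `cascade_match` function, the lowercase normalization, and the 0.85 similarity threshold are all assumptions chosen for illustration. Rows that neither tier resolves are simply flagged as needing the paid LLM tier.

```python
from difflib import SequenceMatcher


def normalize(name):
    # Assumed normalization: lowercase and collapse whitespace.
    return " ".join(name.lower().split())


def cascade_match(left_name, right_names, fuzzy_threshold=0.85):
    """Return (matched_name, tier), where tier is 'exact', 'fuzzy', or 'needs_llm'."""
    target = normalize(left_name)

    # Tier 1: exact match on normalized strings (free).
    for candidate in right_names:
        if normalize(candidate) == target:
            return candidate, "exact"

    # Tier 2: best fuzzy match by character similarity (free).
    best, best_score = None, 0.0
    for candidate in right_names:
        score = SequenceMatcher(None, target, normalize(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    if best_score >= fuzzy_threshold:
        return best, "fuzzy"

    # Tier 3: nothing cheap worked, so this row would be escalated.
    return None, "needs_llm"


right = ["Apple Inc", "Microsoft", "Meta Platforms"]
print(cascade_match("Apple Inc", right))   # ('Apple Inc', 'exact')
print(cascade_match("Microsft", right))    # ('Microsoft', 'fuzzy')
print(cascade_match("Instagram", right))   # (None, 'needs_llm')
```

The point of the ordering is that each row stops at the cheapest tier that resolves it, so only the rows that fall through to the last step would incur a per-row charge.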

The everyrow SDK implements a cost-optimized merge cascade. This notebook empirically measures the cost of each matching strategy across 5 experiments.

| Metric | Value |
| --- | --- |
| Total merges | 5 |
| Total cost | $0.03 |

pip install everyrow
export EVERYROW_API_KEY=your_key_here  # Get one at futuresearch.ai/api-key

import asyncio
import pandas as pd
from everyrow import create_session, get_billing_balance
from everyrow.ops import merge

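async def measure_merge(name, task, left_table, right_table, **kwargs):
    """Run one merge and return (merged data, cost in dollars)."""
    # Cost is inferred from the change in account balance, so any other
    # usage on the same API key during the call would be included.
    balance_before = await get_billing_balance()
    async with create_session(name=name) as session:
        result = await merge(
            task=task,
            session=session,
            left_table=left_table,
            right_table=right_table,
            **kwargs,  # e.g. merge_on_left / merge_on_right column hints
        )
    balance_after = await get_billing_balance()
    cost = balance_before.current_balance_dollars - balance_after.current_balance_dollars
    return result.data, cost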

# Exact matches: $0.00
result, cost = await measure_merge(
    "Exact matches only",
    "Match companies by name.",
    companies_exact, revenue_exact,
    merge_on_left="company", merge_on_right="company_name",
)

# Semantic matches: ~$0.03
result, cost = await measure_merge(
    "Semantic matches",
    "Match companies. Instagram and WhatsApp are owned by Meta.",
    companies_semantic, revenue_exact,
    merge_on_left="company", merge_on_right="company_name",
)
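
The calls above use top-level `await`, which works in a notebook. In a plain Python script the same calls would need to run inside a coroutine handed to `asyncio.run`. The sketch below shows that shape only; `fake_measure_merge` is a hypothetical stand-in that makes no API call, so the example runs without an API key.

```python
import asyncio


async def fake_measure_merge(name, task):
    # Hypothetical stand-in for measure_merge: no network call,
    # returns placeholder rows and a zero cost.
    await asyncio.sleep(0)
    return [], 0.0


async def main():
    # In a real script, this is where the measure_merge calls would go.
    result, cost = await fake_measure_merge(
        "Exact matches only",
        "Match companies by name.",
    )
    print(f"cost: ${cost:.2f}")
    return cost


if __name__ == "__main__":
    asyncio.run(main())
```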

| Experiment | Cost | Accuracy |
| --- | --- | --- |
| Exact matches | $0.00 | 100% |
| Fuzzy (typos) | $0.00 | 100% |
| Semantic | $0.03 | 100% |
| Pharma | $0.00 | 61.5% |

Key finding: exact and fuzzy matches are free. Only rows requiring LLM reasoning incur costs (~$0.002/row for semantic matching, ~$0.01/row for web search). Providing merge_on_left and merge_on_right hints reduces costs by helping the cascade resolve more rows without LLM reasoning.
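
For budgeting, the approximate per-row rates can be turned into a rough estimate. The sketch below assumes you can guess how many rows will land in each tier; the 1,000-row mix is hypothetical, and `estimate_merge_cost` is an illustrative helper, not part of the SDK. The rates are the approximate figures quoted above, and the measured costs in the experiments did not track them exactly, so treat the result as an order-of-magnitude figure.

```python
from decimal import Decimal

# Approximate per-row rates quoted above, in dollars.
RATES = {
    "exact": Decimal("0"),
    "fuzzy": Decimal("0"),
    "llm": Decimal("0.002"),
    "web_search": Decimal("0.01"),
}


def estimate_merge_cost(rows_per_tier):
    """Sum rate * row count over the tiers; returns dollars as a Decimal."""
    return sum(
        (RATES[tier] * count for tier, count in rows_per_tier.items()),
        Decimal("0"),
    )


# Hypothetical mix for a 1,000-row merge.
estimate = estimate_merge_cost(
    {"exact": 700, "fuzzy": 200, "llm": 90, "web_search": 10}
)
print(f"Estimated cost: ${estimate:.2f}")  # Estimated cost: $0.28
```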