FutureSearch Logofuturesearch
  • Solutions
  • Pricing
  • Research
  • Docs
  • Evals
  • Blog
  • Company
  • LiteLLM Checker
  • Get Researchers
FutureSearch Logo

General inquiry? You can reach us at hello@futuresearch.ai.

Company

Team & CareersPressPrivacy PolicyTerms of Service

Developers

SDK DocsAPI ReferenceCase StudiesGitHubSupport

Integrations

Claude CodeCursorChatGPT CodexClaude.ai

Follow Us

X (Twitter)@dschwarz26LinkedIn
FutureSearchdocs
Your research team
Installation
  • All install methods
  • Claude.ai
  • Claude Cowork
  • Claude Code
  • Web App
  • Python SDK
  • Skill
  • MCP Server
Reference
  • API Key
  • classify
  • dedupe
  • forecast
  • merge
  • rank
  • agent_map
  • Progress Monitoring
  • Chaining Operations
Guides
  • LLM-Powered Data Labeling
  • Add a Column via Web Research
  • Classify and Label Rows
  • Deduplicate Training Data
  • Filter a Dataset Intelligently
  • Find Profitable Polymarket Trades
  • Forecast Outcomes for a List of Entities
  • Value a Private Company
  • Join Tables Without Shared Keys
  • Rank Data by External Metrics
  • Resolve Duplicate Entities
  • Scale Deduplication to 20K Rows
  • Turn Claude into an Accurate Forecaster
Case Studies
  • Deduplicate Contact Lists
  • Deduplicate CRM Records
  • Enrich Contacts with Company Data
  • Fuzzy Match Across Tables
  • Link Records Across Medical Datasets
  • LLM Cost vs. Accuracy
  • Merge Costs and Speed
  • Merge Thousands of Records
  • Multi-Stage Lead Qualification
  • Research and Rank Web Data
  • Run 10,000 LLM Web Research Agents
  • Score Cold Leads via Web Research
  • Score Leads from Fragmented Data
  • Screen 10,000 Rows
  • Screen Job Listings
  • Screen Stocks by Economic Sensitivity
  • Screen Stocks by Investment Thesis
FutureSearchby futuresearch
by futuresearch

Join Tables Without Shared Keys

Joining two tables when there is no shared column to merge on. Here, two S&P 500 tables (438 rows) are matched -- one has company names, the other has stock tickers. No merge key exists between them.

MetricValue
Rows processed438
Matched437 (99.8%)
Cost~$0.82-1.00
Time~10-30 seconds (SDK) to ~11 minutes (app)

Go to futuresearch.ai/app, upload company_info.csv and valuations.csv, and enter:

Merge company_info.csv with valuations.csv. The first table has company names, the second has stock tickers. Match companies to their stock tickers.

437 of 438 companies matched (99.8%). Results take about 11 minutes.

Add the FutureSearch connector if you haven't already. Then upload company_info.csv and valuations.csv and ask Claude:

Merge company_info.csv with valuations.csv. The first table has company names, the second has stock tickers. Match companies to their stock tickers.

437 of 438 companies matched (99.8%). Results take about 11 minutes.

Add FutureSearch to Claude Code if you haven't already:

claude mcp add futuresearch --scope project --transport http https://mcp.futuresearch.ai/mcp

Download company_info.csv (company names, price, market cap) and valuations.csv (tickers, fair value). With both files in your working directory, tell Claude:

Merge company_info.csv with valuations.csv. The first table has company names,
the second has stock tickers. Match companies to their stock tickers.

Claude calls FutureSearch's merge MCP tool to match the tables using semantic understanding:

Tool: futuresearch_merge
├─ task: "Match companies to their stock tickers"
├─ left_csv: "/Users/you/company_info.csv"
└─ right_csv: "/Users/you/valuations.csv"

→ Submitted: 438 rows for merging.
  Session: https://futuresearch.ai/sessions/5c33bade-f142-44b0-bed5-9e5a846cdc3c
  Task ID: 5c33...

Tool: futuresearch_progress
├─ task_id: "5c33..."
→ Running: 0/438 complete (18s elapsed)

...

Tool: futuresearch_progress
→ Completed: 438/438 (0 failed) in 653s.

Tool: futuresearch_results
├─ task_id: "5c33..."
├─ output_path: "/Users/you/merged_companies.csv"
→ Saved 438 rows to /Users/you/merged_companies.csv

437 of 438 companies matched. View the session.

pip install futuresearch
export FUTURESEARCH_API_KEY=your_key_here  # Get one at futuresearch.ai

Download company_info.csv and valuations.csv, or run the full notebook.

import asyncio
import pandas as pd
from futuresearch.ops import merge

# Two tables from different sources - no shared column
companies = pd.read_csv("company_info.csv")    # has: company, price, mkt_cap, shares
valuations = pd.read_csv("valuations.csv")      # has: ticker, fair_value

async def main():
    result = await merge(
        task="Match companies to their stock tickers",
        left_table=companies,
        right_table=valuations,
    )

    # The result is a DataFrame with all columns joined
    print(result.data.head())

    #                company   price      mkt_cap     shares  ticker  fair_value
    # 0                   3M  101.74  61.70678828  606514530     MMM       39.18
    # 1          A. O. Smith   32.38  4.904416495  151464376     AOS        6.59
    # 2  Abbott Laboratories   34.87  51.22933139 1469152033     ABT      119.19

asyncio.run(main())

Results

CompanyTickerFair Value
3MMMM39.18
Abbott LaboratoriesABT119.19
AbbVieABBV180.95
AccentureACN107.79
A. O. SmithAOS6.59

The system figures out that "3M" corresponds to ticker "MMM", "Alphabet Inc." to "GOOGL", and so on. No merge columns are specified because there's nothing to match on directly. The single unmatched row was Block, Inc. (formerly Square, Inc.), where the 2021 rebrand made the name-to-ticker match difficult.

The merge operation uses a cascade of matching strategies, stopping at the simplest one that works for each row:

StrategyWhen UsedCost
Exact matchIdentical stringsFree
Fuzzy matchTypos, case differencesFree
LLM matchSemantic equivalence (company → ticker)~$0.002/row
Web searchStale or obscure data~$0.01/row

For the company-to-ticker merge above, 99.8% of rows matched via LLM reasoning alone. The remaining 0.2% required a quick web lookup.

This approach works well when your tables represent the same entities but use different identifiers: company names vs tickers, product names vs SKUs, subsidiary names vs parent companies. For tables that do share a common column, the SDK will use exact matching first and only escalate to more expensive methods when needed.


Built with FutureSearch. See the full analysis with multiple experiments in the merge tutorial notebook.