FutureSearch Logofuturesearch
  • Blog
  • Solutions
  • Research
  • Docs
  • Evals
  • Company
  • Get Researchers
FutureSearch Logo

General inquiry? You can reach us at hello@futuresearch.ai.

Company

Team & CareersPressPrivacy PolicyTerms of Service

Developers

SDK DocsAPI ReferenceCase StudiesGitHub

Follow Us

X (Twitter)@dschwarz26LinkedIn
FutureSearchdocs
Your research team
Installation
  • All install methods
  • Claude.ai
  • Claude Cowork
  • Claude Code
  • Web App
  • Python SDK
  • Skill
  • MCP Server
Reference
  • API Key
  • classify
  • dedupe
  • forecast
  • merge
  • rank
  • agent_map
  • screen
  • Progress Monitoring
  • Chaining Operations
Guides
  • LLM-Powered Data Labeling
  • Add a Column via Web Research
  • Classify and Label Rows
  • Deduplicate Training Data
  • Filter a Dataset Intelligently
  • Join Tables Without Shared Keys
  • Rank Data by External Metrics
  • Resolve Duplicate Entities
  • Scale Deduplication to 20K Rows
Case Studies
  • Deduplicate Contact Lists
  • Deduplicate CRM Records
  • Enrich Contacts with Company Data
  • Fuzzy Match Across Tables
  • Link Records Across Medical Datasets
  • LLM Cost vs. Accuracy
  • Merge Costs and Speed
  • Merge Thousands of Records
  • Multi-Stage Lead Qualification
  • Research and Rank Web Data
  • Run 10,000 LLM Web Research Agents
  • Score Cold Leads via Web Research
  • Score Leads from Fragmented Data
  • Screen 10,000 Rows
  • Screen Job Listings
  • Screen Stocks by Economic Sensitivity
  • Screen Stocks by Investment Thesis
FutureSearchby futuresearch
by futuresearch

Join Tables Without Shared Keys

Go to futuresearch.ai/app, upload company_info.csv and valuations.csv, and enter:

Merge company_info.csv with valuations.csv. The first table has company names, the second has stock tickers. Match companies to their stock tickers.

437 of 438 companies matched (99.8%). Results take about 11 minutes.

Add the everyrow connector if you haven't already. Then upload company_info.csv and valuations.csv and ask Claude:

Merge company_info.csv with valuations.csv. The first table has company names, the second has stock tickers. Match companies to their stock tickers.

437 of 438 companies matched (99.8%). Results take about 11 minutes.

Claude Code can merge two CSVs when they share a key column. But what if one table has company names and the other has stock tickers, and there's no shared column to join on?

Here, we get Claude Code to join two S&P 500 tables (438 rows) where one has company names and the other has stock tickers. No merge key exists.

MetricValue
Rows processed438
Matched437 (99.8%)
Cost$0.82
Time~10.9 minutes

Add everyrow to Claude Code if you haven't already:

claude mcp add futuresearch --scope project --transport http https://mcp.futuresearch.ai/mcp

Download company_info.csv (company names, price, market cap) and valuations.csv (tickers, fair value). With both files in your working directory, tell Claude:

Merge company_info.csv with valuations.csv. The first table has company names,
the second has stock tickers. Match companies to their stock tickers.

Claude calls everyrow's merge MCP tool to match the tables using semantic understanding:

Tool: everyrow_merge
├─ task: "Match companies to their stock tickers"
├─ left_csv: "/Users/you/company_info.csv"
└─ right_csv: "/Users/you/valuations.csv"

→ Submitted: 438 rows for merging.
  Session: https://futuresearch.ai/sessions/5c33bade-f142-44b0-bed5-9e5a846cdc3c
  Task ID: 5c33...

Tool: everyrow_progress
├─ task_id: "5c33..."
→ Running: 0/438 complete (18s elapsed)

...

Tool: everyrow_progress
→ Completed: 438/438 (0 failed) in 653s.

Tool: everyrow_results
├─ task_id: "5c33..."
├─ output_path: "/Users/you/merged_companies.csv"
→ Saved 438 rows to /Users/you/merged_companies.csv

437 of 438 companies matched. View the session.

CompanyTickerFair Value
3MMMM39.18
Abbott LaboratoriesABT119.19
AbbVieABBV180.95
AccentureACN107.79
A. O. SmithAOS6.59

The system figures out that "3M" corresponds to ticker "MMM", "Alphabet Inc." to "GOOGL", and so on. The single unmatched row was Block, Inc. (formerly Square, Inc.), where the 2021 rebrand made the name-to-ticker match difficult.

When you need to join two pandas DataFrames but there's no shared ID column, pd.merge() won't help. The everyrow SDK uses LLM-powered semantic matching to join tables even when names are spelled differently across sources.

In this example, we join two S&P 500 tables (438 rows) where one has company names and the other has stock tickers.

MetricValue
Rows processed438
Accuracy100%
Cost$1.00
Time~30 seconds
pip install everyrow
export EVERYROW_API_KEY=your_key_here  # Get one at futuresearch.ai

Download company_info.csv and valuations.csv, or run the full notebook.

import asyncio
import pandas as pd
from everyrow.ops import merge

# Two tables from different sources - no shared column
companies = pd.read_csv("company_info.csv")    # has: company, price, mkt_cap, shares
valuations = pd.read_csv("valuations.csv")      # has: ticker, fair_value

async def main():
    result = await merge(
        task="Match companies to their stock tickers",
        left_table=companies,
        right_table=valuations,
    )

    # The result is a DataFrame with all columns joined
    print(result.data.head())

    #                company   price      mkt_cap     shares  ticker  fair_value
    # 0                   3M  101.74  61.70678828  606514530     MMM       39.18
    # 1          A. O. Smith   32.38  4.904416495  151464376     AOS        6.59
    # 2  Abbott Laboratories   34.87  51.22933139 1469152033     ABT      119.19

asyncio.run(main())

The SDK figures out that "3M" corresponds to ticker "MMM", "Alphabet Inc." to "GOOGL", and so on. No merge columns are specified because there's nothing to match on directly.

The merge operation uses a cascade of matching strategies, stopping at the simplest one that works for each row:

StrategyWhen UsedCost
Exact matchIdentical stringsFree
Fuzzy matchTypos, case differencesFree
LLM matchSemantic equivalence (company → ticker)~$0.002/row
Web searchStale or obscure data~$0.01/row

For the company-to-ticker merge above, 99.8% of rows matched via LLM reasoning alone. The remaining 0.2% required a quick web lookup.

This approach works well when your tables represent the same entities but use different identifiers: company names vs tickers, product names vs SKUs, subsidiary names vs parent companies. For tables that do share a common column, the SDK will use exact matching first and only escalate to more expensive methods when needed.


Built with everyrow. See the full analysis with multiple experiments in the merge tutorial notebook.