Deduplicate Contact Lists
Identifying the same person across two contact lists requires semantic matching that recognizes "Dr. Sarah Chen" and "S. Chen", or "Robert Johnson" and "Bob Johnson", as the same person. This case study merges two overlapping contact lists and finds the duplicates despite differing name formats.
| Metric | Value |
|---|---|
| Left list | 12 contacts |
| Right list | 10 contacts |
| Matched pairs | 7 |
| Total cost | $0.00 |
| Time | 128 seconds |
Add FutureSearch to Claude Code if you haven't already:
claude mcp add futuresearch --scope project --transport http https://mcp.futuresearch.ai/mcp
With both contact CSVs in your working directory, tell Claude:
Merge these two contact lists to find the same person across both. Account for
nicknames (Bob/Robert, Mike/Michael, Tom/Thomas), initials (S. Chen = Sarah Chen),
and institution matching. When in doubt, favor false negatives over false positives.
Claude calls FutureSearch's merge MCP tool:
Tool: futuresearch_merge
├─ task: "Match contacts between two lists to identify the same person..."
├─ left_csv: "/Users/you/contacts_list_a.csv"
├─ right_csv: "/Users/you/contacts_list_b.csv"
├─ merge_on_left: "name"
├─ merge_on_right: "full_name"
└─ relationship_type: "one_to_one"
→ Submitted: 12 rows for merging.
Session: https://futuresearch.ai/sessions/1d39b32d-d71e-48e8-8ac7-de907f86745a
Tool: futuresearch_results
→ Saved 12 rows to /Users/you/merged_contacts.csv
7 matches found, 5 correctly left unmatched. The session link above shows the full run.
Add the FutureSearch connector if you haven't already. Then upload both contact CSVs and ask Claude:
Merge these two contact lists to find the same person across both. Account for nicknames (Bob/Robert, Mike/Michael, Tom/Thomas), initials (S. Chen = Sarah Chen), and institution matching. When in doubt, favor false negatives over false positives.
Go to futuresearch.ai/app, upload both contact CSVs, and enter:
Merge these two contact lists to find the same person across both. Account for nicknames (Bob/Robert, Mike/Michael, Tom/Thomas), initials (S. Chen = Sarah Chen), and institution matching. When in doubt, favor false negatives over false positives.
pip install futuresearch
export FUTURESEARCH_API_KEY=your_key_here # Get one at futuresearch.ai/app/api-key
import asyncio
import pandas as pd
from futuresearch import create_session
from futuresearch.ops import merge

# Load both contact lists from the working directory.
list_a = pd.read_csv("contacts_list_a.csv")
list_b = pd.read_csv("contacts_list_b.csv")

async def main():
    # The session groups this merge under one named run.
    async with create_session(name="Contact List Merge") as session:
        result = await merge(
            session=session,
            task="""
                Match contacts between two lists to identify the same person.
                Account for nicknames (Bob/Robert, Mike/Michael, Tom/Thomas),
                initials (S. Chen = Sarah Chen), and institution matching.
                Favor false negatives over false positives.
            """,
            left_table=list_a,
            right_table=list_b,
            # Name columns differ between the two files.
            merge_on_left="name",
            merge_on_right="full_name",
        )
        return result.data

merged = asyncio.run(main())
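Before submitting a merge, it can help to confirm that each CSV actually contains the column named in `merge_on_left` or `merge_on_right`. The following is a minimal sketch using only the standard library; `check_join_column` is a hypothetical helper written for this example and is not part of the FutureSearch SDK.

```python
import csv

def check_join_column(lines, column):
    """Return the number of data rows; raise if `column` is missing or blank.

    `lines` is any iterable of CSV text lines, such as an open file.
    Hypothetical helper for illustration, not a FutureSearch function.
    """
    reader = csv.DictReader(lines)
    if column not in (reader.fieldnames or []):
        raise ValueError(f"missing column {column!r}; found {reader.fieldnames}")
    rows = list(reader)
    # Row positions count the header as row 1.
    blank = [n for n, row in enumerate(rows, start=2)
             if not (row[column] or "").strip()]
    if blank:
        raise ValueError(f"blank {column!r} in rows {blank}")
    return len(rows)

# Typical use with the files from this case study:
# with open("contacts_list_a.csv", newline="", encoding="utf-8") as f:
#     check_join_column(f, "name")
# with open("contacts_list_b.csv", newline="", encoding="utf-8") as f:
#     check_join_column(f, "full_name")
```

A check like this catches a mistyped column name locally, before any rows are submitted.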
Results
| List A | List B | Match Type |
|---|---|---|
| Dr. Sarah Chen | S. Chen | Initial + institution |
| Michael O'Brien | Mike O'Brien | Nickname |
| James Wilson | James R. Wilson | Middle initial |
| Robert Johnson | Bob Johnson | Nickname |
| Thomas Lee | Tom Lee | Nickname |
| Priya Sharma | Priya S. | Initial |
| Elena Rodriguez | Elena R. | Initial |
David Kim, Anna Kowalski, Maria Santos, Jennifer Park, and Christopher Davis were correctly left unmatched because they have no counterpart in the other list. The merge resolved every name variation through fuzzy matching alone, with no LLM calls, which is consistent with the $0.00 total cost.
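To give a sense of the kind of rules that can resolve these variations without an LLM, here is a small illustrative sketch. It is not FutureSearch's implementation: the nickname table, the title list, and the function names are assumptions made for this example, and it compares names only, ignoring the institution signal mentioned in the prompt.

```python
# Illustrative only: a hand-written nickname table covering the pairs
# named in the prompt. A real matcher would need a far larger table.
NICKNAMES = {"bob": "robert", "mike": "michael", "tom": "thomas"}
TITLES = {"dr", "mr", "mrs", "ms", "prof"}

def tokens(name):
    """Lowercase, strip punctuation and titles, expand known nicknames."""
    parts = [p.strip(".,").lower() for p in name.split()]
    parts = [p for p in parts if p and p not in TITLES]
    return [NICKNAMES.get(p, p) for p in parts]

def token_match(a, b):
    """Tokens agree if equal, or if one is a one-letter initial of the other."""
    if a == b:
        return True
    short, long_ = sorted((a, b), key=len)
    return len(short) == 1 and long_.startswith(short)

def same_person(left, right):
    """Require both the first and the last name tokens to agree."""
    l, r = tokens(left), tokens(right)
    if len(l) < 2 or len(r) < 2:
        return False  # too little information: prefer a false negative
    return token_match(l[0], r[0]) and token_match(l[-1], r[-1])

pairs = [
    ("Dr. Sarah Chen", "S. Chen"),
    ("Michael O'Brien", "Mike O'Brien"),
    ("James Wilson", "James R. Wilson"),
    ("Robert Johnson", "Bob Johnson"),
    ("Thomas Lee", "Tom Lee"),
    ("Priya Sharma", "Priya S."),
    ("Elena Rodriguez", "Elena R."),
]
matched = [same_person(a, b) for a, b in pairs]
```

Every pair in the results table passes this check, while unrelated names such as "David Kim" and "Anna Kowalski" do not. Initial-only matches like "Priya S." are the riskiest case under a name-only rule, which is where an additional signal such as institution helps keep false positives down.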