by futuresearch

Classify

classify takes a DataFrame and a list of allowed categories, then assigns each row to exactly one category. It uses web research to decide, and the amount of research scales with the difficulty of the classification. It supports binary (yes/no) and multi-category classification, with optional reasoning output.

Examples

GICS sector classification

from pandas import DataFrame
from everyrow.ops import classify

companies = DataFrame([
    {"company": "Apple"},
    {"company": "JPMorgan Chase"},
    {"company": "ExxonMobil"},
    {"company": "Pfizer"},
    {"company": "Procter & Gamble"},
    {"company": "Tesla"},
    {"company": "AT&T"},
    {"company": "Caterpillar"},
    {"company": "Duke Energy"},
    {"company": "Simon Property Group"},
])

result = await classify(
    task="Classify this company by its GICS industry sector",
    categories=[
        "Energy", "Materials", "Industrials", "Consumer Discretionary",
        "Consumer Staples", "Health Care", "Financials",
        "Information Technology", "Communication Services",
        "Utilities", "Real Estate",
    ],
    input=companies,
)
print(result.data[["company", "classification"]])

Output:

| company | classification |
| --- | --- |
| Apple | Information Technology |
| JPMorgan Chase | Financials |
| ExxonMobil | Energy |
| Pfizer | Health Care |
| Procter & Gamble | Consumer Staples |
| Tesla | Consumer Discretionary |
| AT&T | Communication Services |
| Caterpillar | Industrials |
| Duke Energy | Utilities |
| Simon Property Group | Real Estate |
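
Because each row is assigned exactly one of the allowed categories, a quick sanity check on the labels is cheap to write. The sketch below is a minimal, standard-library-only illustration and not part of the SDK: `find_invalid_labels` is a hypothetical helper, and the rows are copied from the output table above. With the SDK you could probably build the same list with `result.data.to_dict("records")`, assuming `result.data` is a pandas DataFrame as the indexing in the example suggests.

```python
from collections import Counter

# The eleven categories passed to classify in the GICS example.
GICS_SECTORS = [
    "Energy", "Materials", "Industrials", "Consumer Discretionary",
    "Consumer Staples", "Health Care", "Financials",
    "Information Technology", "Communication Services",
    "Utilities", "Real Estate",
]

# Rows copied from the documented output table (not fetched live).
rows = [
    {"company": "Apple", "classification": "Information Technology"},
    {"company": "JPMorgan Chase", "classification": "Financials"},
    {"company": "ExxonMobil", "classification": "Energy"},
    {"company": "Pfizer", "classification": "Health Care"},
    {"company": "Procter & Gamble", "classification": "Consumer Staples"},
    {"company": "Tesla", "classification": "Consumer Discretionary"},
    {"company": "AT&T", "classification": "Communication Services"},
    {"company": "Caterpillar", "classification": "Industrials"},
    {"company": "Duke Energy", "classification": "Utilities"},
    {"company": "Simon Property Group", "classification": "Real Estate"},
]


def find_invalid_labels(rows, categories, field="classification"):
    """Hypothetical helper: return rows whose label is not an allowed category."""
    allowed = set(categories)
    return [row for row in rows if row.get(field) not in allowed]


invalid = find_invalid_labels(rows, GICS_SECTORS)

# Tally labels and list any categories that no row was assigned to.
counts = Counter(row["classification"] for row in rows)
unused = [sector for sector in GICS_SECTORS if sector not in counts]

print("invalid rows:", invalid)
print("unused categories:", unused)
```

A check like this is mostly useful downstream, for example before grouping or joining on the label column.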

Binary classification

For yes/no questions, use two categories:

result = await classify(
    task="Is this company founder-led?",
    categories=["yes", "no"],
    input=companies,
)
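
The labels come back as the strings you supplied, so a binary result is "yes" or "no" rather than a Python bool. If downstream code wants booleans, a small conversion step can help. The following is a minimal sketch under assumptions: `label_to_bool` is a hypothetical helper, and the rows are made-up placeholders, not real classify output.

```python
def label_to_bool(label):
    """Hypothetical helper: map a "yes"/"no" label to True/False."""
    normalized = label.strip().lower()
    if normalized == "yes":
        return True
    if normalized == "no":
        return False
    # Fail loudly instead of silently treating an unexpected label as False.
    raise ValueError(f"unexpected label: {label!r}")


# Placeholder rows for illustration only; these are not real results.
rows = [
    {"company": "Example A", "classification": "yes"},
    {"company": "Example B", "classification": "no"},
]

# Keep only the companies labeled "yes".
founder_led = [
    row["company"] for row in rows if label_to_bool(row["classification"])
]
print(founder_led)
```

Raising on anything other than the two expected labels is a deliberate choice here: it surfaces a mismatch between the categories you passed and the ones you later test for.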

Custom output column and reasoning

result = await classify(
    task="Classify each company by its primary industry sector",
    categories=["Technology", "Finance", "Healthcare", "Energy"],
    input=companies,
    classification_field="sector",
    include_reasoning=True,
)
print(result.data[["company", "sector", "reasoning"]])
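
The examples on this page call classify with a bare `await`, which works where an event loop is already running, such as a Jupyter notebook or the `python -m asyncio` REPL. In a plain script, the call needs to sit inside a coroutine that is started with `asyncio.run`. The sketch below shows only that wrapping pattern: `classify_stub` is a hypothetical stand-in so the snippet runs offline, and in real code you would call `classify` from `everyrow.ops` instead.

```python
import asyncio


async def classify_stub(task, categories, input):
    """Hypothetical stand-in for everyrow.ops.classify (no network calls)."""
    await asyncio.sleep(0)  # yield to the event loop, as a real request would
    return {"task": task, "n_categories": len(categories), "n_rows": len(input)}


async def main():
    # In real code, replace classify_stub with classify from everyrow.ops.
    return await classify_stub(
        task="Is this company founder-led?",
        categories=["yes", "no"],
        input=[{"company": "Apple"}, {"company": "Tesla"}],
    )


# asyncio.run creates an event loop, runs main() to completion, and closes it.
summary = asyncio.run(main())
print(summary)
```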

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| task | str | required | Natural-language instructions describing how to classify each row |
| categories | list[str] | required | Allowed category values (minimum 2). Each row is assigned exactly one. |
| input | DataFrame | required | Rows to classify |
| classification_field | str | "classification" | Name of the output column for the assigned category |
| include_reasoning | bool | False | If True, adds a reasoning column with the agent's justification |
| session | Session | optional | Auto-created if omitted |
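
Since a classify run involves web research, it can be worth checking arguments locally before submitting. The sketch below is a hypothetical client-side pre-flight check, not something the SDK is documented to provide. Only the "minimum 2 categories" rule comes from the table above; the duplicate-category and column-collision checks are assumptions about what you would likely want to avoid, not documented SDK behavior.

```python
def check_classify_args(task, categories, input_columns,
                        classification_field="classification",
                        include_reasoning=False):
    """Hypothetical pre-flight check; returns a list of problem descriptions."""
    problems = []
    if not task or not task.strip():
        problems.append("task is empty")
    # Documented rule: at least two categories are required.
    if len(categories) < 2:
        problems.append("categories needs at least 2 values")
    # Assumption: duplicate category names are almost certainly a mistake.
    if len(set(categories)) != len(categories):
        problems.append("categories contains duplicates")
    # Assumption: an output column that already exists in the input is risky.
    if classification_field in input_columns:
        problems.append(f"input already has a {classification_field!r} column")
    if include_reasoning and "reasoning" in input_columns:
        problems.append("input already has a 'reasoning' column")
    return problems


ok = check_classify_args(
    task="Classify each company by its primary industry sector",
    categories=["Technology", "Finance", "Healthcare", "Energy"],
    input_columns=["company"],
    classification_field="sector",
    include_reasoning=True,
)
bad = check_classify_args(
    task="  ",
    categories=["yes"],
    input_columns=["company", "classification"],
)
print(ok)
print(bad)
```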

Output

One column is added to each input row (its name is set by classification_field), plus a reasoning column when include_reasoning=True:

| Column | Type | Description |
| --- | --- | --- |
| classification | str | One of the provided categories |
| reasoning | str | Agent's justification (only if include_reasoning=True) |

Via MCP

MCP tool: everyrow_classify

| Parameter | Type | Description |
| --- | --- | --- |
| task | string | Classification instructions |
| categories | list[string] | Allowed categories (minimum 2) |
| classification_field | string | Output column name (default: "classification") |
| include_reasoning | boolean | Include reasoning column (default: false) |
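
An MCP client normally builds the tool call for you, but it can help to see roughly what the request looks like. The sketch below assembles a JSON-RPC `tools/call` message in the general shape used by the Model Context Protocol, with only the four parameters listed above. How the input rows are supplied to the tool is not covered by the table, so it is left out here rather than guessed; treat the payload as illustrative, not as a complete working request.

```python
import json

# Arguments limited to the parameters documented in the table above.
arguments = {
    "task": "Classify each company by its primary industry sector",
    "categories": ["Technology", "Finance", "Healthcare", "Energy"],
    "classification_field": "sector",
    "include_reasoning": True,
}

# General shape of an MCP tools/call request (JSON-RPC 2.0).
# The id value is arbitrary; the input data is intentionally omitted.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "everyrow_classify", "arguments": arguments},
}

payload = json.dumps(request, indent=2)
print(payload)
```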