Python SDK

Just want to use everyrow? Go to futuresearch.ai/app, or add it to Claude.ai, Cowork, or Claude Code. This guide is for developers using the Python SDK.

The Python SDK gives you direct access to the utilities for directing your team of researchers. You can use every method documented in the API Reference and control parameters such as the effort level and which LLM to use.

Python SDK with pip

pip install everyrow

Requires Python 3.12+.

Important: be sure to supply your API key when running scripts:

export EVERYROW_API_KEY=sk-cho...
python3 example_script.py
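If the key is missing, requests will fail partway through a run. A script can check for it up front instead; here is a minimal sketch (`require_api_key` is our own helper, not part of the SDK):

```python
import os
import sys


def require_api_key() -> str:
    """Fail fast if EVERYROW_API_KEY is missing from the environment."""
    key = os.environ.get("EVERYROW_API_KEY")
    if not key:
        sys.exit("EVERYROW_API_KEY is not set; export it before running.")
    return key
```

Calling this at the top of `main()` turns a confusing mid-run failure into an immediate, actionable error message.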

Quick example:

import asyncio
import pandas as pd
from everyrow.ops import screen
from pydantic import BaseModel, Field

companies = pd.DataFrame([
    {"company": "Airtable"},
    {"company": "Vercel"},
    {"company": "Notion"},
])

class JobScreenResult(BaseModel):
    qualifies: bool = Field(description="True if company lists jobs with all criteria")

async def main():
    result = await screen(
        task="""Qualifies if: 1. Remote-friendly, 2. Senior, and 3. Discloses salary""",
        input=companies,
        response_model=JobScreenResult,
    )
    print(result.data.head())

asyncio.run(main())

Dependencies

The MCP server requires uv (if using uvx) or pip (if installed directly). The Python SDK requires Python 3.12+.

Sessions

Every operation runs within a session. Sessions group related operations together and appear in your everyrow.io session list.

When you call an operation without an explicit session, one is created automatically. For multiple related operations, create an explicit session:

from everyrow import create_session
from everyrow.ops import screen, rank

async with create_session(name="Lead Qualification") as session:
    # Get the URL to view this session in the dashboard
    print(f"View at: {session.get_url()}")

    # All operations share this session
    screened = await screen(
        session=session,
        task="Has a company email domain (not gmail, yahoo, etc.)",
        input=leads,
        response_model=ScreenResult,
    )

    ranked = await rank(
        session=session,
        task="Score by likelihood to convert",
        input=screened.data,
        field_name="conversion_score",
    )

The session URL lets you monitor progress and inspect results in the web UI while your script runs.

Listing Sessions

Retrieve all your sessions programmatically with list_sessions:

from everyrow import list_sessions

sessions = await list_sessions()
for s in sessions:
    print(f"{s.name} ({s.session_id}) — created {s.created_at:%Y-%m-%d}")
    print(f"  View: {s.get_url()}")

Each item is a SessionInfo with session_id, name, created_at, and updated_at fields.
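Because each SessionInfo carries timestamps, the list can be sorted or filtered client-side. A sketch that picks the most recently updated session (assumes only the fields listed above; `most_recent` is our own helper, not part of the SDK):

```python
from operator import attrgetter


def most_recent(sessions):
    """Return the session with the latest updated_at, or None if none exist."""
    return max(sessions, key=attrgetter("updated_at"), default=None)


# Usage with the SDK (not run here):
#   sessions = await list_sessions()
#   latest = most_recent(sessions)
```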

Async Operations

For long-running jobs, use the _async variants to submit work and continue without blocking:

from everyrow import create_session
from everyrow.ops import rank_async

async with create_session(name="Background Ranking") as session:
    task = await rank_async(
        session=session,
        task="Score by revenue potential",
        input=large_dataframe,
        field_name="score",
    )

    # Task is now running server-side
    print(f"Task ID: {task.task_id}")

    # Do other work...

    # Wait for result when ready
    result = await task.await_result()

    # Or cancel if no longer needed
    await task.cancel()

Save the task ID: if your script crashes, you can recover the result later:

from everyrow import fetch_task_data

df = await fetch_task_data("12345678-1234-1234-1234-123456789abc")
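One way to make that recovery routine is to persist the task ID to disk as soon as the job is submitted, then feed it back to fetch_task_data on restart. A minimal sketch (the file name and helpers are our own convention, not part of the SDK):

```python
from pathlib import Path

# Our own convention for crash recovery, not part of the SDK:
TASK_ID_FILE = Path("last_task_id.txt")


def save_task_id(task_id: str) -> None:
    """Record the submitted task ID so a later run can fetch the result."""
    TASK_ID_FILE.write_text(task_id)


def load_task_id() -> str:
    """Read back the most recently saved task ID."""
    return TASK_ID_FILE.read_text().strip()
```

Call `save_task_id(task.task_id)` right after the `_async` call returns, and `load_task_id()` in a recovery script before calling fetch_task_data.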

Operations

Operation   Description
Classify    Categorize rows into predefined classes
Screen      Filter rows by criteria requiring judgment
Rank        Score rows by qualitative factors
Dedupe      Deduplicate when fuzzy matching fails
Merge       Join tables when keys don't match exactly
Forecast    Predict probabilities for binary questions
Research    Run web agents to research each row

See Also

  • Guides: step-by-step tutorials
  • Case Studies: worked examples
  • Skills vs MCP: integration options