MCP Server Reference

Quick setup: Users should use the hosted remote server. Add it to Claude.ai or Claude Code. The reference below describes the hosted server.

The FutureSearch MCP server is called directly by Claude Code, Claude.ai, and other MCP clients; no Python code is needed. Operation tools follow an async pattern: submit a task, poll for progress, retrieve results. This lets long-running operations (1–10+ minutes) work reliably across MCP clients.

Data ingestion

futuresearch_upload_data

Upload a CSV from a URL (or local path in stdio mode).

Parameter | Type | Required | Description
source | string | Yes | HTTP(S) URL (Google Sheets and Drive URLs are auto-converted to CSV) or absolute local file path (stdio mode only).
session_id | string | No | Add to an existing session.
session_name | string | No | Name for the new (or renamed) session.

Returns artifact_id, session_id, row count, and column names.

In HTTP mode, local file paths are rejected — use futuresearch_request_upload_url instead.

futuresearch_request_upload_url

Request a presigned URL for uploading a local CSV (HTTP mode only). This bypasses the conversation context, so the file contents don't consume tokens.

Parameter | Type | Required | Description
filename | string | Yes | Filename; must end in .csv.
session_id | string | No | Session to attach the upload to.

Returns a presigned URL, expiration, the max file size (50 MB), and a ready-to-execute curl command. The response from the upload contains the artifact_id.

futuresearch_browse_lists

Browse FutureSearch's built-in reference lists — companies (S&P 500, FTSE 100, sector breakdowns), geography (countries, US states, cities), people (billionaires, heads of state), institutions (universities, regulators), and more.

Call with no parameters to see all lists. Browsing the full list (~60 entries) is usually better than filtering, since the filters require knowing what's there.

Parameter | Type | Required | Description
search | string | No | Match against list names (case-insensitive).
category | string | No | Filter by category (e.g. "Finance", "Geography").

Returns names, fields, and artifact_ids to pass to futuresearch_use_list.

futuresearch_use_list

Import a reference list into your session.

Parameter | Type | Required | Description
artifact_id | string | Yes | An artifact_id from futuresearch_browse_lists.

Returns task_id, artifact_id, row count, and column names. Pass the returned artifact_id to any operation tool.

Operations

Operation tools dispatch long-running tasks against your data. They share the conventions described in this section (input source, session management, async submission, custom response schemas); the per-tool reference tables below list only tool-specific parameters.

Input source

Every operation tool that processes a table accepts exactly one of:

Parameter | Type | Description
artifact_id | string (UUID) | A previously-uploaded dataset. Returned by futuresearch_upload_data, futuresearch_request_upload_url, or futuresearch_use_list.
data | JSON list of objects | Inline rows. Each object is one row; keys are column names. Capped at 5,000 rows — use uploads for larger datasets.

futuresearch_single_agent and futuresearch_merge are exceptions; see their sections below for input details.
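
As a concrete sketch of these conventions, here are the two input forms side by side, plus a small validator enforcing the exactly-one-of rule and the 5,000-row inline cap. The argument names come from the table above; the validator and sample values are our own illustration, not part of the server.

```python
def validate_input_source(args: dict) -> str:
    """Return which input form is used, enforcing exactly-one-of.

    Illustrative only -- the server performs its own validation.
    """
    has_artifact = "artifact_id" in args
    has_data = "data" in args
    if has_artifact == has_data:  # both set, or neither
        raise ValueError("Provide exactly one of artifact_id or data")
    if has_data:
        if len(args["data"]) > 5000:
            raise ValueError("Inline data is capped at 5,000 rows; upload instead")
        return "data"
    return "artifact_id"

# Form 1: reference a previously uploaded dataset (hypothetical UUID)
by_artifact = {"artifact_id": "3f2a7c1e-0000-0000-0000-000000000000"}

# Form 2: pass small tables inline; each object is a row, keys are columns
by_inline = {"data": [{"company": "Acme", "city": "Berlin"}]}
```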

Session management

Every operation tool optionally accepts:

Parameter | Type | Description
session_id | string (UUID) | Add this task to an existing session.
session_name | string | Human-readable name for the session. If session_id is also set, renames the existing session.

When both are omitted, a new session is auto-created.

Async submission

Operation tools return immediately with a task_id. Track progress in one of two ways:

  • MCP clients with widget UIs (Claude.ai, Claude Desktop): call futuresearch_status once. The widget polls automatically and renders results when the task completes.
  • All other clients: poll futuresearch_progress until the task is done, then call futuresearch_results.

Example end-to-end flow:

1. futuresearch_upload_data(source="https://example.com/leads.csv")
   → artifact_id, session_id

2. futuresearch_agent(
     task="Find each company's latest funding round and lead investors",
     artifact_id=...,
   )
   → task_id (~0.6s)

3. futuresearch_progress(task_id=...)
   → "Running: 5/50 complete, 8 running (15s elapsed)" + cursor

4. futuresearch_progress(task_id=..., cursor=...)
   → "Completed: 49 succeeded, 1 failed (142s total)"

5. futuresearch_results(task_id=...)
   → preview rows, total count, download URL

In MCP clients with widget UIs, substitute steps 3–5 with a single futuresearch_status(task_id=...) call — the widget handles polling and result display automatically.

Custom response schemas

All tools that accept response_schema take a JSON Schema object:

{
  "type": "object",
  "properties": {
    "annual_revenue": {
      "type": "integer",
      "description": "Annual revenue in USD"
    },
    "employee_count": {
      "type": "integer",
      "description": "Number of employees"
    }
  },
  "required": ["annual_revenue"]
}

Constraints:

  • Top-level type must be object.
  • properties must be a non-empty object with at most 50 entries.
  • Allowed property types: string, integer, number, boolean, array, object.
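
A minimal Python sketch of these constraints, useful for checking a schema before submission. The validator is illustrative, not the server's actual implementation:

```python
ALLOWED_TYPES = {"string", "integer", "number", "boolean", "array", "object"}

def validate_response_schema(schema: dict) -> None:
    """Check the documented response_schema constraints (sketch only)."""
    if schema.get("type") != "object":
        raise ValueError("Top-level type must be 'object'")
    props = schema.get("properties")
    if not isinstance(props, dict) or not props or len(props) > 50:
        raise ValueError("properties must be a non-empty object with at most 50 entries")
    for name, spec in props.items():
        if spec.get("type") not in ALLOWED_TYPES:
            raise ValueError(f"Property {name!r} has disallowed type {spec.get('type')!r}")

schema = {
    "type": "object",
    "properties": {
        "annual_revenue": {"type": "integer", "description": "Annual revenue in USD"},
        "employee_count": {"type": "integer", "description": "Number of employees"},
    },
    "required": ["annual_revenue"],
}
validate_response_schema(schema)  # passes silently
```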

futuresearch_rank

Score and sort rows by a qualitative criterion.

Parameter | Type | Required | Description
task | string | Yes | Description of how to assign a score to an individual row.
field_name | string | Yes | Output column name for the score.
field_type | string | No | float (default), int, str, or bool. Only used if response_schema is not specified.
ascending_order | bool | No | true = lowest first (default).
response_schema | object | No | JSON Schema for additional output columns. See Custom Response Schemas. If specified, must contain field_name as a top-level property. Overrides field_type.
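
For example, a hypothetical futuresearch_rank payload that adds a score column plus an extra rationale column. Note the documented constraint that field_name must appear as a top-level property of response_schema; the task text and field names here are invented:

```python
# Hypothetical argument payload for futuresearch_rank (names per the table above).
rank_args = {
    "task": "Score each company 0-10 on fit with a climate-tech investment thesis",
    "field_name": "thesis_fit",
    "ascending_order": False,  # highest scores first
    "response_schema": {
        "type": "object",
        "properties": {
            "thesis_fit": {"type": "number", "description": "Fit score, 0-10"},
            "rationale": {"type": "string", "description": "One-sentence justification"},
        },
        "required": ["thesis_fit"],
    },
}

# Documented constraint: field_name must be a top-level property of response_schema.
assert rank_args["field_name"] in rank_args["response_schema"]["properties"]
```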

futuresearch_classify

Assign each row to one of the provided categories.

Parameter | Type | Required | Description
task | string | Yes | Classification instructions.
categories | list[string] | Yes | Allowed categories (minimum 2). Each row is assigned exactly one.
classification_field | string | No | Output column name (default: "classification").
include_reasoning | bool | No | Include a reasoning column (default: false).

futuresearch_dedupe

Group semantic duplicates and either select a representative, combine rows, or just label clusters.

Parameter | Type | Required | Description
equivalence_relation | string | Yes | Natural-language description of what makes two rows duplicates.
strategy | string | No | select (default): pick the best representative per cluster. combine: synthesize a single combined row per cluster. identify: cluster only, keep all rows.
strategy_prompt | string | No | Instructions guiding selection or combination (only with select and combine).
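
As an illustration, a hypothetical argument payload plus a small check of the documented strategy rules (our own sketch; the server performs its own validation):

```python
def check_dedupe_args(args: dict) -> None:
    """Sketch of the documented strategy rules; not the server's validator."""
    strategy = args.get("strategy", "select")
    if strategy not in ("select", "combine", "identify"):
        raise ValueError(f"Unknown strategy: {strategy!r}")
    if "strategy_prompt" in args and strategy == "identify":
        raise ValueError("strategy_prompt only applies to select and combine")

# Invented example payload
dedupe_args = {
    "equivalence_relation": (
        "Two rows are duplicates if they refer to the same person, "
        "even across nicknames, typos, or different email domains"
    ),
    "strategy": "combine",
    "strategy_prompt": "Keep the most recent job title and merge all known emails",
}
check_dedupe_args(dedupe_args)  # passes silently
```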

futuresearch_merge

LEFT JOIN two tables using LLM-powered entity matching.

The left table is being enriched (all rows kept). The right table is the lookup/reference table: its columns are appended to matches. Unmatched left rows get nulls.

Provide exactly one of left_artifact_id / left_data, and exactly one of right_artifact_id / right_data.

Parameter | Type | Required | Description
task | string | Yes | How to match rows between tables.
left_artifact_id | string | * | Left table artifact ID.
left_data | list[object] | * | Inline left table rows.
right_artifact_id | string | * | Right table artifact ID.
right_data | list[object] | * | Inline right table rows.
merge_on_left | string | No | Column name in the left table. Set only if you expect exact string matches on this column, or want to draw agent attention to it.
merge_on_right | string | No | Same, for the right table.
relationship_type | string | No | many_to_one (default), one_to_one, one_to_many, or many_to_many. For *_to_many, multiple matches are joined with " | " in each added column.
use_web_search | string | No | auto (default), yes, or no.
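
To make the *_to_many behavior concrete, here is a sketch (with invented data) of how multiple right-table matches collapse into the added columns, joined with " | ". The helper is our illustration of the documented behavior, not server code:

```python
# Two hypothetical right-table rows matched to one left row in a one_to_many merge.
matches = [
    {"contact_name": "A. Chen", "contact_email": "achen@example.com"},
    {"contact_name": "B. Ruiz", "contact_email": "bruiz@example.com"},
]

def collapse_matches(matches: list[dict]) -> dict:
    """Join each added column's values across matches with ' | '."""
    columns = matches[0].keys()
    return {col: " | ".join(str(m[col]) for m in matches) for col in columns}

enriched = collapse_matches(matches)
# enriched["contact_name"] == "A. Chen | B. Ruiz"
```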

futuresearch_forecast

Forecast questions about the future — binary, numeric, or date.

The input table needs at minimum a question column. Recommended: resolution_criteria, resolution_date, background.

Parameter | Type | Required | Description
forecast_type | string | Yes | binary: probability (0–100) for yes/no questions. numeric: percentile estimates (p10–p90) for quantity questions. date: date percentile estimates for timing questions.
context | string | No | Table-level context that applies to every row.
effort_level | string | No | LOW (faster, cheaper) or HIGH (more accurate). Defaults: HIGH for a single question, LOW for multiple questions.
output_field | string | Yes for numeric/date | Name of the quantity being forecast (e.g. "price", "launch_date").
units | string | Yes for numeric | Units (e.g. "USD per barrel", "thousands").

Output columns by mode:

  • binary: probability (int, 0–100) and rationale (string).
  • numeric: {output_field}_p10 … {output_field}_p90 (float), units (string), rationale (string).
  • date: {output_field}_p10 … {output_field}_p90 (YYYY-MM-DD strings), rationale (string).
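
A small sketch of consuming these columns. Since the percentile columns between p10 and p90 are not enumerated here, the helper simply collects whatever {output_field}_pNN columns are present; the row values are invented:

```python
import re

# Hypothetical numeric-forecast output row for output_field="price"; only the
# two documented endpoint percentiles are hard-coded here.
row = {
    "price_p10": 61.0,
    "price_p90": 89.0,
    "units": "USD per barrel",
    "rationale": "Supply constraints vs. demand softening (invented).",
}

def percentiles(row: dict, output_field: str) -> dict:
    """Collect all {output_field}_pNN columns, keyed and sorted by percentile."""
    pat = re.compile(rf"^{re.escape(output_field)}_p(\d+)$")
    found = {}
    for key, value in row.items():
        m = pat.match(key)
        if m:
            found[int(m.group(1))] = value
    return dict(sorted(found.items()))

spread = percentiles(row, "price")
# spread == {10: 61.0, 90: 89.0}
```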

futuresearch_agent

Run a web research agent on each row.

Parameter | Type | Required | Description
task | string | Yes | What to research per row.
response_schema | object | No | Output structure as JSON Schema. Defaults to {"answer": string}.
effort_level | string | No | low, medium (default), or high. Set to null to use llm / iteration_budget / include_reasoning instead.
llm | string | No | Specific LLM (e.g. CLAUDE_4_6_SONNET_MEDIUM). Only used when effort_level is null.
iteration_budget | int (0–20) | No | Max agent iterations per row. Only used when effort_level is null.
include_reasoning | bool | No | Include reasoning notes. Only used when effort_level is null.
enforce_row_independence | bool | No | If true, run each row independently without sharing context across rows (default false).
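
Two hypothetical argument payloads illustrating the effort_level switch: a preset level, and manual control with effort_level set to null so the fine-grained parameters take effect (task text and values invented):

```python
# Preset effort level (the common case):
preset = {
    "task": "Find each company's latest funding round and lead investors",
    "effort_level": "high",
}

# Manual control: effort_level null (None in Python) activates the
# llm / iteration_budget / include_reasoning parameters.
manual = {
    "task": "Find each company's latest funding round and lead investors",
    "effort_level": None,
    "llm": "CLAUDE_4_6_SONNET_MEDIUM",
    "iteration_budget": 12,
    "include_reasoning": True,
}
```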

futuresearch_single_agent

Run one web research agent on a single question, with no input table. Use this for one-off research.

Parameter | Type | Required | Description
task | string | Yes | What to research.
input_data | object | No | Optional context as inline JSON (e.g. {"company": "Acme"}).
response_schema | object | No | Output structure as JSON Schema. Defaults to {"answer": string}.
return_table | bool | No | Set true when the task asks for a list of items (e.g. "find 15 startups"). Pair with response_schema defining the per-item fields. Default false (single result row).
effort_level, llm, iteration_budget, include_reasoning | — | No | Same as futuresearch_agent.

This tool does not take artifact_id / data; it does support session_id / session_name.
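
A hypothetical payload for a list-style task, pairing return_table with a per-item response_schema (task text and fields invented):

```python
# Hypothetical futuresearch_single_agent arguments for a list-producing task.
single_args = {
    "task": "Find 15 European climate-tech startups founded since 2020",
    "return_table": True,  # the task asks for a list of items
    "response_schema": {  # fields describe each item in the returned table
        "type": "object",
        "properties": {
            "name": {"type": "string", "description": "Startup name"},
            "country": {"type": "string", "description": "HQ country"},
            "founded": {"type": "integer", "description": "Founding year"},
        },
    },
}
```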

futuresearch_multi_agent

Deep parallel research: deploy 3–6 direction agents per row exploring different angles, then synthesize. Use when completeness or depth matters more than per-row cost — e.g. enumerating "all AI startups in Europe", or answers that benefit from parallel investigation across distinct sources, geographies, or methodologies.

Parameter | Type | Required | Description
task | string | Yes | What to research per row.
directions | list[string] (max 6) | No | Explicit research angles. Each should be a detailed, self-contained brief, not a short title. Auto-generated from task if omitted.
response_schema | object | No | Output structure for the synthesized result. Defaults to {"answer": string}.
effort_level | string | No | low (3 agents per row), medium (4, default), high (6).

Monitoring and retrieval

futuresearch_progress

Poll a running task. Blocks briefly server-side to limit polling rate. After receiving partial results, briefly comment on the new rows for the user, then immediately call futuresearch_progress again, passing the cursor from the previous response, until the task is terminal.

Parameter | Type | Required | Description
task_id | string | Yes | The task ID.
cursor | string | No | Cursor from the previous response. Pass it to receive only new rows and summaries. Omit on the first call.

Returns status text with completion counts, elapsed time, optional newly-completed rows, and agent summaries.
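
For clients without widget UIs, the cursor-threaded polling loop can be sketched as below. `call_tool` stands in for whatever MCP client invocation you use, and the response shape (status and cursor keys) is our assumption, not a documented wire format:

```python
import time

def poll_until_done(call_tool, task_id: str, interval: float = 5.0) -> dict:
    """Poll futuresearch_progress, threading the cursor, until the task ends.

    `call_tool(name, args)` is a stand-in for the MCP client; it is assumed
    to return a dict with "status" and "cursor" keys (illustrative shape).
    """
    cursor = None
    while True:
        args = {"task_id": task_id}
        if cursor is not None:
            args["cursor"] = cursor  # receive only new rows and summaries
        resp = call_tool("futuresearch_progress", args)
        # Terminal statuses assumed for illustration.
        if resp["status"] in ("completed", "failed", "cancelled"):
            return resp
        cursor = resp.get("cursor")
        time.sleep(interval)
```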

futuresearch_status

Render a live-updating progress widget. Use this in MCP clients that support widgets (Claude.ai, Claude Desktop). The widget polls automatically and displays results when complete. Do not also call futuresearch_progress.

Parameter | Type | Required | Description
task_id | string | Yes | The task ID.

futuresearch_results

Retrieve results from a completed task.

In HTTP mode (the hosted server):

Parameter | Type | Required | Description
task_id | string | Yes | The completed task.
offset | int | No | Row offset for pagination (default 0).
page_size | int (1–10000) | No | Rows to load into the response. For tasks with ≤10 rows, set to the total. For larger tasks, use the page_size value reported in the progress completion message.

Returns a download URL, total row count, and the requested page of rows.
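
Pagination can be sketched as below; as with the polling example, `call_tool` is a stand-in for your MCP client, and the rows/total_rows response fields are assumed for illustration:

```python
def fetch_all_rows(call_tool, task_id: str, page_size: int = 100) -> list:
    """Page through futuresearch_results until all rows are collected.

    Assumes each response exposes "rows" (the page) and "total_rows"
    (overall count) -- an illustrative shape, not a documented one.
    """
    rows, offset = [], 0
    while True:
        resp = call_tool(
            "futuresearch_results",
            {"task_id": task_id, "offset": offset, "page_size": page_size},
        )
        rows.extend(resp["rows"])
        offset += len(resp["rows"])
        if offset >= resp["total_rows"] or not resp["rows"]:
            break
    return rows
```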

futuresearch_cancel

Cancel a running task.

Parameter | Type | Required | Description
task_id | string | Yes | The task ID to cancel.

Returns confirmation. If the task already completed, returns an error with its current state.

futuresearch_list_sessions

List sessions owned by the user, paginated.

Parameter | Type | Required | Description
offset | int | No | Sessions to skip (default 0).
limit | int (1–1000) | No | Sessions per page (default 25).

Returns names, IDs, timestamps, and dashboard URLs.

futuresearch_list_session_tasks

List all tasks within a session.

Parameter | Type | Required | Description
session_id | string | Yes | The session ID (UUID).

Returns task IDs, types, statuses, timestamps, and any output artifact_ids.

futuresearch_balance

Check the user's billing balance. No parameters.

Returns the current balance in dollars.

futuresearch_task_cost

Get the billed cost of a completed task. There's a delay between task completion and cost calculation; returns pending if not yet settled.

Parameter | Type | Required | Description
task_id | string | Yes | The task ID.

Troubleshooting

Auth flow not completing

If the OAuth sign-in window opens but authentication doesn't complete:

  • Ensure pop-ups are not blocked in your browser
  • Try closing and reopening the MCP connection
  • For Claude Code HTTP mode, run /mcp and re-authenticate from the MCP panel

Task stuck in progress

If futuresearch_progress keeps returning a running state for an extended period:

  • Large datasets (1000+ rows) can take 10+ minutes — this is normal
  • Use futuresearch_cancel to stop the task and retry with a smaller dataset

Results appear empty

If futuresearch_results returns fewer rows than expected:

  • Some rows may have failed processing — re-run the operation on the filtered subset
  • Ensure the input CSV was well-formed (proper headers, no encoding issues)

Token budget exceeded

If you get a token budget error when submitting an operation:

  • Upload the input data to the URL returned by futuresearch_request_upload_url. This consumes less context than passing inline data to the operation tool.

If you get a token budget error when fetching results:

  • Save the results to a local file using the download URL returned by futuresearch_results.
  • Call futuresearch_results with a smaller page_size.

Privacy & Support

  • Privacy Policy: futuresearch.ai/privacy
  • Terms of Service: futuresearch.ai/terms
  • Support: support@futuresearch.ai