For the detailed technical reference (all parameters, progressive reading, cache behavior, configuration), see TECHNICAL_DOCS.md.

Getting Started

Deploy in under 5 minutes

You need Docker and one of the supported LLM providers: an API key for OpenAI, Anthropic, or Gemini, or a reachable Ollama instance.

  1. Clone the repository
    git clone <repo-url>
    cd grapnel
    cp .env.example .env
  2. Configure your API key

    Edit .env and set your LLM provider and key:

    LLM_PROVIDER=openai
    OPENAI_API_KEY=sk-...
  3. Start the server
    docker compose up -d --build

    Launches three containers: Valkey (cache), SearXNG (search), Grapnel (MCP server).

  4. Connect your AI assistant

    Your server is now available at http://localhost:8881/mcp via streamable-http.
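To confirm the endpoint from step 4 is reachable before wiring up a client, a small smoke check can help. The sketch below is not part of Grapnel; it assumes the server follows the standard MCP Streamable HTTP transport and accepts protocol revision 2025-03-26. The names build_initialize_request and server_is_up are made up for illustration.

```python
import json
import urllib.request

MCP_URL = "http://localhost:8881/mcp"  # endpoint from step 4


def build_initialize_request(url: str = MCP_URL) -> urllib.request.Request:
    """Build the JSON-RPC `initialize` call that opens an MCP session."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumption: the server accepts this protocol revision.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "smoke-check", "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients must accept both response types.
            "Accept": "application/json, text/event-stream",
        },
    )


def server_is_up(url: str = MCP_URL, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers the initialize call with HTTP 200."""
    try:
        request = build_initialize_request(url)
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False
```

Calling server_is_up() after docker compose up should return True once all three containers are healthy; a False result usually means the port or transport differs from the defaults.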

Configuration

All configuration is managed through a single .env file. Copy .env.example and adjust as needed.

Variable           Default                Description
PORT               8881                   MCP server port
TRANSPORT          streamable-http        MCP transport (stdio or streamable-http)
SEARXNG_URL        http://searxng:8080    SearXNG URL (Docker internal)
LLM_PROVIDER       openai                 openai, anthropic, gemini, or ollama
OpenAI
OPENAI_MODEL       gpt-5.5                OpenAI model name
OpenAI-compatible (local / third-party)
OPENAI_API_BASE    (empty)                API base for DeepSeek, llama.cpp, vLLM, etc.
OPENAI_API_KEY     (empty)                API key for the endpoint
OPENAI_MODEL       (empty)                Model name for the endpoint
ANTHROPIC_MODEL    claude-sonnet-4-6      Anthropic model name
GEMINI_MODEL       gemini/gemini-3.1-pro  Gemini model name
OLLAMA_MODEL       llama3.2               Ollama model name
SUB_LM_MODEL       (empty)                Cheaper model for RLM sub-tasks (e.g. gpt-4o-mini)
SUB_LM_PROVIDER    (empty)                Override provider for Sub-LM; falls back to LLM_PROVIDER
SUB_LM_API_KEY     (empty)                Override API key for Sub-LM
SUB_LM_API_BASE    (empty)                Override base URL for Sub-LM
RLM_MAX_STEPS      8                      Max iterations per research query
FETCH_MAX_CHARS    8000                   Max characters per page
CACHE_TTL          86400                  Page content cache TTL (seconds, 24h)
CACHE_MAX_ENTRIES  512                    Max entries in page cache (increase for servers)
VERSION_CACHE_TTL  86400                  Version registry cache TTL (seconds, 24h)

See .env.example for the complete list.
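As an illustration of how these variables and defaults might be consumed, here is a minimal loader sketch. It is not the project's config.py: the Settings class and load_settings function are hypothetical, only a subset of the variables is covered, and treating an empty value as "use the default" is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping

PROVIDERS = ("openai", "anthropic", "gemini", "ollama")
TRANSPORTS = ("stdio", "streamable-http")


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; defaults copied from the table above."""
    port: int = 8881
    transport: str = "streamable-http"
    searxng_url: str = "http://searxng:8080"
    llm_provider: str = "openai"
    rlm_max_steps: int = 8
    fetch_max_chars: int = 8000
    cache_ttl: int = 86400
    cache_max_entries: int = 512


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, treating unset and empty values alike."""
    d = Settings()

    def text(name: str, fallback: str) -> str:
        return env.get(name, "").strip() or fallback

    def number(name: str, fallback: int) -> int:
        return int(text(name, str(fallback)))

    settings = Settings(
        port=number("PORT", d.port),
        transport=text("TRANSPORT", d.transport),
        searxng_url=text("SEARXNG_URL", d.searxng_url),
        llm_provider=text("LLM_PROVIDER", d.llm_provider),
        rlm_max_steps=number("RLM_MAX_STEPS", d.rlm_max_steps),
        fetch_max_chars=number("FETCH_MAX_CHARS", d.fetch_max_chars),
        cache_ttl=number("CACHE_TTL", d.cache_ttl),
        cache_max_entries=number("CACHE_MAX_ENTRIES", d.cache_max_entries),
    )
    # Fail early on values the table does not list as valid.
    if settings.transport not in TRANSPORTS:
        raise ValueError(f"TRANSPORT must be one of {TRANSPORTS}")
    if settings.llm_provider not in PROVIDERS:
        raise ValueError(f"LLM_PROVIDER must be one of {PROVIDERS}")
    return settings
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in tests without touching the real environment.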

Connecting Your AI

MCP client configuration

Claude Desktop — add to claude_desktop_config.json:

{
  "mcpServers": {
    "grapnel": {
      "url": "http://localhost:8881/mcp",
      "transport": "streamable-http"
    }
  }
}

Cursor — in settings under MCP Servers:

{
  "url": "http://localhost:8881/mcp",
  "transport": "streamable-http"
}

Qwen Code CLI — add to .qwen/settings.json:

{
  "mcpServers": {
    "grapnel": {
      "httpUrl": "http://127.0.0.1:8881/mcp",
      "transport": "streamable-http"
    }
  }
}

Claude Code:

{
  "mcpServers": {
    "grapnel": {
      "url": "http://localhost:8881/mcp",
      "transport": "streamable-http"
    }
  }
}

Claude Code (stdio, without Docker):

export TRANSPORT=stdio
python -m grapnel.server

MCP client configuration:

{
  "mcpServers": {
    "grapnel": {
      "command": "python",
      "args": ["-m", "grapnel.server"]
    }
  }
}

Local development without Docker

python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
docker run -d --name searxng -p 8882:8080 searxng/searxng
export SEARXNG_URL=http://localhost:8882
export OPENAI_API_KEY=sk-...
python -m grapnel.server

Tools Reference

All 10 endpoints your AI can invoke

fetch_page

~2-5s latency, 0 LLM calls

Fetch and extract content from a URL. Supports plain text, markdown, HTML, or JSON output.

Parameter     Type    Default  Description
url           string  -        The URL to fetch
max_chars     int     8000     Max characters to return. Use -1 for full page
format        string  "txt"    Output format: "txt", "markdown", "html", or "json"
section       int     0        Section number (0-based) for paginated reading

Returns: {"url", "content", "section", "sections"}
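The interaction between max_chars and section can be pictured as slicing the extracted text into fixed-size pieces. The sketch below is an assumption about those semantics, not Grapnel's fetcher: the helper name paginate is hypothetical, it assumes "sections" is the total number of sections, and the real tool may split on different boundaries.

```python
def paginate(url: str, text: str, max_chars: int = 8000, section: int = 0) -> dict:
    """Return one slice of `text` in the documented response shape."""
    if max_chars == -1:
        # -1 means "full page": the whole text is a single section.
        chunks = [text]
    elif max_chars > 0:
        chunks = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
        chunks = chunks or [""]  # an empty page still has one (empty) section
    else:
        raise ValueError("max_chars must be positive or -1")

    if not 0 <= section < len(chunks):
        raise IndexError(f"section {section} out of range (0..{len(chunks) - 1})")

    return {
        "url": url,
        "content": chunks[section],
        "section": section,
        "sections": len(chunks),  # assumption: total section count
    }


def read_all(url: str, text: str, max_chars: int = 8000) -> str:
    """Walk every section in order, as a client doing paginated reading might."""
    first = paginate(url, text, max_chars, 0)
    parts = [first["content"]]
    for index in range(1, first["sections"]):
        parts.append(paginate(url, text, max_chars, index)["content"])
    return "".join(parts)
```

Under these assumptions a client reads section 0, inspects "sections", and requests further sections only if the first one did not contain what it needed.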

tech_research

~2-5s latency, 0 LLM calls

Call this before writing code. It researches current library versions, release dates, and documentation URLs for your implementation task, so the assistant does not write code against an outdated API.

Parameter     Type    Default  Description
task          string  -        Implementation task description
libraries     string  ""       Comma-separated library names (auto-detected if empty)

Returns: {"task", "libraries_researched": {...}}

check_version

~0.2s latency, 0 LLM calls

Fast version check via registry APIs: PyPI, npm, Packagist, crates.io, Go proxy, RubyGems, GitHub Releases. No search, no LLM cost. Results cached for 24 hours.

Parameter     Type    Default  Description
package       string  -        Package name (e.g. "fastapi", "facebook/react")
search        bool    false    Set true for SearXNG fallback

Returns: {"package", "latest_version", "release_date", "source_url", "source", "confidence"}
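For one of the registries listed above, a lookup could be sketched as follows. This uses the public PyPI JSON API (https://pypi.org/pypi/<package>/json) and maps its payload onto the documented return fields. The function names are hypothetical, the "confidence" value is a placeholder because its real format is not described here, and Grapnel's registry.py may work differently.

```python
import json
import urllib.request


def parse_pypi_payload(package: str, payload: dict) -> dict:
    """Map a PyPI JSON API response onto the documented return shape."""
    version = payload["info"]["version"]
    files = payload.get("urls") or []  # files of the latest release
    # Keep only the date part of e.g. "2024-05-01T12:00:00.000000Z".
    release_date = files[0]["upload_time_iso_8601"][:10] if files else None
    return {
        "package": package,
        "latest_version": version,
        "release_date": release_date,
        "source_url": f"https://pypi.org/project/{package}/{version}/",
        "source": "pypi",
        "confidence": "high",  # assumption: real value format is not documented
    }


def check_pypi(package: str, timeout: float = 10.0) -> dict:
    """Fetch the latest release metadata for `package` from PyPI."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_pypi_payload(package, json.load(resp))
```

Separating the parsing from the network call keeps the mapping testable offline, and a cache can wrap check_pypi to give the 24-hour behavior described above.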

find_docs

~5-20s latency, 1-3 LLM calls

Searches official documentation, MDN, and DevDocs. Returns structured API info: signature, description, code example, and warnings.

Parameter     Type    Default  Description
topic         string  -        API or function name
library       string  ""       Library or framework name
max_pages     int     3        Max pages (default 3, capped at 5)

Returns: {"topic", "api_signature", "description", "example", "warnings", "sources"}

read_changelog

~5-20s latency, 1-3 LLM calls

Parses CHANGELOG.md, release pages, and release notes into structured version entries with breaking changes, features, deprecations, and fixes.

Parameter     Type    Default  Description
repo          string  -        Repository or project name
from_version  string  ""       Only include versions from this version onward
max_pages     int     3        Max pages (default 3, capped at 5)

Returns: {"repo", "versions": [...]}
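To make the "versions" list and the from_version filter more concrete, here is a deterministic sketch that parses a changelog written in the common "Keep a Changelog" layout. It only illustrates the output idea: the real tool uses LLM extraction, and the entry fields ("version", "changes") and function names below are assumptions.

```python
import re

# Matches headings such as "## [1.2.0] - 2024-01-01" or "## v1.2.0".
HEADING = re.compile(r"^##\s+\[?v?(\d+(?:\.\d+)*)\]?")


def version_key(version: str) -> tuple:
    """Turn "1.10.0" into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def parse_changelog(markdown: str, from_version: str = "") -> list:
    """Collect bullet items per version, grouped by their "###" subheading."""
    versions = []
    current = None  # entry being filled, or None outside a version section
    group = None    # lowercased "###" heading, e.g. "added" or "fixed"
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            match = HEADING.match(line)
            # Non-version sections such as "## [Unreleased]" are skipped.
            current = {"version": match.group(1), "changes": {}} if match else None
            group = None
            if current is not None:
                versions.append(current)
        elif line.startswith("### ") and current is not None:
            group = line[4:].strip().lower()
            current["changes"].setdefault(group, [])
        elif line.startswith(("- ", "* ")) and current is not None and group is not None:
            current["changes"][group].append(line[2:].strip())

    if from_version:
        floor = version_key(from_version)
        versions = [v for v in versions if version_key(v["version"]) >= floor]
    return versions
```

The numeric version_key matters for the filter: comparing version strings directly would sort "1.10.0" before "1.9.0".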

debug_error

~5-20s latency, 1-3 LLM calls

Paste an error message or full stack trace. Searches StackOverflow and GitHub, reads solutions, and returns the root cause and fix.

Parameter     Type    Default  Description
error         string  -        Error message or stack trace
context       string  ""       Environment context
max_pages     int     5        Max pages (default 5, capped at 5)

Returns: {"error", "error_type", "root_cause", "solution", "sources", "pages_read"}

search_and_answer

~5-20s latency, 1-3 LLM calls

Searches the web and reads pages progressively until it has a complete answer. It stops as soon as the answer emerges, which makes it faster and cheaper than the research tool.

Parameter     Type    Default  Description
question      string  -        Question (natural language OK)
max_pages     int     3        Max pages (default 3, capped at 5). Stops early.

Returns: {"question", "answer", "sources": [...], "pages_read"}
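The progressive-reading behavior described above amounts to a loop with an early exit. The sketch below shows that control flow only; the fetch and try_answer callables stand in for page fetching and the LLM call, and all names are hypothetical rather than taken from Grapnel's code.

```python
from typing import Callable, List, Optional

MAX_PAGES_CAP = 5  # documented upper bound for max_pages


def search_and_answer(
    question: str,
    urls: List[str],
    fetch: Callable[[str], str],
    try_answer: Callable[[str, List[str]], Optional[str]],
    max_pages: int = 3,
) -> dict:
    """Read pages one at a time and stop as soon as an answer is found."""
    limit = max(1, min(max_pages, MAX_PAGES_CAP))
    notes: List[str] = []
    sources: List[str] = []
    answer = None
    for url in urls[:limit]:
        notes.append(fetch(url))
        sources.append(url)
        # Ask after every page instead of after reading all of them.
        answer = try_answer(question, notes)
        if answer is not None:
            break  # early stop: remaining pages are never fetched
    return {
        "question": question,
        "answer": answer,
        "sources": sources,
        "pages_read": len(sources),
    }
```

Because the answer check runs after each page, a question answered by the first result costs one fetch and one LLM call, which is where the latency and cost advantage comes from.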

research

~30-90s latency, RLM recursive loop

The flagship tool. Hands the entire problem to a DSPy Recursive Language Model (RLM) with 7 internal tools: search, fetch, link extraction, version checking, doc lookup, changelog parsing, and error debugging. Returns the full reasoning trajectory with sources at each step.

Supports an optional sub-LM: set SUB_LM_MODEL in .env to route internal sub-tasks to a cheaper model.

Runs in a separate thread, so the server stays responsive to other requests.

Parameter     Type    Default  Description
question      string  -        Research question
max_steps     int     8        Max RLM iterations

Returns: {"question", "answer", "sources": {...}, "trajectory": [...], "steps_taken"}
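One common way to keep an async server responsive while a long synchronous job runs is asyncio.to_thread. The sketch below shows that pattern with a stand-in for the RLM call; whether Grapnel uses this exact mechanism is an assumption, and blocking_research is a hypothetical placeholder.

```python
import asyncio
import time


def blocking_research(question: str, max_steps: int = 8) -> dict:
    """Stand-in for the synchronous RLM loop (hypothetical placeholder)."""
    time.sleep(0.2)  # simulates slow, blocking work
    return {
        "question": question,
        "answer": "stub",
        "sources": {},
        "trajectory": [],
        "steps_taken": 1,
    }


async def research(question: str, max_steps: int = 8) -> dict:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_research, question, max_steps)


async def demo() -> tuple:
    """Run research() next to a heartbeat task to show the loop keeps ticking."""
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.02)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    result = await research("example question")
    beat.cancel()
    return result, ticks
```

Running asyncio.run(demo()) returns the stub result together with a non-zero tick count; if the blocking function were called directly inside the coroutine, the heartbeat would not advance until it finished.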

Architecture

Three Docker containers on a shared network:

Service   Image                   Purpose
valkey    valkey/valkey:8-alpine  Redis-compatible cache for SearXNG
searxng   searxng/searxng:latest  Meta-search engine (API-only, no UI)
grapnel   built from Dockerfile   Python MCP server

Enabled engines: Bing, DuckDuckGo, StackOverflow, GitHub, arXiv.
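Since SearXNG runs API-only here, the MCP server talks to it over its JSON search endpoint. The sketch below builds such a request and trims the response; it assumes the JSON output format is enabled in the SearXNG settings, and the helper names are illustrative rather than those in searxng_client.py.

```python
import json
import urllib.parse
import urllib.request
from typing import Iterable, List


def build_search_url(base_url: str, query: str, engines: Iterable[str] = ()) -> str:
    """Compose a SearXNG /search URL that asks for JSON output."""
    params = {"q": query, "format": "json"}
    engine_list = ",".join(engines)
    if engine_list:
        params["engines"] = engine_list  # restrict the query to these engines
    return f"{base_url.rstrip('/')}/search?{urllib.parse.urlencode(params)}"


def parse_results(payload: dict, limit: int = 5) -> List[dict]:
    """Keep only the fields a downstream reader needs from each result."""
    return [
        {
            "title": item.get("title", ""),
            "url": item["url"],
            "snippet": item.get("content", ""),
        }
        for item in payload.get("results", [])[:limit]
    ]


def search(base_url: str, query: str, timeout: float = 10.0) -> List[dict]:
    """Run one search against a SearXNG instance and return trimmed results."""
    with urllib.request.urlopen(build_search_url(base_url, query), timeout=timeout) as resp:
        return parse_results(json.load(resp))
```

Inside the Docker network the base URL would be the SEARXNG_URL default (http://searxng:8080); for local development it would be the port mapped on the host.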

Python package structure

Module             Responsibility
server.py          FastMCP entry point, tool registration, AI instructions
tools.py           10 MCP tool definitions + DSPy extraction signatures
rlm_agent.py       DSPy RLM agent with 7 internal tools
searxng_client.py  SearXNG HTTP client (async + sync)
fetcher.py         Page fetching + trafilatura extraction + BeautifulSoup parsing
registry.py        Package registry API clients (PyPI, npm, crates.io, etc.)
cache.py           In-memory TTLCache + persistent file cache
config.py          Environment variable loader
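The in-memory side of cache.py combines a time-to-live with a size bound (CACHE_TTL and CACHE_MAX_ENTRIES). The sketch below shows one way such a cache can behave using only the standard library; SimpleTTLCache is a hypothetical class, not the project's implementation, and evicting the oldest insertion first is an assumption.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional


class SimpleTTLCache:
    """Bounded cache whose entries expire `ttl` seconds after being set."""

    def __init__(
        self,
        max_entries: int = 512,
        ttl: float = 86400.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_entries = max_entries
        self._ttl = ttl
        self._clock = clock  # injectable so tests can control time
        self._data: "OrderedDict[Hashable, tuple]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._data[key]  # drop stale entries lazily on read
            return None
        return value

    def set(self, key: Hashable, value: Any) -> None:
        self._data.pop(key, None)  # re-inserting moves the key to the end
        self._data[key] = (self._clock() + self._ttl, value)
        while len(self._data) > self._max_entries:
            self._data.popitem(last=False)  # evict the oldest insertion

    def __len__(self) -> int:
        return len(self._data)
```

With the documented defaults this keeps at most 512 pages for up to 24 hours; raising CACHE_MAX_ENTRIES trades memory for fewer repeat fetches.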

RLM internal tools

The research tool creates a DSPy RLM with 7 tools in a sandboxed Python REPL:

  1. searxng_search_sync — web search
  2. fetch_page_sync — page content
  3. extract_links_sync — link discovery
  4. _rlm_check_version — version checking
  5. _rlm_find_docs — API documentation
  6. _rlm_read_changelog — changelog parsing
  7. _rlm_debug_error — error debugging
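The relationship between these tools, the max_steps limit, and the returned trajectory can be pictured as a bounded dispatch loop. The sketch below is deliberately simplified and is not DSPy's API: in the real RLM the model writes Python in the sandboxed REPL, whereas here a hypothetical policy callable picks the next tool by name.

```python
from typing import Callable, Dict, List, Optional, Tuple

Tool = Callable[[str], str]
# A policy returns (tool_name, argument), or ("finish", final_answer).
Policy = Callable[[str, List[dict]], Tuple[str, str]]


def run_loop(
    question: str,
    tools: Dict[str, Tool],
    policy: Policy,
    max_steps: int = 8,
) -> dict:
    """Call tools until the policy finishes or max_steps is reached."""
    trajectory: List[dict] = []
    answer: Optional[str] = None
    for _ in range(max_steps):
        action, argument = policy(question, trajectory)
        if action == "finish":
            answer = argument
            break
        if action in tools:
            observation = tools[action](argument)
        else:
            # Report the mistake back instead of aborting the whole run.
            observation = f"unknown tool: {action}"
        trajectory.append(
            {"tool": action, "input": argument, "observation": observation}
        )
    return {
        "question": question,
        "answer": answer,
        "trajectory": trajectory,
        "steps_taken": len(trajectory),
    }
```

The step limit is what bounds cost: if the policy never finishes, the loop still returns after max_steps with the partial trajectory and no answer.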

Research Papers

Grapnel's architecture is motivated by two publications:

Recursive Language Models (arXiv:2512.24601)

Shows LLMs can process inputs orders of magnitude beyond context windows by recursively examining information through external tools. The research tool implements this directly.

Is Grep All You Need? (arXiv:2605.15184)

Argues that simple keyword-based retrieval outperforms vector search in agentic contexts, which supports Grapnel's use of SearXNG keyword search instead of a vector database.