Grapnel - Documentation

For detailed technical reference - all parameters, progressive reading, cache behavior, configuration - see TECHNICAL_DOCS.md

Getting Started

Deploy in under 5 minutes.

You need Docker and either an API key for a hosted LLM provider (OpenAI, Anthropic, or Gemini) or a local Ollama instance.

1
Clone the repository
```
git clone <repo-url>
cd grapnel
cp .env.example .env
```

2
Configure your API key
Edit .env and set your LLM provider and key:
```
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...
```
3
Start the server
```
docker compose up -d --build
```
This launches three containers: Valkey (cache), SearXNG (search), and Grapnel (MCP server).
4
Connect your AI assistant
Your server is now available at http://localhost:8881/mcp via streamable-http.

Configuration

All configuration is managed through a single .env file. Copy .env.example and adjust as needed.

Variable	Default	Description
PORT	8881	MCP server port
TRANSPORT	streamable-http	MCP transport (stdio or streamable-http)
SEARXNG_URL	http://searxng:8080	SearXNG URL (Docker internal)
LLM_PROVIDER	openai	openai, anthropic, gemini, or ollama
OpenAI
OPENAI_MODEL	gpt-5.5	OpenAI model name
OpenAI-compatible (local / third-party)
OPENAI_API_BASE	(empty)	API base for DeepSeek, llama.cpp, vLLM, etc.
OPENAI_API_KEY	(empty)	API key for the endpoint
OPENAI_MODEL	(empty)	Model name for the endpoint
ANTHROPIC_MODEL	claude-sonnet-4-6	Anthropic model name
GEMINI_MODEL	gemini/gemini-3.1-pro	Gemini model name
OLLAMA_MODEL	llama3.2	Ollama model name
SUB_LM_MODEL	(empty)	Cheaper model for RLM sub-tasks (e.g. gpt-4o-mini)
SUB_LM_PROVIDER	(empty)	Override provider for Sub-LM; falls back to LLM_PROVIDER
SUB_LM_API_KEY	(empty)	Override API key for Sub-LM
SUB_LM_API_BASE	(empty)	Override base URL for Sub-LM
RLM_MAX_STEPS	8	Max iterations per research query
FETCH_MAX_CHARS	8000	Max characters per page
CACHE_TTL	86400	Page content cache TTL (seconds, 24h)
CACHE_MAX_ENTRIES	512	Max entries in page cache (increase for servers)
VERSION_CACHE_TTL	86400	Version registry cache TTL (seconds, 24h)
STABLE_MODE	true	When true, only exposes production-ready tools; set false for experimental tools

See .env.example for the complete list.
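To illustrate how the defaults in the table combine, here is a minimal sketch of an environment loader. It is not Grapnel's actual config.py: the `Settings` class, the `load_settings` function, and the rule for parsing STABLE_MODE are assumptions made for this example. Only the variable names, the defaults, and the Sub-LM fallback come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for a subset of the documented variables."""
    port: int
    transport: str
    llm_provider: str
    sub_lm_provider: str
    rlm_max_steps: int
    fetch_max_chars: int
    cache_ttl: int
    stable_mode: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from an environment mapping, using the documented defaults."""
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "openai")
    return Settings(
        port=int(env.get("PORT", "8881")),
        transport=env.get("TRANSPORT", "streamable-http"),
        llm_provider=provider,
        # An empty or unset SUB_LM_PROVIDER falls back to LLM_PROVIDER.
        sub_lm_provider=env.get("SUB_LM_PROVIDER") or provider,
        rlm_max_steps=int(env.get("RLM_MAX_STEPS", "8")),
        fetch_max_chars=int(env.get("FETCH_MAX_CHARS", "8000")),
        cache_ttl=int(env.get("CACHE_TTL", "86400")),
        # Assumption: only the string "true" (any case) enables stable mode.
        stable_mode=env.get("STABLE_MODE", "true").strip().lower() == "true",
    )


if __name__ == "__main__":
    print(load_settings({"LLM_PROVIDER": "anthropic"}))
```

Passing a plain dict instead of reading os.environ directly keeps the sketch easy to test.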

Connecting Your AI

MCP client configuration

Claude Desktop — add to claude_desktop_config.json:

{
  "mcpServers": {
    "grapnel": {
      "url": "http://localhost:8881/mcp",
      "transport": "streamable-http"
    }
  }
}

Cursor — in settings under MCP Servers:

{
  "url": "http://localhost:8881/mcp",
  "transport": "streamable-http"
}

Qwen Code CLI — add to .qwen/settings.json:

{
  "mcpServers": {
    "grapnel": {
      "httpUrl": "http://127.0.0.1:8881/mcp",
      "transport": "streamable-http"
    }
  }
}

Claude Code:

{
  "mcpServers": {
    "grapnel": {
      "url": "http://localhost:8881/mcp",
      "transport": "streamable-http"
    }
  }
}

Claude Code (stdio, without Docker):

export TRANSPORT=stdio
python -m grapnel.server

{
  "mcpServers": {
    "grapnel": {
      "command": "python",
      "args": ["-m", "grapnel.server"],
      "env": { "TRANSPORT": "stdio" }
    }
  }
}

Local development without Docker

python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
docker run -d --name searxng -p 8882:8080 searxng/searxng
export SEARXNG_URL=http://localhost:8882
export OPENAI_API_KEY=sk-...
python -m grapnel.server

Tools Reference

All 10 endpoints your AI can invoke

web_search

~1s latency, 0 LLM calls

Quick keyword search via SearXNG meta-search engine. Returns titles, URLs, and snippets. No content fetching.

Parameter	Type	Default	Description
query	string	-	Search query
max_results	int	10	Maximum results to return
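For orientation, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below only builds that message for web_search. The `build_tool_call` helper is hypothetical, and in practice an MCP client library performs the initialize handshake and sends the request for you.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, name: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # Tool name and arguments mirror the parameter table above.
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


payload = build_tool_call(1, "web_search", {"query": "fastapi lifespan events", "max_results": 5})
print(payload)
```

The same message shape applies to every tool in this reference; only `name` and `arguments` change.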

fetch_page

~2-5s latency, 0 LLM calls

Fetch and extract content from a URL. Supports plain text, markdown, HTML, or JSON output.

Parameter	Type	Default	Description
url	string	-	The URL to fetch
max_chars	int	8000	Max characters to return. Use -1 for full page
format	string	"txt"	Output format: "txt", "markdown", "html", or "json"
section	int	1	Section number (1-based) for paginated reading

Returns: {"url", "content", "section_size", "current_section", "total_sections"}
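The interaction between max_chars and section can be pictured with a small sketch. It assumes fixed-size character slices and treats section_size as the slice length; the real fetcher may split differently (for example on paragraph boundaries), and the `paginate` helper is hypothetical.

```python
import math
from typing import Any, Dict


def paginate(url: str, text: str, max_chars: int = 8000, section: int = 1) -> Dict[str, Any]:
    """Return one slice of `text` plus paging metadata (illustrative only)."""
    if max_chars != -1 and max_chars <= 0:
        raise ValueError("max_chars must be positive or -1")
    if max_chars == -1 or len(text) <= max_chars:
        # -1 means "full page": the whole text is a single section.
        return {"url": url, "content": text, "section_size": len(text),
                "current_section": 1, "total_sections": 1}
    total = math.ceil(len(text) / max_chars)
    # Assumption: out-of-range section numbers are clamped rather than rejected.
    current = min(max(section, 1), total)
    start = (current - 1) * max_chars
    return {"url": url, "content": text[start:start + max_chars],
            "section_size": max_chars, "current_section": current,
            "total_sections": total}


if __name__ == "__main__":
    page = paginate("https://example.com/doc", "abcdefghij", max_chars=4, section=2)
    print(page["content"], page["current_section"], page["total_sections"])
```

A client reads a long page by calling with section=1, checking total_sections, and requesting the following sections only if it still needs more content.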

extract_links

~2-5s latency, 0 LLM calls

Extract all hyperlinks from a page. Optionally filter to the same domain for documentation navigation. Relative and protocol-relative URLs are resolved automatically.

Parameter	Type	Default	Description
url	string	-	The URL to scan
same_domain	bool	true	Only return links to the same domain
max_links	int	50	Max links to return. Increase for comprehensive extraction

Returns: {"url", "links": [...], "count", "total"}
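As a rough illustration of how relative and protocol-relative URLs can be resolved and filtered, the sketch below uses only the Python standard library. It is not Grapnel's fetcher (which the architecture section says uses BeautifulSoup). The `extract_links_from_html` name and the reading of count (links returned) versus total (links found before truncation) are assumptions.

```python
from html.parser import HTMLParser
from typing import Any, Dict, List
from urllib.parse import urljoin, urlparse


class _HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links_from_html(base_url: str, html: str, same_domain: bool = True,
                            max_links: int = 50) -> Dict[str, Any]:
    collector = _HrefCollector()
    collector.feed(html)
    base_host = urlparse(base_url).netloc
    seen, links = set(), []
    for href in collector.hrefs:
        # urljoin resolves "page.html", "../x" and "//host/x" against the base URL.
        absolute = urljoin(base_url, href)
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar schemes
        if same_domain and parsed.netloc != base_host:
            continue
        if absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    returned = links[:max_links]
    return {"url": base_url, "links": returned, "count": len(returned), "total": len(links)}
```

With same_domain left at its default, a documentation index page yields only links that stay on the documentation host, which is the navigation use case described above.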

tech_research

~2-5s latency, 0 LLM calls, experimental

Call this before writing code. Researches current library versions, release dates, and documentation URLs for your implementation task. Prevents stale-API syndrome.

Parameter	Type	Default	Description
task	string	-	Implementation task description
libraries	string	""	Comma-separated library names (auto-detected if empty)

Returns: {"task", "libraries_researched": {...}}

check_version

~0.2s latency, 0 LLM calls

Fast version check via registry APIs: PyPI, npm, Packagist, crates.io, Go proxy, RubyGems, GitHub Releases. No search, no LLM cost. Results cached for 24 hours.

Parameter	Type	Default	Description
package	string	-	Package name (e.g. "fastapi", "facebook/react")
search	bool	false	Set true for SearXNG fallback

Returns: {"package", "latest_version", "release_date", "source_url", "source", "confidence"}
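To make the return shape concrete, here is a sketch that maps a PyPI JSON API response (served at https://pypi.org/pypi/<package>/json) onto the fields above. The `parse_pypi_payload` helper, the "pypi" source label, and the "high" confidence value are assumptions for illustration; Grapnel's registry.py may derive these differently. The sample payload is made up and heavily trimmed.

```python
from typing import Any, Dict


def parse_pypi_payload(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Map a PyPI JSON API response onto the check_version result fields."""
    info = payload["info"]
    files = payload.get("urls") or []
    # Each uploaded file of the latest release carries an ISO 8601 upload time.
    upload_times = [f["upload_time_iso_8601"] for f in files if f.get("upload_time_iso_8601")]
    return {
        "package": info["name"],
        "latest_version": info["version"],
        # Use the earliest upload of the release as its date (YYYY-MM-DD).
        "release_date": min(upload_times)[:10] if upload_times else None,
        "source_url": f"https://pypi.org/project/{info['name']}/",
        "source": "pypi",      # assumed label
        "confidence": "high",  # assumed value for registry-backed results
    }


# Made-up, trimmed payload for a fictional package.
sample = {
    "info": {"name": "examplepkg", "version": "1.2.3"},
    "urls": [
        {"upload_time_iso_8601": "2024-05-02T10:15:00.000000Z"},
        {"upload_time_iso_8601": "2024-05-02T10:14:30.000000Z"},
    ],
}
result = parse_pypi_payload(sample)
print(result)
```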

find_docs

~5-20s latency, 1-3 LLM calls

Searches official documentation, MDN, and DevDocs. Returns structured API info: signature, description, code example, and warnings.

Parameter	Type	Default	Description
topic	string	-	API or function name
library	string	""	Library or framework name
max_pages	int	3	Max documentation pages to read

Returns: {"topic", "api_signature", "description", "example", "warnings", "sources"}

read_changelog

~5-20s latency, 1-3 LLM calls, experimental

Parses CHANGELOG.md, release pages, and release notes into structured version entries with breaking changes, features, deprecations, and fixes.

Parameter	Type	Default	Description
repo	string	-	Repository or project name
from_version	string	""	Only include versions from this version onward
max_pages	int	3	Max changelog or release pages to read

Returns: {"repo", "versions": [...]}
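The effect of from_version can be sketched as a simple filter over parsed entries. The entry shape (a dict with a "version" key) and both helper names are assumptions, since the schema of the versions list is not documented here. Pre-release suffixes such as "rc1" are ignored by this sketch, so a production implementation would likely use a proper version parser.

```python
import re
from typing import Dict, List, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '2.10.1' or 'v2.10.1' into (2, 10, 1); trailing zeros are dropped."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", version.strip())
    if not match:
        raise ValueError(f"unrecognized version: {version!r}")
    parts = [int(p) for p in match.group(1).split(".")]
    # Drop trailing zeros so that "2.0" and "2.0.0" compare as equal.
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def filter_from(entries: List[Dict[str, str]], from_version: str = "") -> List[Dict[str, str]]:
    """Keep entries at or above from_version; an empty string keeps everything."""
    if not from_version:
        return entries
    floor = version_key(from_version)
    return [e for e in entries if version_key(e["version"]) >= floor]


entries = [{"version": "1.4"}, {"version": "2.9.0"}, {"version": "2.10.1"}, {"version": "v3.0"}]
print([e["version"] for e in filter_from(entries, "2.10")])
```

Comparing integer tuples rather than raw strings matters here: as strings, "2.9.0" would sort after "2.10.1".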

debug_error

~5-20s latency, 1-3 LLM calls, experimental

Paste an error message or full stack trace. Searches StackOverflow and GitHub, reads solutions, and returns the root cause and fix.

Parameter	Type	Default	Description
error	string	-	Error message or stack trace
context	string	""	Environment context
max_pages	int	5	Max solution pages to read

Returns: {"error", "error_type", "root_cause", "solution", "sources", "pages_read"}

search_and_answer

~5-20s latency, 1-3 LLM calls

Searches the web and progressively reads pages until a complete answer is found. Stops as soon as the answer emerges - faster and cheaper than deep research.

Parameter	Type	Default	Description
question	string	-	Question (natural language OK)
max_pages	int	8	Max pages to read; stops early once an answer is found

Returns: {"question", "answer", "sources": [...], "pages_read"}
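The progressive-reading behavior described above can be sketched as a loop that reads one page at a time and stops as soon as an answer is available. The `fetch` and `try_answer` callables are hypothetical stand-ins for page fetching and the LLM extraction step; this is an illustration of the control flow, not the actual implementation.

```python
from typing import Callable, Dict, List, Optional


def progressive_answer(
    question: str,
    urls: List[str],
    fetch: Callable[[str], str],
    try_answer: Callable[[str, List[str]], Optional[str]],
    max_pages: int = 8,
) -> Dict[str, object]:
    """Read pages one at a time and stop at the first complete answer."""
    notes: List[str] = []
    sources: List[str] = []
    answer: Optional[str] = None
    for url in urls[:max_pages]:
        notes.append(fetch(url))
        sources.append(url)
        # try_answer returns None while the collected notes are insufficient.
        answer = try_answer(question, notes)
        if answer is not None:
            break  # early stop: remaining pages are never fetched
    return {"question": question, "answer": answer,
            "sources": sources, "pages_read": len(sources)}


# Fake pages and a fake "extractor" so the sketch runs without network or LLM.
pages = {"u1": "unrelated text", "u2": "the default port is 8881", "u3": "never read"}
result = progressive_answer(
    "What is the default port?",
    ["u1", "u2", "u3"],
    fetch=pages.__getitem__,
    try_answer=lambda q, notes: "8881" if any("8881" in n for n in notes) else None,
)
print(result)
```

Because the loop breaks after the second page, the third URL is never fetched, which is where the latency and cost savings over the research tool come from.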

research

~30-90s latency, RLM recursive loop, experimental

The flagship tool. Hands the entire problem to a DSPy Recursive Language Model with 7 internal tools: search, fetch, link extraction, version checking, doc lookup, changelog parsing, and error debugging. Returns the full reasoning trajectory with sources at each step.

Supports an optional sub-LM: set SUB_LM_MODEL in .env to route internal sub-tasks to a cheaper model.

Runs in a separate thread; the server stays responsive to other requests.

Parameter	Type	Default	Description
question	string	-	Research question
max_steps	int	8	Max RLM iterations

Returns: {"question", "answer", "sources": {...}, "trajectory": [...], "steps_taken"}
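The note about running in a separate thread can be illustrated with asyncio. This is a sketch of one common way to keep an event loop responsive while a blocking call runs; whether Grapnel uses `asyncio.to_thread` specifically is an assumption, and `blocking_research` is a stand-in that just sleeps.

```python
import asyncio
import time
from typing import Dict, List


def blocking_research(question: str, log: List[str]) -> Dict[str, object]:
    """Stand-in for a slow, synchronous research loop."""
    time.sleep(0.3)
    log.append("research finished")
    return {"question": question, "steps_taken": 1}


async def handle_requests() -> List[str]:
    log: List[str] = []
    # Run the blocking call in a worker thread so the event loop stays free.
    research = asyncio.create_task(
        asyncio.to_thread(blocking_research, "example question", log)
    )
    # Meanwhile the loop can serve another (simulated) request.
    await asyncio.sleep(0.05)
    log.append("other request served")
    await research
    return log


log = asyncio.run(handle_requests())
print(log)
```

The other request is logged before the research call finishes, which would not happen if the blocking function ran directly on the event loop.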

Architecture

Three Docker containers on a shared network:

Service	Image	Purpose
valkey	valkey/valkey:8-alpine	Redis-compatible cache for SearXNG
searxng	searxng/searxng:latest	Meta-search engine (API-only, no UI)
grapnel	built from Dockerfile	Python MCP server

Enabled engines: Bing, DuckDuckGo, StackOverflow, GitHub, arXiv.

Python package structure

Module	Responsibility
server.py	FastMCP entry point, tool registration, AI instructions
tools.py	10 MCP tool definitions + DSPy extraction signatures
rlm_agent.py	DSPy RLM agent with 7 internal tools
searxng_client.py	SearXNG HTTP client (async + sync)
fetcher.py	Page fetching + trafilatura extraction + BeautifulSoup parsing
registry.py	Package registry API clients (PyPI, npm, crates.io, etc.)
cache.py	In-memory TTLCache + persistent file cache
config.py	Environment variable loader
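As an illustration of the in-memory side of cache.py, the sketch below combines a time-to-live with a maximum entry count, corresponding to CACHE_TTL and CACHE_MAX_ENTRIES. The class is written for this example: the oldest-insertion eviction policy and the injectable clock are assumptions, and the persistent file cache is not covered.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TTLCache:
    """Small in-memory cache with expiry and a size cap (illustrative)."""

    def __init__(self, ttl: float = 86400, max_entries: int = 512,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._max = max_entries
        self._clock = clock
        self._data: OrderedDict = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._data[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._data.pop(key, None)  # re-inserting moves the key to the end
        self._data[key] = (self._clock() + self._ttl, value)
        while len(self._data) > self._max:
            self._data.popitem(last=False)  # evict the oldest insertion


if __name__ == "__main__":
    cache = TTLCache(ttl=60, max_entries=2)
    cache.set("https://example.com", "page text")
    print(cache.get("https://example.com"))
```

Injecting the clock makes expiry testable without sleeping.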

RLM internal tools

The research tool creates a DSPy RLM with 7 tools in a sandboxed Python REPL:

searxng_search_sync — web search
fetch_page_sync — page content
extract_links_sync — link discovery
_rlm_check_version — version checking
_rlm_find_docs — API documentation
_rlm_read_changelog — changelog parsing
_rlm_debug_error — error debugging

Research Papers

Grapnel's architecture is motivated by two publications:

Recursive Language Models (arXiv:2512.24601)

Shows LLMs can process inputs orders of magnitude beyond context windows by recursively examining information through external tools. The research tool implements this directly.

Is Grep All You Need? (arXiv:2605.15184)

Reports that simple keyword-based retrieval outperforms vector search in agentic contexts. This supports the choice of SearXNG keyword search over a vector database.