Data Sources
What data goes into Brain MCP and how it gets there.
Supported Sources
Claude Code
AutomaticClaude Code project conversations (auto-detected)
Claude Desktop
AutomaticClaude Desktop chat history (auto-detected)
Cursor
AutomaticCursor AI conversations (auto-detected)
Windsurf
AutomaticWindsurf AI conversations (auto-detected)
Gemini CLI
AutomaticGemini CLI conversations (auto-detected)
ChatGPT Exports
On importExported ChatGPT conversations (JSON)
Clawdbot Sessions
On importAI agent conversations via Clawdbot gateway
Generic JSONL
On importCustom sources via JSONL format
Data Flow
Sources (Claude Code, Claude Desktop, Cursor, Windsurf, Gemini CLI, ChatGPT exports)
│
▼
brain-mcp sync (automatic)
│
▼
conversations.parquet (unified format)
│
▼
local embedding (nomic-embed-text-v1.5)
│
▼
vectors/brain.lance (semantic index)
│
▼
brain-mcp serve → 25 MCP toolsWhat Gets Indexed
- • User messages — your actual words and questions
- • Decisions — extracted from conversation context
- • Open questions — unresolved threads and queries
- • Domain tags — auto-classified into 25 domains
Only user messages are embedded (not assistant responses). Messages under 10 characters are skipped as noise.