Personal RAG Knowledge Base
A production-ready personal RAG system over 5,000+ knowledge documents — semantic search across notes, meetings, conversations, and tickets, accessible from Claude Desktop, Claude.ai web, and iOS app via the Model Context Protocol.
At a glance
- 5,000+ sources / 47,000+ chunks indexed (mixed Vietnamese + English content)
- Multilingual semantic search — top-hit similarity score 0.85+ on real queries
- End-to-end p95 latency: 1.16s (embed query → ANN search → tunnel)
- MCP Streamable HTTP server with OAuth 2.0 (PKCE + DCR) + legacy bearer fallback
- Persistent HTTPS endpoint via Cloudflare named tunnel
- Auto-ingest pipeline — files written to a local KB folder are automatically chunked, embedded, and indexed via idempotent SHA-256 hash check
- Weekly backup to OCI Object Storage via write-only Pre-authenticated Request (zero credentials on the serving VM)
- Filesystem-first — markdown files are the canonical source of truth; the vector DB is a rebuildable derived index
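The idempotent SHA-256 check in the auto-ingest bullet can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the `indexed` mapping (relative path to last-ingested hash) and both function names are hypothetical, and the real system may keep these hashes in the database instead.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes, read in blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def files_to_ingest(kb_dir: Path, indexed: dict[str, str]) -> list[Path]:
    """List markdown files whose content hash differs from the stored one.

    `indexed` is a hypothetical mapping of relative path -> hash recorded
    at the last successful ingest. New and modified files are returned;
    unchanged files are skipped, so re-running the scan is a no-op.
    """
    pending = []
    for path in sorted(kb_dir.rglob("*.md")):
        rel = path.relative_to(kb_dir).as_posix()
        if indexed.get(rel) != file_sha256(path):
            pending.append(path)
    return pending
```

Because the decision depends only on file content, the same check also supports the filesystem-first design: with an empty `indexed` mapping, every markdown file is returned and the vector index can be rebuilt from scratch.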
Stack
Python 3.10 · MCP SDK 1.27 · FastMCP · Starlette + uvicorn · sentence-transformers · multilingual-e5-small · Oracle ADB 23ai (VECTOR type + HNSW) · Cloudflare Tunnel · OCI Object Storage · systemd · launchd
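One detail worth showing for the embedding model in this stack: E5-family models expect a `query: ` prefix on search queries and a `passage: ` prefix on indexed text. The sketch below assumes the Hugging Face model ID `intfloat/multilingual-e5-small` and uses hypothetical helper names; how this project actually wires its embedding step is not shown in this README.

```python
from functools import lru_cache

QUERY_PREFIX = "query: "
PASSAGE_PREFIX = "passage: "


def as_query(text: str) -> str:
    """Prefix a search query the way E5 models expect."""
    return QUERY_PREFIX + text.strip()


def as_passages(chunks: list[str]) -> list[str]:
    """Prefix document chunks the way E5 models expect."""
    return [PASSAGE_PREFIX + chunk.strip() for chunk in chunks]


@lru_cache(maxsize=1)
def load_model():
    """Load the model once per process (assumed model ID)."""
    # Imported lazily so the prefix helpers work without the dependency.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer("intfloat/multilingual-e5-small")


def embed(texts: list[str]):
    """Return L2-normalized embeddings for already-prefixed texts."""
    return load_model().encode(texts, normalize_embeddings=True)
```

With normalized vectors, cosine similarity reduces to a dot product, for example `embed([as_query("ghi chú cuộc họp")])` against `embed(as_passages(chunks))`.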
Documentation
| Doc | Read this for |
|---|---|
| PRD | What & why — problem framing, goals, scope, milestones, success metrics |
| Architecture | System diagrams, data flows, component responsibilities, failure modes |
| Implementation | Tech stack, code structure, schema, performance numbers, security model, reproducibility steps |
| Notes | Chronological decision log + gotchas + working-session hours |
Quickstart for clients
Claude Desktop (macOS), in `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "personal-rag-kb": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://<your-host>/mcp",
               "--header", "Authorization:${AUTH_HEADER}"],
      "env": { "AUTH_HEADER": "Bearer <your-token>" }
    }
  }
}
```
Claude.ai web / iOS app: Settings → Connectors → Add Custom Connector → enter your MCP URL → complete OAuth login.
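For context on the OAuth login step, the PKCE part of the flow works roughly like this: the client generates a random verifier, sends its SHA-256 challenge with the authorization request, and later proves possession by sending the verifier itself. The sketch below follows the S256 method from RFC 7636; the function names are hypothetical and this is not the server's actual code.

```python
import base64
import hashlib
import hmac
import secrets


def challenge_for(verifier: str) -> str:
    """Compute the S256 code challenge: BASE64URL(SHA256(verifier)), unpadded."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Client side: create a (code_verifier, code_challenge) pair."""
    # 64 random bytes encode to 86 URL-safe characters, inside the
    # 43-128 character range that RFC 7636 allows for the verifier.
    verifier = secrets.token_urlsafe(64)
    return verifier, challenge_for(verifier)


def verify_pkce(verifier: str, stored_challenge: str) -> bool:
    """Server side: check the verifier sent with the token request."""
    return hmac.compare_digest(challenge_for(verifier), stored_challenge)
```

The constant-time comparison is a defensive choice for the sketch; the key property is that an intercepted authorization code is useless without the original verifier.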
Project status
| Day | Milestone |
|---|---|
| 1 | MCP server scaffold + Cloudflare tunnel + bearer auth |
| 2 | kb_health / kb_ingest / kb_search / kb_stats tools + ADB schema |
| 3 | Bulk migrate 5,000+ sources / 45K chunks |
| 3b | Multilingual upgrade: BGE-en → multilingual-e5-small |
| 4-5 | Sync workflow refactor — 4 source types auto-ingest |
| 6 | Persistent named tunnel on a custom domain |
| 7 | Weekly backup + disaster recovery script |
| 8 | OAuth 2.0 — Claude.ai web + iOS app access |
Total build time: ~22 hours over 2 days.
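As a rough illustration of what the `kb_search` tool from Day 2 might run against the ADB schema, here is a sketch of an approximate nearest-neighbor query. Everything specific in it is an assumption: the table `kb_chunks`, its columns, the function name, and the choice of cosine distance are hypothetical, and the SQL reflects Oracle 23ai vector search syntax as generally documented, so it should be checked against the actual schema before use.

```python
import array

# Hypothetical table and column names; the real schema is not shown here.
SEARCH_SQL = """
    SELECT source_path,
           chunk_text,
           VECTOR_DISTANCE(embedding, :qvec, COSINE) AS distance
    FROM kb_chunks
    ORDER BY VECTOR_DISTANCE(embedding, :qvec, COSINE)
    FETCH APPROX FIRST :top_k ROWS ONLY
"""


def search_chunks(cursor, query_vector: list[float], top_k: int = 5) -> list[dict]:
    """Run an approximate top-k search and return hits with similarity scores.

    `cursor` is any DB-API style cursor (for example from python-oracledb).
    Similarity is reported as 1 - cosine distance, which is an assumption
    about how scores are presented.
    """
    # python-oracledb binds float32 vectors as array.array("f", ...).
    qvec = array.array("f", query_vector)
    cursor.execute(SEARCH_SQL, {"qvec": qvec, "top_k": top_k})
    return [
        {"source": source, "text": text, "similarity": 1.0 - distance}
        for source, text, distance in cursor.fetchall()
    ]
```

The `APPROX` keyword asks the database to use the vector index (HNSW in this stack) where one exists instead of scanning every row.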
Foundation for downstream projects
This RAG infrastructure is designed as a shared foundation for multiple downstream AI side-projects: recipe extractor, research agent, support bot, meeting summarizer, finance advisor, health coach. Each can call kb_ingest / kb_search rather than building its own vector pipeline.
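A downstream project could reach the knowledge base with the MCP Python SDK's Streamable HTTP client, along these lines. This is a sketch under assumptions: the argument names `query` and `top_k` are hypothetical (the real `kb_search` input schema is not documented here), and it uses the legacy bearer token rather than the OAuth flow.

```python
import asyncio


def build_search_args(query: str, top_k: int = 5) -> dict:
    """Build the tool arguments (hypothetical field names)."""
    if not query.strip():
        raise ValueError("query must not be empty")
    return {"query": query.strip(), "top_k": top_k}


async def search_kb(url: str, token: str, query: str, top_k: int = 5):
    """Call the kb_search tool on a remote MCP server and return its result."""
    # Imported lazily so the argument helper can be used without the SDK.
    from mcp import ClientSession
    from mcp.client.streamable_http import streamablehttp_client

    headers = {"Authorization": f"Bearer {token}"}
    async with streamablehttp_client(url, headers=headers) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            return await session.call_tool(
                "kb_search", build_search_args(query, top_k)
            )


if __name__ == "__main__":
    # Placeholders mirror the Quickstart config; fill in your own values.
    result = asyncio.run(
        search_kb("https://<your-host>/mcp", "<your-token>", "meeting notes")
    )
    print(result)
```

Keeping the call behind one small function means a project such as the meeting summarizer only depends on the tool name and its arguments, not on the embedding model or the database.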