
Personal RAG Knowledge Base

A production-ready personal RAG system over 5,000+ knowledge documents — semantic search across notes, meetings, conversations, and tickets, accessible from Claude Desktop, Claude.ai web, and iOS app via the Model Context Protocol.

M1 done · P0 · Size S · Foundation
  • 5,329 sources: notes · meetings · tickets
  • 47K+ chunks: VECTOR(384), HNSW
  • 1.16s p95 latency: embed → ANN → tunnel
  • 0.85+ top-hit score: multilingual e5
  • ~22h build time: over 2 days
Python 3.10 · MCP SDK · FastMCP · Starlette · sentence-transformers · multilingual-e5-small · Oracle ADB 23ai · VECTOR(384) HNSW · Cloudflare Tunnel · OAuth 2.0 (PKCE + DCR) · OCI Object Storage


At a glance

  • 5,000+ sources / 47,000+ chunks indexed (mixed Vietnamese + English content)
  • Multilingual semantic search — top-hit similarity score 0.85+ on real queries
  • End-to-end p95 latency: 1.16s (embed query → ANN search → tunnel)
  • MCP Streamable HTTP server with OAuth 2.0 (PKCE + DCR) + legacy bearer fallback
  • Persistent HTTPS endpoint via Cloudflare named tunnel
  • Auto-ingest pipeline: files written to a local KB folder are automatically chunked, embedded, and indexed; a SHA-256 content-hash check keeps re-ingestion idempotent
  • Weekly backup to OCI Object Storage via write-only Pre-authenticated Request (zero credentials on the serving VM)
  • Filesystem-first — markdown files are the canonical source of truth; the vector DB is a rebuildable derived index
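The idempotent hash check in the auto-ingest bullet can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names and the `indexed` mapping (relative path to the hash stored at the last ingest) are hypothetical, and in the real system that lookup would presumably hit the database.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 64 KiB blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def files_needing_ingest(kb_dir: Path, indexed: dict[str, str]) -> list[Path]:
    """List markdown files whose content hash differs from the indexed one.

    `indexed` is a hypothetical mapping of relative path -> stored hash.
    New files (no entry) and edited files (different hash) are returned;
    unchanged files are skipped, so re-running the scan is a no-op.
    """
    pending = []
    for path in sorted(kb_dir.rglob("*.md")):
        key = path.relative_to(kb_dir).as_posix()
        if indexed.get(key) != file_sha256(path):
            pending.append(path)
    return pending
```

Because the markdown files are the source of truth, calling `files_needing_ingest(kb_dir, {})` with an empty mapping returns every file, which is one way a full rebuild of the derived index could be driven.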

Stack

Python 3.10 · MCP SDK 1.27 · FastMCP · Starlette + uvicorn · sentence-transformers · multilingual-e5-small · Oracle ADB 23ai (VECTOR type + HNSW) · Cloudflare Tunnel · OCI Object Storage · systemd · launchd
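For readers unfamiliar with the Oracle 23ai vector features named in the stack, the sketch below shows what a VECTOR(384) column, an HNSW index, and an approximate top-k query can look like. The table, column, and index names, the accuracy target, and the row limit are all assumptions made for illustration; the project's real schema is described in its Implementation doc, not here.

```python
# Hypothetical schema: every identifier below is illustrative only.
CREATE_TABLE_SQL = """
CREATE TABLE kb_chunks (
    chunk_id    NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    source_path VARCHAR2(1024) NOT NULL,
    chunk_text  CLOB,
    embedding   VECTOR(384, FLOAT32)
)
"""

# In Oracle 23ai an HNSW index is declared as an in-memory neighbor graph.
CREATE_INDEX_SQL = """
CREATE VECTOR INDEX kb_chunks_hnsw ON kb_chunks (embedding)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95
"""

# Cosine distance is 1 - cosine similarity, so the score is recovered below.
# FETCH APPROX asks the optimizer to use the vector index (ANN search).
SEARCH_SQL = """
SELECT chunk_id,
       chunk_text,
       1 - VECTOR_DISTANCE(embedding, TO_VECTOR(:qv), COSINE) AS score
FROM kb_chunks
ORDER BY VECTOR_DISTANCE(embedding, TO_VECTOR(:qv), COSINE)
FETCH APPROX FIRST 5 ROWS ONLY
"""


def to_vector_literal(values) -> str:
    """Format floats as the '[v1,v2,...]' text form accepted by TO_VECTOR."""
    return "[" + ",".join(f"{float(v):.6f}" for v in values) + "]"
```

With a python-oracledb cursor, the query would be run along the lines of `cursor.execute(SEARCH_SQL, qv=to_vector_literal(query_embedding))`, where `query_embedding` is the 384-dimensional output of the embedding model.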

Documentation

Doc | Read this for
PRD | What & why: problem framing, goals, scope, milestones, success metrics
Architecture | System diagrams, data flows, component responsibilities, failure modes
Implementation | Tech stack, code structure, schema, performance numbers, security model, reproducibility steps
Notes | Chronological decision log + gotchas + working-session hours

Quickstart for clients

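Claude Desktop (macOS): add the following to ~/Library/Application Support/Claude/claude_desktop_config.json (JSON does not allow comments, so the path is given here rather than inside the file):
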
{
  "mcpServers": {
    "personal-rag-kb": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://<your-host>/mcp",
               "--header", "Authorization:${AUTH_HEADER}"],
      "env": { "AUTH_HEADER": "Bearer <your-token>" }
    }
  }
}

Claude.ai web / iOS app: Settings → Connectors → Add Custom Connector → enter your MCP URL → complete OAuth login.

Project status

Day | Milestone
1 | MCP server scaffold + Cloudflare tunnel + bearer auth
2 | kb_health / kb_ingest / kb_search / kb_stats tools + ADB schema
3 | Bulk migrate 5,000+ sources / 45K chunks
3b | Multilingual upgrade: BGE-en → multilingual-e5-small
4-5 | Sync workflow refactor: 4 source types auto-ingest
6 | Persistent named tunnel on a custom domain
7 | Weekly backup + disaster recovery script
8 | OAuth 2.0: Claude.ai web + iOS app access
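The weekly backup relies on a write-only Pre-authenticated Request (PAR), which means an upload is a plain HTTP PUT to the PAR URL with the object name appended, with no OCI credentials involved. A minimal sketch, assuming a bucket-level PAR URL ending in `/o/`; the helper name and archive name are hypothetical.

```python
import urllib.request
from urllib.parse import quote


def build_par_upload(par_url: str, object_name: str, payload: bytes) -> urllib.request.Request:
    """Build (but do not send) a PUT request that uploads `payload` as
    `object_name` through a write-only bucket PAR URL."""
    url = par_url.rstrip("/") + "/" + quote(object_name)
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )
```

Sending is then a single call such as `urllib.request.urlopen(build_par_upload(par_url, "kb-backup.tar.gz", data))`. Since the PAR grants write access only, a compromised serving VM could add objects but could not read or list existing backups.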

Total build time: ~22 hours over 2 days.

Foundation for downstream projects

This RAG infrastructure is designed as a shared foundation for multiple downstream AI side-projects: recipe extractor, research agent, support bot, meeting summarizer, finance advisor, health coach. Each can call kb_ingest / kb_search rather than building its own vector pipeline.
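To make "call kb_ingest / kb_search" concrete: an MCP tool invocation is a JSON-RPC 2.0 request with method `tools/call`. The sketch below only builds that message. The argument names (`query`, `top_k`) are assumptions, since the tools' real input schemas are not listed here, and a real client would first complete the MCP `initialize` handshake and authenticate, most easily through the MCP SDK's client classes.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC request ids must be unique per session


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical arguments; check the tool's input schema via `tools/list`.
search_request = tool_call("kb_search", {"query": "tunnel setup notes", "top_k": 5})
body = json.dumps(search_request)
```

A downstream project would POST `body` to the server's `/mcp` endpoint with its bearer or OAuth token, then read the search results from the `result` field of the JSON-RPC response.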