
personal-rag-kb — Architecture

System diagrams, data flows, component responsibilities, failure modes.

Architecture

Sister docs: PRD (intent), Implementation (deep-dive), Notes (decision log).

## System view

flowchart TB
    classDef client fill:#1c1c2c,stroke:#67e8f9,color:#f4f4f8
    classDef edge fill:#1c1c2c,stroke:#a78bfa,color:#f4f4f8
    classDef server fill:#14141f,stroke:#67e8f9,color:#f4f4f8
    classDef store fill:#1c1c2c,stroke:#34d399,color:#f4f4f8

    subgraph Clients["👤 User devices (Mac + iPhone)"]
        Desktop["Claude Desktop<br/>(legacy bearer)"]
        Web["Claude.ai web<br/>(OAuth 2.0)"]
        iOS["Claude iOS app<br/>(inherits web)"]
        Bridge["mcp-remote bridge<br/>(npx)"]
        Desktop --> Bridge
    end

    Bridge --> CF
    Web --> CF
    iOS --> CF

    subgraph CF["☁️ Cloudflare (custom domain)"]
        Tunnel["Named Tunnel<br/>DNS CNAME · TLS · Bot mgmt"]
    end

    subgraph VM["🖥️ Compute VM · 4 vCPU · systemd"]
        CFD["cloudflared agent"]
        Server["uvicorn + Starlette<br/>FastMCP server v0.4"]
        Tools["MCP Tools<br/>kb_health · kb_ingest<br/>kb_search · kb_stats"]
        Embed["e5-small (CPU)<br/>multilingual"]
        OAuth["OAuth 2.0<br/>/authorize /token /register"]
        CFD --> Server
        Server --> OAuth
        Server --> Tools
        Tools --> Embed
    end

    Tunnel --> CFD

    subgraph Storage["🗄️ Oracle ADB 23ai (free)"]
        Sources["kb_sources<br/>5,000+ rows"]
        Chunks["kb_chunks<br/>47K+ rows · VECTOR(384)"]
        HNSW["HNSW INMEMORY<br/>NEIGHBOR GRAPH"]
    end

    Tools -.oracledb TLS.-> Storage

    subgraph Backup["📦 OCI Object Storage"]
        Bucket["kb-backups bucket<br/>tar.gz weekly"]
    end

    Storage -.PAR write-only.-> Bucket

    class Desktop,Web,iOS,Bridge client
    class Tunnel edge
    class CFD,Server,Tools,Embed,OAuth server
    class Sources,Chunks,HNSW,Bucket store

                        │ weekly Sunday 03:00 UTC

    ┌──────────────────────────────────────────────────┐
    │  OCI Object Storage `kb-backups` bucket          │
    │  Pre-authenticated Request (write-only, 1 year)  │
    │  kb-backup-YYYYMMDDTHHMMSSZ.tar.gz (~86 MB each) │
    └──────────────────────────────────────────────────┘

## Data flow — Ingest

[1] Trigger event (one of):

- AI session ends → user-shell hook archives transcript
- Chat exporter / wiki crawler runs → writes file → calls helper
- Mail digest task runs → writes file → calls helper
- Meeting transcriber runs → writes file → calls helper
- Ticket crawler renders MD → calls helper
- Nightly scheduled task → runs all sync jobs


[2] Markdown file written to //.md (filesystem = canonical source-of-truth)


[3] Hook calls: python3 /kb-ingest-file.py


[4] Helper:

- reads file → text
- classify source_type from path (16 path-based rules)
- auto-tag from folder hierarchy
- HTTPS POST /mcp tools/call kb_ingest
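The classification and auto-tagging in step [4] could look roughly like the sketch below. The rule table, the folder names, the fallback type, and both function names are hypothetical; the real helper has 16 rules that are not listed in this document.

```python
from pathlib import PurePosixPath

# Hypothetical rules: (folder that must appear in the path, source_type).
# The real helper has 16 rules; these three are illustrative only.
RULES = [
    ("meetings", "meeting"),
    ("mail", "mail_digest"),
    ("tickets", "ticket"),
]
DEFAULT_TYPE = "note"  # assumed fallback when no rule matches


def classify(rel_path: str) -> str:
    """Return the source_type of the first rule whose folder is in the path."""
    folders = PurePosixPath(rel_path).parts[:-1]  # drop the file name
    for folder, source_type in RULES:
        if folder in folders:
            return source_type
    return DEFAULT_TYPE


def auto_tags(rel_path: str) -> list[str]:
    """Derive tags from the folder hierarchy, outermost folder first."""
    return [part.lower() for part in PurePosixPath(rel_path).parts[:-1]]
```

Because rules are checked in order, more specific folders should be listed before general ones if they can both appear in one path.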


[5] MCP server kb_ingest tool:

- SHA256(text) → content_hash
- SELECT WHERE source_uri = :u
  - row exists + same hash → return {skipped: true} ← idempotency
  - row exists + diff hash → DELETE old chunks, UPDATE source
  - new → INSERT source RETURNING id
- chunk_text(text) → list[str] (256 tok / 32 overlap)
- model.encode(["passage: " + c for c in chunks]) → list[float[384]]
- INSERT INTO kb_chunks … (executemany)
- COMMIT
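The idempotency decision and the overlapping chunk window in step [5] can be sketched as pure functions. This is a minimal sketch, not the server's code: `ingest_action` and `chunk_tokens` are hypothetical names, and the chunker assumes the text has already been split into tokens (the real `chunk_text` takes raw text and its tokenizer is not described here).

```python
import hashlib
from typing import Optional


def content_hash(text: str) -> str:
    """SHA-256 of the UTF-8 encoded text, as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def ingest_action(existing_hash: Optional[str], new_hash: str) -> str:
    """Decide what to do for a source_uri; existing_hash is None if no row exists."""
    if existing_hash is None:
        return "insert"    # new source
    if existing_hash == new_hash:
        return "skip"      # unchanged content, nothing to re-embed
    return "replace"       # delete old chunks, update source


def chunk_tokens(tokens: list[str], size: int = 256, overlap: int = 32) -> list[list[str]]:
    """Sliding window: each chunk starts (size - overlap) tokens after the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reached the end
    return chunks
```

With the defaults, consecutive chunks share 32 tokens, so a sentence that straddles a boundary still appears whole in at least one chunk in most cases.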


[6] Return {source_id, chunks_inserted, skipped, hash}; the result is logged to /kb-ingest-file.log


## Data flow — Query

[1] User → Claude (any client): “Anything new about this week?”


[2] Claude reasons → calls MCP tool kb_search:

    POST https:///mcp
    Authorization: Bearer <access_token>
    Accept: text/event-stream

    {
      "jsonrpc": "2.0",
      "id": N,
      "method": "tools/call",
      "params": {
        "name": "kb_search",
        "arguments": {"query": "Anything new about this week?", "top_k": 5}
      }
    }
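For testing the endpoint without a Claude client, the request in step [2] can be assembled by hand. The sketch below only builds the payload and headers; the function names are hypothetical, and the `Content-Type` header is an assumption that the request above does not show.

```python
import json


def build_tools_call(request_id: int, tool: str, arguments: dict) -> dict:
    """JSON-RPC 2.0 envelope for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def build_headers(access_token: str) -> dict:
    """Headers as shown in step [2], plus an assumed Content-Type."""
    return {
        "Authorization": f"Bearer {access_token}",
        "Accept": "text/event-stream",
        "Content-Type": "application/json",  # assumption, not in the original
    }


payload = build_tools_call(1, "kb_search", {"query": "test", "top_k": 5})
body = json.dumps(payload)  # send as the POST body with any HTTP client
```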


[3] Cloudflare tunnel → VM uvicorn → middleware auth check:

- validates Bearer token via FileOAuthProvider.load_access_token()
- allows request to proceed


[4] FastMCP routes to kb_search handler:

- qvec = model.encode(["query: "])[0]
- runs the query:

      SELECT c.text, s.source_uri, s.source_type, s.title, s.tags, c.chunk_idx,
             VECTOR_DISTANCE(c.embedding, :q, COSINE) AS dist
      FROM kb_chunks c
      JOIN kb_sources s ON s.id = c.source_id
      WHERE 1=1 [+ optional tag/source_type filter]
      ORDER BY dist
      FETCH FIRST 5 ROWS ONLY

- HNSW vector index used (TARGET ACCURACY 95)
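One way to assemble the statement in step [4] with its optional filter is shown below. It is a sketch under assumptions: `build_search` is a hypothetical name, only the `source_type` filter is covered (how `tags` is stored is not described here), and passing the query vector as an `array.array("f", ...)` bind assumes a python-oracledb version with VECTOR support.

```python
import array


def build_search(qvec, top_k=5, source_type=None):
    """Return (sql, binds) for the kb_search similarity query."""
    if not isinstance(top_k, int) or isinstance(top_k, bool) or top_k < 1:
        raise ValueError("top_k must be a positive integer")
    where = ["1=1"]
    # float32 array for the VECTOR(384) bind (assumed driver behaviour)
    binds = {"q": array.array("f", qvec)}
    if source_type is not None:
        where.append("s.source_type = :st")
        binds["st"] = source_type
    sql = (
        "SELECT c.text, s.source_uri, s.source_type, s.title, s.tags, c.chunk_idx, "
        "VECTOR_DISTANCE(c.embedding, :q, COSINE) AS dist "
        "FROM kb_chunks c JOIN kb_sources s ON s.id = c.source_id "
        f"WHERE {' AND '.join(where)} "
        # top_k is validated as an int above, so formatting it in is safe
        f"ORDER BY dist FETCH FIRST {top_k} ROWS ONLY"
    )
    return sql, binds
```

The `WHERE 1=1` base clause keeps the filter logic uniform: every optional condition is appended with `AND`, whether or not another filter precedes it.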


[5] Format results:

    [
      {score: 0.857, text: "…", source_uri: "file:///…meeting.md",
       source_type: "meeting", title: "…", tags: […], chunk_idx: 0},
      …
    ]


[6] Streamable HTTP returns SSE event:

    event: message
    data: {"jsonrpc": "2.0", "id": N, "result": {…}}
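A client that talks to the endpoint directly has to unwrap the SSE framing in step [6] before it can read the JSON-RPC result. The following is a minimal sketch that assumes the whole response body is already available as a string; `parse_sse_messages` is a hypothetical helper, not part of any SDK.

```python
import json


def parse_sse_messages(body: str) -> list[dict]:
    """Decode the JSON payload of every 'message' event in an SSE body."""
    messages = []
    event, data_lines = "message", []
    # The extra "" flushes a final event that lacks a trailing blank line.
    for line in body.splitlines() + [""]:
        if line == "":
            if data_lines and event == "message":
                messages.append(json.loads("\n".join(data_lines)))
            event, data_lines = "message", []  # reset for the next event
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, one leading space after the colon is dropped.
            data_lines.append(value[1:] if value.startswith(" ") else value)
    return messages
```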


[7] Claude reads results, synthesizes natural-language answer with citations.


## Data flow — Backup

    [Sunday 03:00 UTC]
    │ systemd timer kb-backup.timer fires
    ▼
    [backup.py on VM]
    │ dump kb_sources → JSONL.gz
    │ dump kb_chunks (text + hex-encoded embedding) → JSONL.gz
    │ write schema.json + manifest.json
    │ tar.gz all
    │
    │ curl PUT $PAR_URL + filename
    ▼
    [OCI Object Storage]
      bucket: kb-backups
      object: kb-backup-YYYYMMDDTHHMMSSZ.tar.gz
    │
    │ accessible via OCI CLI on operator host
    ▼
    [Local cleanup on VM]
      keep last 2 archives in /opt/…/backups/
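Three small pieces of the backup flow lend themselves to a sketch: the archive name, the hex encoding of embeddings, and the local retention rule. The function names are hypothetical, and the byte layout (little-endian float32) is an assumption; the document only says embeddings are hex-encoded.

```python
import struct
from datetime import datetime, timezone


def archive_name(now: datetime) -> str:
    """kb-backup-YYYYMMDDTHHMMSSZ.tar.gz for a UTC timestamp."""
    return f"kb-backup-{now.astimezone(timezone.utc):%Y%m%dT%H%M%SZ}.tar.gz"


def embedding_to_hex(vec: list[float]) -> str:
    """Pack floats as little-endian float32 and hex-encode (assumed layout)."""
    return struct.pack(f"<{len(vec)}f", *vec).hex()


def hex_to_embedding(hex_str: str) -> list[float]:
    """Inverse of embedding_to_hex, used when rehydrating vectors on restore."""
    raw = bytes.fromhex(hex_str)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


def archives_to_delete(names: list[str], keep: int = 2) -> list[str]:
    """Return the archives to remove so only the newest `keep` remain.

    The timestamp format sorts lexicographically, so a plain sort is
    chronological.
    """
    ordered = sorted(names)
    return ordered[:-keep] if keep > 0 else ordered
```

Hex-encoded float32 costs 8 characters per dimension, so a 384-dimensional embedding is 3,072 characters per chunk before gzip.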


## Data flow — Disaster recovery

    [ADB instance gone or corrupted]
    │
    ▼
    [Operator on local host]
      oci os object list --bucket-name kb-backups
      oci os object get --bucket-name kb-backups --name kb-backup-XXX.tar.gz --file /tmp/restore.tar.gz
      scp /tmp/restore.tar.gz :/tmp/
    │
    ▼
    [On VM]
      /opt/…/venv/bin/python src/restore.py /tmp/restore.tar.gz
    │ extract tar.gz → tmpdir
    │ verify manifest.json matches expected counts
    │ prompt "yes" to confirm DROP
    │ DROP TABLE kb_chunks, kb_sources CASCADE PURGE
    │ recreate tables (DDL hardcoded)
    │ stream JSONL → executemany() with rehydrated VECTOR
    │ rebuild HNSW index
    ▼
    [Verify]
      kb_health → confirm sources/chunks count = manifest
      kb_search "test" → confirm vector index works
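The manifest check that runs before the destructive DROP could be sketched as follows. Everything specific here is an assumption for illustration: the `row_counts` key, the `<table>.jsonl.gz` file names, and the function names are hypothetical, since the layout of `manifest.json` is not documented in this section.

```python
import gzip
import json
from pathlib import Path


def count_jsonl_gz(path: Path) -> int:
    """Count non-empty lines (one JSON record each) in a gzipped JSONL file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return sum(1 for line in fh if line.strip())


def verify_manifest(extract_dir: Path) -> None:
    """Raise ValueError if a dump's row count differs from the manifest.

    Assumes manifest.json looks like {"row_counts": {"kb_sources": N, ...}}
    and that dumps are named <table>.jsonl.gz (both hypothetical).
    """
    manifest = json.loads((extract_dir / "manifest.json").read_text(encoding="utf-8"))
    for table in ("kb_sources", "kb_chunks"):
        expected = manifest["row_counts"][table]
        actual = count_jsonl_gz(extract_dir / f"{table}.jsonl.gz")
        if actual != expected:
            raise ValueError(f"{table}: manifest says {expected}, archive has {actual}")
```

Running this check before the confirmation prompt means a truncated or partially uploaded archive is rejected while the existing tables are still intact.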


## Component responsibilities

| Component | Owns | Doesn't own |
|---|---|---|
| **Filesystem** (`<kb-root>/`) | Source of truth, plain markdown | Search, embedding, indexing |
| **Sync agents** (skills/hooks/crawler) | Pulling from external sources, writing MD files, calling kb-ingest | Embedding, storage |
| **`kb-ingest-file.py`** | Path → kb_ingest API call, source classification, retry tolerance | Embedding, vector storage |
| **MCP server** | Auth, chunking, embedding, ADB coordination | Source acquisition (sync), client UI |
| **e5-small model** | Vectorize text (passage/query) | Storage, retrieval |
| **ADB 23ai** | Persist sources + chunks + vectors, ANN search via HNSW | Embedding, business logic |
| **Cloudflare tunnel** | Public HTTPS endpoint, TLS | Auth (delegated to MCP server) |
| **Claude clients** | UI, LLM reasoning, tool calling | Embeddings, retrieval |
| **OCI Object Storage** | Cold backup persistence | Live serving |

## Failure modes & recovery

| Failure | Detect | Recovery | Time |
|---|---|---|---|
| MCP server crash | systemd `Restart=on-failure` | Auto-restart in 5s | <10s |
| cloudflared crash | systemd | Auto-restart, tunnel re-establishes | <30s |
| ADB auto-stop (7-day idle) | `kb_health` returns ADB error | Manual start via console, or pre-empted by normal usage | ~1 min |
| ADB data corruption | Filesystem hash mismatch on reconcile | Restore from latest weekly backup | ~5 min restore + rebuild |
| ADB instance terminated | Loss of connection | Provision new instance, restore from backup, update wallet | ~30 min |
| VM terminated | Health check fails | Provision new VM, redeploy services, restore wallet/tokens | hours |
| Cloudflare tunnel UUID lost | DNS still resolves but no backend | Recreate tunnel, update DNS, update systemd service | ~5 min |
| OAuth state file corrupted | All logins fail | Delete state file → all clients re-register on next use | <1 min user impact |
| Password forgotten | `/login` always 401 | SSH into the VM, regenerate `.oauth_env`, restart server | <2 min (if SSH key intact) |

## Why these choices

| Decision | Alternative considered | Why this won |
|---|---|---|
| MCP over Telegram bot | Build custom Telegram bot | Native Claude integration, no Bot UI needed, multi-client (web/desktop/iOS) for free |
| ADB 23ai over Qdrant | Qdrant on VM | Free 20 GB tier, vector type built into SQL, no separate service to manage |
| e5-small over BGE-m3 | BGE-m3 multilingual (1024-dim) | CPU benchmark showed BGE-m3 = 0.39 chunks/s — too slow for batch re-embed. e5-small at 5.7 chunks/s with comparable retrieval quality |
| Commodity VM over managed runtime | Managed container service | Lower fixed cost, full systemd control, easy provider migration |
| Cloudflare named tunnel over public IP + nginx | Direct IP exposure | No firewall management, free TLS, easy DNS, persistent URL |
| Filesystem-first over ADB-first | Direct ingest into ADB | ADB free tier can be evicted; filesystem is durable; ADB is re-buildable |
| Write-only PAR over OCI key on VM | Copy OCI API key to VM | VM compromise can't delete/modify backups; principle of least privilege |
| Single password OAuth over OIDC delegation | Auth via Google/GitHub OIDC | Single-user system; simpler to implement; offline-capable; SSO can layer on later |
| FastMCP SDK auth over hand-rolled OAuth | Hand-roll RFC 8414 + 7591 | SDK provides metadata + DCR + token endpoints out-of-box; just need provider impl |

## See also

- Sequence diagrams in [Implementation](./implementation) (OAuth flow, ingest, query)
- Performance numbers in [Implementation](./implementation)