● M1 done P0 Size S Foundation

personal-rag-kb — Notes

Chronological decision log, gotchas, and working-session hours.

Notes & Decision Log

Format: YYYY-MM-DD — context — decision/finding.

Decisions

  • 2026-04-26 — Foundation: chose ADB 23ai (vector built-in) over Qdrant + local LLM serving. Selected commodity Linux VM (4-core CPU, 16 GB RAM) for compute.
  • 2026-04-27 (Day 1) — Pivot Telegram bot → MCP Streamable HTTP server. Reason: Claude clients (Desktop/web/iOS) call tools natively, no Bot UI needed. Same/better UX, less code.
  • 2026-04-27 (Day 1) — Bearer token auth first; OAuth deferred to Day 8.
  • 2026-04-27 (Day 1) — Cloudflare ephemeral quick tunnel for prototyping; persistent named tunnel deferred to Day 6.
  • 2026-04-27 (Day 2) — Schema final: kb_sources (1 row/file) + kb_chunks (FK, VECTOR(384) FLOAT32). Used source_uri UNIQUE + content_hash for idempotency rather than a synthetic doc_id hash.
  • 2026-04-27 (Day 2) — HNSW index: ORGANIZATION INMEMORY NEIGHBOR GRAPH, DISTANCE COSINE, WITH TARGET ACCURACY 95. Build deferred until after the bulk migrate so the index is built over real data.
  • 2026-04-27 (Day 3) — Bulk migrate strategy: rsync .md files only (~46 MB) up to VM, then run script with local DB connection — much faster than 5K HTTPS calls.
  • 2026-04-27 (Day 3) — Chunk size 600 → 256 tokens + 32 overlap. Reason: BGE-small has max_seq=512, and self-attention cost grows roughly quadratically with sequence length, so 256-token chunks embed about 3× faster on CPU.
  • 2026-04-27 (Day 3) — Cap 50 chunks/file. Huge transcripts only embed the head (≈12.8 K tokens, ~25 dense pages).
  • 2026-04-27 (Day 3) — Process priority: source-of-truth docs first, low-retrieval-value transcripts last.
  • 2026-04-27 (Day 3b) — Switch BGE-small-en-v1.5 → multilingual-e5-small. Benchmarked BGE-m3 (1024-dim, 568M params) at 0.39 chunks/s on CPU → 32 h re-embed (rejected). e5-small (117M params) ran at 5.7 chunks/s → 2.2 h. Verbatim VN query score jumped from 0.69–0.77 to 0.85+.
  • 2026-04-27 (Day 3b) — e5 prefix convention: "query: " for queries, "passage: " for documents. Affects retrieval significantly.
  • 2026-04-27 (Day 4-5) — Filesystem-first ingest rule: skill writes .md → calls kb-ingest-file.py. Filesystem = canonical, ADB = derived index. Sync 1-way only.
  • 2026-04-27 (Day 4-5) — Generic helper kb-ingest-file.py. Auto-detects source_type + tags from 16 path rules. Idempotent via server hash check.
  • 2026-04-27 (Day 6) — Persistent tunnel via Cloudflare named tunnel on a custom domain. DNS CNAME → tunnel UUID. Free tier sufficient.
  • 2026-04-29 (Day 7) — Backup architecture: write-only Pre-authenticated Request (PAR) instead of putting an OCI API key on the VM. Reason: a VM compromise must not be able to delete bucket contents. Restore ops run from a separate operator host with full OCI CLI.
  • 2026-04-29 (Day 8) — OAuth 2.0 via the MCP SDK’s OAuthAuthorizationServerProvider Protocol. Single-user, file-backed JSON store, PBKDF2 200K rounds. Legacy bearer token still works through the provider’s load_access_token() fallback (Claude Desktop unbroken).
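
Sketch of the Day 3 chunking rule (256 tokens, 32 overlap, cap of 50 chunks per file) and the Day 3b e5 prefixes. Assumptions: tokens are passed in as a plain list (a stand-in for the real model tokenizer), and chunk_tokens, as_query, and as_passage are hypothetical names, not the project's actual helpers.

```python
from typing import List, Sequence

CHUNK_TOKENS = 256   # chunk size from the Day 3 note
OVERLAP_TOKENS = 32  # overlap from the Day 3 note
MAX_CHUNKS = 50      # per-file cap from the Day 3 note


def chunk_tokens(tokens: Sequence[str],
                 size: int = CHUNK_TOKENS,
                 overlap: int = OVERLAP_TOKENS,
                 max_chunks: int = MAX_CHUNKS) -> List[List[str]]:
    """Split tokens into overlapping windows, keeping only the head of long files."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap  # each window starts 224 tokens after the previous one
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        if len(chunks) >= max_chunks:
            break  # cap reached: the tail of a huge transcript is not embedded
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # this window already covers the end of the file
    return chunks


def as_query(text: str) -> str:
    """Prefix a search query the way e5 models expect."""
    return "query: " + text


def as_passage(text: str) -> str:
    """Prefix a document chunk the way e5 models expect."""
    return "passage: " + text
```

With these defaults the cap keeps at most 50 windows per file, which is where the ≈12.8 K token head figure comes from (50 × 256, counting overlapped tokens twice).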
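
Sketch of the source_uri + content_hash idempotency idea from the Day 2 and Day 4-5 notes. Assumptions: SHA-256 as the hash (the notes do not say which one is used), a dict standing in for the kb_sources table, and hypothetical function names.

```python
import hashlib
from typing import Dict


def content_hash(data: bytes) -> str:
    """Hex digest of the file bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def ingest(index: Dict[str, str], source_uri: str, data: bytes) -> str:
    """Record a file in the index only if it is new or its content changed.

    `index` maps source_uri -> content_hash and stands in for kb_sources.
    Returns "inserted", "updated", or "skipped".
    """
    digest = content_hash(data)
    previous = index.get(source_uri)
    if previous == digest:
        return "skipped"  # same URI, same bytes: nothing to re-embed
    index[source_uri] = digest
    return "inserted" if previous is None else "updated"
```

Re-running the same file is then a no-op, so a sync hook can call the ingest helper on every write without duplicating rows.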

Gotchas

  • Day 1 — MCP SDK DNS rebinding protection defaults to True, which blocked requests arriving through the Cloudflare proxy. Disable: TransportSecuritySettings(enable_dns_rebinding_protection=False). Bearer token is sufficient as the security boundary.
  • Day 1 — FastMCP streamable_http_app() mount path is /mcp, not /mcp/. Mounting at /mcp/ causes a 307 redirect that drops the Authorization header.
  • Day 1 — Claude.ai web custom connectors REQUIRE OAuth — they don’t accept a raw bearer token. Worked around with Claude Desktop + mcp-remote bridge until Day 8 added proper OAuth.
  • Day 1 — mcp-remote bridge: the --header "Name:value" syntax has issues with spaces. Workaround: ${AUTH_HEADER} env var injection in Claude Desktop config.
  • Day 1 — Print buffering: print() output from the bulk migrate script was invisible until the process exited. Fix: sys.stdout.reconfigure(line_buffering=True) + PYTHONUNBUFFERED=1.
  • Day 3 — torch defaults to 2 threads on a 4-core VM. Fix: torch.set_num_threads(4) + OMP_NUM_THREADS=4. CPU usage went 189% → 380%.
  • Day 4-5 — Cloudflare bot management blocked the default Python urllib User-Agent with 403. Fix: set a custom User-Agent header. (curl works out of the box.)
  • Day 4-5 — Hook timeout of 30s wasn’t enough for large transcripts (300+ chunks). Bumped to 120s.
  • Day 4-5 — Files with a newline character in the filename made the reconcile script falsely flag an orphan. Hash check confirmed validity. Reconcile script needed escape-aware comparison.
  • Day 7 — macOS Privacy sandbox blocks launchd from executing scripts in ~/Documents/. Fix: copy scripts to a non-sandboxed location.
  • Day 8 — OAuthAuthorizationServerProvider is a Protocol (Generic), not an ABC. Implement via duck-typing, no need to subclass. Data types such as OAuthClientInformationFull and AuthorizationCode are Pydantic models.
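
Sketch of the Day 4-5 User-Agent fix: urllib sends "Python-urllib/<version>" by default, so the request carries its own header instead. Assumptions: the URL, the "kb-ingest/1.0" agent string, and the build_request name are placeholders, not values from the project.

```python
import urllib.request


def build_request(url: str, token: str,
                  user_agent: str = "kb-ingest/1.0") -> urllib.request.Request:
    """Build a request with a custom User-Agent and a bearer token."""
    return urllib.request.Request(
        url,
        headers={
            # Replaces the default "Python-urllib/<version>" agent string.
            "User-Agent": user_agent,
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical endpoint; nothing is sent until urlopen() is called.
req = build_request("https://kb.example.com/mcp", "secret-token")
```

Passing `req` to urllib.request.urlopen() would send it with both headers attached.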
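
One possible shape for the escape-aware comparison in the Day 4-5 reconcile gotcha. This assumes the false orphan came from a line-oriented file listing that split a name at its embedded newline; the notes do not confirm the cause, and all function names here are hypothetical.

```python
from typing import Iterable, List


def escape_name(name: str) -> str:
    """Encode a filename so it always fits on one line of a listing."""
    return name.encode("unicode_escape").decode("ascii")


def unescape_name(escaped: str) -> str:
    """Reverse escape_name()."""
    return escaped.encode("ascii").decode("unicode_escape")


def find_orphans(indexed_names: Iterable[str],
                 listing_lines: Iterable[str]) -> List[str]:
    """Return indexed names with no matching file in an escaped listing."""
    on_disk = {unescape_name(line) for line in listing_lines}
    return sorted(set(indexed_names) - on_disk)
```

Because the newline travels as the two characters `\n`, a file named with an embedded line break stays a single listing entry and compares equal to the indexed name.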

Working-session log

Date | Hours | What | Outcome
2026-04-26 | ~3 h | Cloud foundation (VM provisioning, ADB 23ai, wallet) | Compute + DB ready.
2026-04-27 morning | ~2 h | Day 1 MCP scaffold + tunnel | kb_health live
2026-04-27 midday | ~3 h | Day 2 schema + tools + bench | 4 tools working, BGE-small loaded
2026-04-27 afternoon | ~2 h (+ ~95 min wall background) | Day 3 bulk migrate | 5K+ sources / 45K chunks
2026-04-27 late | ~3 h (+ ~131 min wall) | Day 3b multilingual upgrade | e5-small re-embedded all chunks
2026-04-27 evening | ~3 h | Day 4-5 sync refactor | 4 source types auto-ingest
2026-04-28 | ~1 h | Day 6 persistent tunnel | named tunnel live
2026-04-29 morning | ~2 h | Day 7 backup + DR | Weekly cron, restore tested
2026-04-29 afternoon | ~3 h | Day 8 OAuth | Claude mobile + web access
Total | ~22–25 hours | M1 done | Foundation for downstream projects