● Shipped P2 Size M Vertical app

health-coach — Notes & Decision Log

Chronological decision log, gotchas, working-session highlights.

Decisions

2026-04-26 — Folder convention

Standalone folder Side.Projects/health-coach/; uses Side.Projects/garmin-sync/ as data source layer (don’t merge the two).

2026-04-29 — LLM strategy reversal: local 8B → Claude Haiku 4.5

The initial plan was a local Llama 3 8B on the GCP VM. In practice, 8B on a 4-vCPU x86 VM without a GPU took ~256 s for 9 tokens (under memory pressure alongside personal-rag-kb). A 3B Llama runs at ~50 s warm, but its quality fails the M1 DoD: it hallucinates “thiếu ngủ” (“sleep-deprived”) for Δ = -0.09 h and paraphrases dates incorrectly.

Switched to the Claude Haiku 4.5 API as the default: 8.8 s end-to-end, ~$0.001 per digest, so ~$0.05/year for one weekly cron run (52 × $0.001). Haiku references exact dates (04-26, 04-21, 04-25) and respects the prompt rule that |Δ| < 1 SD means no insight. The sample digest was rated 4.2/5, which closes the M1 DoD.
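
The "|Δ| < 1 SD means no insight" rule can also be checked deterministically before the prompt is built. The sketch below is an illustration only: the function name, the labels, and the middle "notable" band are assumptions, and the 1.5 threshold is borrowed from the anomaly z≥1.5 note in the working session log (applied here to the absolute value, which is also an assumption).

```python
def classify_delta(value: float, baseline_mean: float, baseline_sd: float) -> str:
    """Label a metric by its distance from the personal baseline, in SD units."""
    if baseline_sd <= 0:
        # No usable spread in the baseline: refuse to claim anything.
        return "no_insight"
    z = (value - baseline_mean) / baseline_sd
    if abs(z) < 1.0:
        return "no_insight"  # within normal variation, say nothing
    if abs(z) >= 1.5:
        return "anomaly"
    return "notable"  # hypothetical middle band between 1.0 and 1.5 SD


# A sleep delta of -0.09 h against a hypothetical SD of 0.8 h is noise:
print(classify_delta(6.91, 7.0, 0.8))
```

Passing the label into the prompt, rather than the raw delta alone, would give the model less room to invent a finding such as “thiếu ngủ”.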

Privacy compromise documented in PRD §4: aggregate metrics only, provider must not train on API data, no raw HR/GPS/sleep stages.

2026-04-29 — ADHD delivery design

User has ADHD: a long Sunday digest goes unread, so it is useless. Redesigned around four trigger types: morning ping, contextual nudge, real-time anomaly alert, and a compact weekly Saturday digest with chart. Smart cap on messages per day: ≤2 on a normal day, ≤4 on an anomaly day, hard cap 5. Actions are ranked by goal weight across three tiers (Tier 1 personal data → Tier 2 literature → Tier 3 LLM gut). Manual log uses Option C: anomaly + Saturday batch + streak-aware (auto-pause after 3 no-replies).
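
A minimal sketch of how the smart cap and the streak-aware pause could be expressed. This is not the actual planner.py: the names are hypothetical, and the `urgent` flag is an assumed interpretation of how the hard cap of 5 differs from the soft caps of 2 and 4.

```python
SOFT_CAP_NORMAL = 2   # messages on a normal day
SOFT_CAP_ANOMALY = 4  # messages on an anomaly day
HARD_CAP = 5          # never exceeded, whatever the message type
PAUSE_AFTER_NO_REPLIES = 3


def may_send(sent_today: int, anomaly_day: bool, urgent: bool = False) -> bool:
    """Decide whether one more message fits under today's caps."""
    if sent_today >= HARD_CAP:
        return False
    if urgent:
        # Assumption: urgent messages may pass the soft cap, never the hard cap.
        return True
    soft_cap = SOFT_CAP_ANOMALY if anomaly_day else SOFT_CAP_NORMAL
    return sent_today < soft_cap


def log_prompts_paused(consecutive_no_replies: int) -> bool:
    """Streak-aware manual log: pause prompts after 3 unanswered in a row."""
    return consecutive_no_replies >= PAUSE_AFTER_NO_REPLIES
```

Keeping the caps as module-level constants makes them easy to tune once real reply rates are known.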

2026-04-29 — Hosting roadmap

  • Now: GCP VM (free credit until 2026-06-05).
  • After: OCI A1 free retry (running) → Hetzner CAX11 ARM 4GB ($4/month) as fallback.
  • Avoid: GCP paid ($107/month) or work Mac (work-laptop dependency).

2026-04-30 — Pivot iMessage → Telegram

Three iMessage quirks combined to silently drop nudges (see Gotchas). Switched to Telegram bot @marc_healthbot: at this volume there is no rate limit, spam filter, or dedupe to hit, and messages sync across devices.
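
For reference, sending a nudge through the Telegram Bot API is a single HTTPS POST to the sendMessage method. The sketch below uses only the standard library; the function names are hypothetical and this is not the project's delivery.py. Building the request is kept separate from sending it so the former can be tested without network access.

```python
import json
import urllib.request

API_BASE = "https://api.telegram.org"


def build_send_message(token: str, chat_id: int, text: str) -> urllib.request.Request:
    """Build the POST request for the Bot API sendMessage method."""
    payload = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/bot{token}/sendMessage",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_message(token: str, chat_id: int, text: str, timeout: float = 10.0) -> dict:
    """Send the message and return the decoded JSON response."""
    request = build_send_message(token, chat_id, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Because this is plain HTTP, the same code runs unchanged on the work Mac or the VM.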

2026-04-30 — Migrate work Mac → GCP VM

Telegram is an HTTP API, so delivery is host-independent. Moved 6 launchd plists → 7 systemd user timers on the VM (with loginctl enable-linger ubuntu so the timers run without a login session). Work Mac plists were renamed .disabled (kept for rollback). Schedules were converted to UTC (UTC = VN time - 7 h). Effort: ~1 hour.
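
The VN → UTC conversion is easy to get wrong for early-morning jobs, because subtracting 7 hours can move a trigger to the previous calendar day (and therefore the previous weekday). A small sketch for checking a schedule; the helper name is hypothetical, and a fixed UTC+7 offset is used because Vietnam observes no daylight saving time.

```python
from datetime import datetime, timedelta, timezone

VN = timezone(timedelta(hours=7))  # Vietnam: fixed UTC+7, no DST


def vn_to_utc(vn_wall_time: datetime) -> datetime:
    """Interpret a naive datetime as VN wall-clock time and convert it to UTC."""
    return vn_wall_time.replace(tzinfo=VN).astimezone(timezone.utc)


# Saturday 21:00 VN stays on Saturday in UTC...
print(vn_to_utc(datetime(2026, 5, 2, 21, 0)))
# ...but Sunday 06:30 VN is still Saturday in UTC.
print(vn_to_utc(datetime(2026, 5, 3, 6, 30)))
```

Any timer that fires before 07:00 VN needs its weekday shifted back by one in the UTC schedule.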

2026-04-30 — Week boundary

Week = Sunday → Saturday (user preference). weekly-digest, weekly-chart, and log-prompt fire Saturday 21:00–21:30 VN. Watch the weekday numbering: Saturday is systemd Weekday=6 but Python date.weekday() == 5 (Monday = 0), so the two conventions differ.
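
A short sketch of the Sunday → Saturday week boundary in Python, where date.weekday() counts from Monday = 0. The function name is hypothetical; the arithmetic is the part that matters.

```python
from datetime import date, timedelta


def week_bounds(d: date) -> tuple[date, date]:
    """Return the (Sunday, Saturday) pair of the week that contains d."""
    # weekday(): Mon=0 ... Sat=5, Sun=6, so shift by one to count from Sunday.
    days_since_sunday = (d.weekday() + 1) % 7
    start = d - timedelta(days=days_since_sunday)
    return start, start + timedelta(days=6)


start, end = week_bounds(date(2026, 4, 30))  # a Thursday
print(start, end)                            # 2026-04-26 2026-05-02
print(end.weekday())                         # 5, i.e. Saturday in Python
```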

2026-04-30 — Digest format: VN weekday

The YYYY-MM-DD date format is hard for ADHD users to recall. Pre-compute Vietnamese weekday names such as Thứ 5 (Thursday) and Chủ nhật (Sunday) in render_summary_for_prompt, and enforce their use in SYSTEM_PROMPT.
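
One way to pre-compute the labels, as a sketch only: the helper name and the "weekday MM-DD" output format are assumptions, not the actual render_summary_for_prompt code. A lookup table avoids depending on a Vietnamese locale being installed on the host.

```python
from datetime import date

# Indexed by date.weekday(): Monday = 0 ... Sunday = 6.
VN_WEEKDAYS = ("Thứ 2", "Thứ 3", "Thứ 4", "Thứ 5", "Thứ 6", "Thứ 7", "Chủ nhật")


def vn_label(d: date) -> str:
    """Render a date as Vietnamese weekday plus MM-DD."""
    return f"{VN_WEEKDAYS[d.weekday()]} {d:%m-%d}"


print(vn_label(date(2026, 4, 30)))  # Thứ 5 04-30
print(vn_label(date(2026, 4, 26)))  # Chủ nhật 04-26
```

Computing the label in code and handing it to the model verbatim means the LLM only has to copy the weekday, not derive it.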

Gotchas

  • launchd jobs in ~/Documents/ are blocked by macOS Sequoia/Tahoe TCC with “Operation not permitted”. Move the projects to ~/health-coach/ and ~/garmin-sync/, outside ~/Documents/.
  • The macOS security keychain is unreliable from a launchd context (the UI prompt is blocked). Switched to a chmod 600 file at ~/.config/health-coach/anthropic_api_key.
  • iMessage AppleScript silently fails if Messages.app isn’t running. The wrapper runs open -ga Messages before exec; even then, a delay 2 after activate may be needed to flush the queue.
  • iMessage dedupe drops messages with identical bodies sent close in time. Fix: append nonce {unix-timestamp}.
  • iMessage per-recipient rate limit: more than 10 messages in 1–2 hours to the same handle are silently dropped, while all other handles still work. Production load (2–5/day) doesn’t trigger it.
  • iMessage spam classifier flags template-looking content: multi-line text with bullets, 78.0, 34.0, 55 decimal-comma patterns, 30min/10min compound number+unit, the ± special character, short paren codes (nXXXX). Fix: conversational single-line Vietnamese, integers, “10 phút” (“10 minutes”) not “10min”.
  • DejaVu Sans (matplotlib default) is missing emoji glyphs — keep emoji out of plot titles.
  • OAuth urn:ietf:wg:oauth:2.0:oob flow was deprecated by Google in 2022. Use flow.run_local_server(port=0, open_browser=True) on a machine with a browser, then SCP google_token.json to headless VM.
  • xcode-select --install sometimes reports “already installed” while xcode-select -p says no path. macOS 26 ships CLT bundled; just use git --version to confirm.
  • Apple iMessage delivery latency can be 0–5 minutes when the same Mac sends bursts (anti-spam buffering). Production load looks instantaneous.
  • OAuth token reuse across libraries: the same Google OAuth client (MCP AI desktop type) works for gcalcli, google-api-python-client, etc., as long as the scope grants Calendar read.
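
For the keychain gotcha above, a minimal sketch of reading the API key from the chmod 600 file. The function name and the permission check are assumptions added for illustration; only the file path comes from these notes.

```python
import stat
from pathlib import Path

DEFAULT_KEY_PATH = Path.home() / ".config" / "health-coach" / "anthropic_api_key"


def load_api_key(path: Path = DEFAULT_KEY_PATH) -> str:
    """Read the API key, refusing files readable by group or others."""
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode & 0o077:
        # Any group/other permission bit set: the file is not private.
        raise PermissionError(f"{path} has mode {mode:o}; expected 600")
    key = path.read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"{path} is empty")
    return key
```

Failing loudly on loose permissions catches the case where rsync or scp to a new host resets the file mode.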

Working session log

  • 2026-04-29 (Day 1) — M1 walking skeleton: loader (12 metrics), stats (rolling baseline + anomaly z≥1.5), digest (deterministic + Ollama streaming + Haiku). Sample digest rated 4.2/5. PRD §4 update to allow cloud LLM with constraints. ROADMAP decision log entries for LLM strategy and hosting.
  • 2026-04-29 (Day 1, evening) — M2: delivery.py (Telegram + iMessage + macOS notif), weighting.py (11 literature priors), planner.py (smart cap, traffic light), manual_log.py (Option C), wired CLI subcommands. Deployed to work Mac with 6 launchd plists + caffeinate. Verified end-to-end via SSH + launchd kickstart.
  • 2026-04-30 (Day 2, morning) — M3: correlation.py (Tier 1 personal lag-1), calendar_reader.py (Google iCal first, then OAuth), chart.build_trend_chart() (4-week multi-metric), weighting.merge_personal_priors(), digest extended to cite Tier 1 priors. Calendar discovery: LL Workspace blocks public iCal, switched to Google API OAuth.
  • 2026-04-30 (Day 2, mid-day) — Apple iMessage debugging marathon: discovered the 3 quirks (dedupe, rate-limit, spam classifier). Pivot to Telegram bot @marc_healthbot. Created bot via @BotFather, fetched chat_id from /getUpdates, added telegram_photo() for chart upload via multipart form.
  • 2026-04-30 (Day 2, afternoon) — Migrate work Mac → GCP VM. rsync code + secrets. Setup git deploy key on VM. Convert 6 launchd plists → 7 systemd user timers. Smoke test passes. Work Mac plists renamed .disabled.
  • 2026-04-30 (Day 2, evening) — Retro analysis on 63 days surfaced patterns: Wednesday is the user’s worst sleep day (avg 68) vs Saturday, the best (avg 80, Δ 12 points). HRV percentiles give personal thresholds (<36 = poor recovery, >41 = good). RHR is stable over 7 weeks. Workout vs rest day shows no significant sleep impact (good: no need to defer training). These priors will replace Tier 2 literature defaults once n≥10 per action accumulates.
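
The "replace Tier 2 defaults once n≥10" rule from the last entry could look like the sketch below. The Prior type and the function name are hypothetical and do not reflect the real weighting.merge_personal_priors(); only the n≥10 threshold and the tier numbering come from these notes.

```python
from dataclasses import dataclass
from typing import Optional

MIN_PERSONAL_N = 10  # observations per action before personal data wins


@dataclass(frozen=True)
class Prior:
    effect: float  # expected effect of the action on the target metric
    n: int = 0     # number of personal observations behind the estimate
    tier: int = 2  # 1 = personal data, 2 = literature


def choose_prior(literature: Prior, personal: Optional[Prior]) -> Prior:
    """Prefer the personal (Tier 1) prior once it has enough observations."""
    if personal is not None and personal.n >= MIN_PERSONAL_N:
        return personal
    return literature
```

A hard switch at n = 10 is the simplest reading of the note; a weighted blend of the two priors would be a possible refinement.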