health-coach — PRD

1. Problem & Why

The garmin-sync daily job already pulls HR, sleep, stress, and activity data into a JSON data lake (60+ days of continuous coverage, no gaps). Raw data is not actionable on its own — I don’t know whether this week’s sleep is worse than last week’s, can’t spot patterns (afternoon coffee → poor sleep score), and don’t get a coach explaining anything in plain Vietnamese.

The opportunity: layer analytics + LLM coaching on top of data already sitting idle.

2. Goal & Success Metrics

North-star goal: weekly average sleep score ≥ 80 (baseline ~75). Sub-goal: ADHD-friendly delivery — one action at the right moment, no wall-of-text.

Metric	M1 target	M2 target	M3 target
Insight relevance (1–5)	≥3.5	≥4.0	≥4.5
Weekly avg sleep score	–	≥77	≥80
Action follow rate (data proxy)	–	1/wk	3/wk
Manual log compliance	–	≥1 reply/wk	≥3 reply/wk
Nudge fatigue (user disables)	–	0	0

3. User journey (ADHD-mode)

Design principles (set by the user): one thing at a time, just-in-time, traffic-light visual, physical-action-specific. A long Sunday digest becomes noise.

Four trigger types:

Morning readiness ping — 06:30 / 11:00 daily. Single multi-line Telegram with traffic-light status, today’s metrics, weekly goal progress, 2 ranked actions, calendar context.
Contextual nudge — fired at the slot’s natural moment (cap caffeine 13:30, phone-out-bedroom 22:25, etc.) only if today’s plan includes that action.
Anomaly real-time — sleep <5h or RHR z≥1.5 → ping the next morning at 08:00 with skip-workout-and-nap suggestion.
Weekly Saturday 21:00 — compact digest (≤2 insights, 1 action) + 7-dot weekly chart PNG.
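
As an illustration of the anomaly trigger above, here is a minimal sketch of the "sleep <5h or RHR z≥1.5" rule. The function names, the use of the sample standard deviation, and the handling of a flat baseline are assumptions for this sketch, not the shipped implementation.

```python
from statistics import mean, stdev

SLEEP_HOURS_FLOOR = 5.0  # from the trigger list: sleep < 5h
RHR_Z_THRESHOLD = 1.5    # from the trigger list: RHR z >= 1.5


def rhr_z_score(today_rhr: float, baseline_rhr: list[float]) -> float:
    """z-score of today's resting HR against a baseline window (needs >= 2 values)."""
    sd = stdev(baseline_rhr)
    if sd == 0:
        return 0.0  # assumption: a perfectly flat baseline counts as "no deviation"
    return (today_rhr - mean(baseline_rhr)) / sd


def is_anomaly(sleep_hours: float, today_rhr: float, baseline_rhr: list[float]) -> bool:
    """True when either anomaly condition holds."""
    short_sleep = sleep_hours < SLEEP_HOURS_FLOOR
    elevated_rhr = rhr_z_score(today_rhr, baseline_rhr) >= RHR_Z_THRESHOLD
    return short_sleep or elevated_rhr
```

A job that runs before 08:00 could call is_anomaly on last night's values and queue the skip-workout-and-nap message only when it returns True.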

Manual log uses Option C — anomaly day prompts + Saturday batch + streak-aware (auto-pause after 3 no-replies).
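
One way to read the streak-aware rule is "pause once the three most recent prompts all went unanswered". The sketch below assumes that consecutive reading; should_pause_prompts and the boolean reply history are hypothetical names for illustration.

```python
MAX_NO_REPLIES = 3  # from the PRD: auto-pause after 3 no-replies


def should_pause_prompts(reply_history: list[bool]) -> bool:
    """reply_history holds one flag per prompt sent, oldest first (True = user replied).

    Assumption: "3 no-replies" means three consecutive unanswered prompts.
    """
    if len(reply_history) < MAX_NO_REPLIES:
        return False
    return not any(reply_history[-MAX_NO_REPLIES:])
```

Under this reading, a single reply resets the streak, so prompts resume counting from zero.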

4. Scope (MoSCoW)

Must: load Garmin JSON; rolling baseline; deviation detection (>1 SD); LLM weekly digest VN; daily morning 1-line ping.
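
The rolling baseline and >1 SD deviation check could look roughly like the sketch below. The 14-day default window and the function name are assumptions; the PRD only fixes the 1 SD threshold.

```python
from statistics import mean, stdev


def flag_deviations(values: list[float], window: int = 14, threshold_sd: float = 1.0) -> list[int]:
    """Return indices whose value differs from the trailing-window mean by more than threshold_sd SDs.

    The baseline for day i is the `window` days before it (day i itself is excluded).
    """
    flagged = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        sd = stdev(baseline)
        if sd > 0 and abs(values[i] - mean(baseline)) > threshold_sd * sd:
            flagged.append(i)
    return flagged
```

Excluding the current day from its own baseline keeps a single bad night from inflating the standard deviation it is measured against.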

Should: trend chart (PNG), real-time anomaly alert, activity↔sleep correlation, weekly recovery score.
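
The PRD does not say which correlation measure is used for activity↔sleep. As a hedged illustration, the sketch below uses a plain Pearson coefficient over paired daily values (for example, a daily activity metric against that night's sleep score); the pairing and the function name are assumptions.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation of two equal-length series; returns 0.0 if either series is constant."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return cov / sqrt(var_x * var_y)
```

With only 60+ days of data, a coefficient like this is a hint for what to test next rather than evidence of cause.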

Could: HRV deep analysis, manual coffee/alcohol log, goal tracking (weight, VO2max), partner / family parallel track.

Won’t: give medical advice (compliance) or replace a physician.

Allowed cloud LLM (decision 2026-04-29):

Provider must not train on API data (Anthropic ✓; OpenAI API tier ✓; DeepSeek/GLM ✗).
Send only aggregate metrics (avg/min/max sleep h, sleep score, RHR, HRV, stress, training load + anomaly flags).
Never send raw HR seconds, GPS, sleep-stage timestamps, or PII.
Default provider: Claude Haiku 4.5 (~$0.05/year for one weekly cron).
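
The aggregate-only rule can be enforced with an allow-list, so that raw fields never reach the prompt even if the loader starts returning them. This is a minimal sketch; the metric key names and build_llm_payload are hypothetical, chosen to mirror the metrics listed above.

```python
from statistics import mean

# Assumed key names for the aggregate metrics the PRD allows.
ALLOWED_METRICS = ("sleep_hours", "sleep_score", "rhr", "hrv", "stress", "training_load")


def build_llm_payload(days: list[dict], anomaly_flags: list[str]) -> dict:
    """Reduce per-day records to avg/min/max per allowed metric, plus anomaly flags.

    Any key outside ALLOWED_METRICS (raw HR samples, GPS, timestamps, PII) is dropped.
    """
    payload: dict = {}
    for metric in ALLOWED_METRICS:
        values = [d[metric] for d in days if d.get(metric) is not None]
        if values:
            payload[metric] = {
                "avg": round(mean(values), 1),
                "min": min(values),
                "max": max(values),
            }
    payload["anomaly_flags"] = list(anomaly_flags)
    return payload
```

Building the payload from an allow-list rather than stripping a deny-list means a newly added raw field is excluded by default.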

5. Architecture (TL;DR)

garmin-sync GH Actions → GitHub repo → VM git pull (every 30 min) → DataFrame → stats + correlation + weighting → planner → Haiku LLM (digest only) → Telegram bot.

Full diagram in Architecture.

6. Tech stack

Storage: Garmin JSON data lake (GitHub repo); manual log + nudge log JSONL on VM.
LLM: Claude Haiku 4.5 (default) or a local Llama 3.2 3B served by Ollama (set LLM_PROVIDER=ollama for maximum privacy).
Schedule: systemd user timers on GCP VM with linger enabled.
Delivery: Telegram bot @marc_healthbot (primary), iMessage AppleScript fallback (Mac only).
Calendar: Google Calendar API via OAuth (client + token in ~/.config/health-coach/).
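
The LLM_PROVIDER switch mentioned in the stack could be resolved as sketched below. Only the value "ollama" appears in this PRD; the default value "anthropic" and the function name are assumptions for illustration.

```python
import os

DEFAULT_PROVIDER = "anthropic"           # assumed name for the Claude Haiku default
KNOWN_PROVIDERS = {"anthropic", "ollama"}


def resolve_provider(env=None) -> str:
    """Read LLM_PROVIDER from the environment, falling back to the cloud default."""
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", DEFAULT_PROVIDER).strip().lower()
    if provider not in KNOWN_PROVIDERS:
        # Fail loudly instead of silently sending data to an unvetted provider.
        raise ValueError(f"unsupported LLM_PROVIDER: {provider!r}")
    return provider
```

Rejecting unknown values keeps the "provider must not train on API data" decision from being bypassed by a typo in a timer's environment.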

7. Milestones

M1 ✅ 2026-04-29 — walking skeleton: loader + stats + Haiku digest. Sample rated 4.2/5.

M2 ✅ 2026-04-29 — ADHD delivery: morning ping, contextual fire-due, anomaly check, weekly chart, log prompt. Goal-weighted ranking via 11 literature priors.
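
A hedged sketch of the contextual fire-due check: a nudge fires only when its slot time has just passed and the action is in today's plan. The slot times come from section 3; the action keys, the 30-minute grace window, and the already_sent set are assumptions.

```python
from datetime import datetime, time

# Slot times from the trigger list; the dictionary keys are hypothetical.
SLOTS = {
    "cap_caffeine": time(13, 30),
    "phone_out_bedroom": time(22, 25),
}


def due_nudges(now: datetime, plan: set[str], already_sent: set[str],
               grace_minutes: int = 30) -> list[str]:
    """Actions whose slot passed within the last grace_minutes, are planned, and are unsent."""
    due = []
    for action, slot in SLOTS.items():
        if action not in plan or action in already_sent:
            continue
        slot_dt = datetime.combine(now.date(), slot)
        minutes_late = (now - slot_dt).total_seconds() / 60
        if 0 <= minutes_late < grace_minutes:
            due.append(action)
    return due
```

The grace window lets a periodic timer catch a slot it did not hit exactly, while already_sent prevents a second ping within the same window.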

M3 ✅ 2026-04-30 — Tier 1 personal correlation, Google Calendar pre-meeting nudge, 4-week trend chart, retro analysis (descriptive stats over N days). Migrated from work Mac to GCP VM with systemd timers.

Validation — 4 weeks of live usage to measure effectiveness against weekly avg sleep score ≥80.

8. Cost & quota

Item	Cost	Note
Compute (GCP VM until 2026-06-05)	$0 from $300 credit	After: OCI A1 free or Hetzner $4/mo
Claude Haiku API	~$0.05/year	1 weekly digest + occasional nudge analysis
Telegram bot	$0	Unlimited free
Google Calendar API	$0	Within free quota
Garmin Connect API	$0	Existing rate-limit OK

9. Risks & open questions

Risks:

R1: Generic insights (“sleep more”). Mitigation: prompt rules (no insight if |Δ|<1 SD), Vietnamese weekday format, exact data references.
R2: Privacy of health data. Mitigation: aggregate-only to cloud LLM; local Ollama fallback available.
R3: Garmin gaps (forgetting to wear the watch). Mitigation: the loader skips missing files; stats require ≥7 days of data for the week and ≥5 days for the baseline.
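
The R3 mitigation can be pictured as a loader that tolerates gaps plus a guard before any stats run. This is a sketch under assumptions: the one-file-per-day layout (YYYY-MM-DD.json under a root directory) and both function names are hypothetical, not the actual garmin-sync layout.

```python
import json
from datetime import date, timedelta
from pathlib import Path

MIN_WEEK_DAYS = 7      # from R3: days required for the week
MIN_BASELINE_DAYS = 5  # from R3: days required for the baseline


def load_days(root: Path, end: date, n_days: int) -> dict[date, dict]:
    """Load up to n_days of daily JSON files ending at `end`, skipping missing days."""
    loaded = {}
    for offset in range(n_days):
        day = end - timedelta(days=offset)
        path = root / f"{day.isoformat()}.json"  # assumed file naming
        if not path.exists():
            continue  # watch not worn or sync gap: skip instead of failing
        loaded[day] = json.loads(path.read_text(encoding="utf-8"))
    return loaded


def enough_data(week_days: int, baseline_days: int) -> bool:
    """Guard that must pass before weekly stats are computed."""
    return week_days >= MIN_WEEK_DAYS and baseline_days >= MIN_BASELINE_DAYS
```

When enough_data returns False, the digest can say that the week is incomplete rather than report statistics from a partial sample.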

Open questions (resolved):

✓ HRV available in garmin-sync? Yes (hrv.hrvSummary.lastNightAvg + weeklyAvg).
✓ Quiet hours? Not set initially; to be adjusted after 1–2 weeks of live data.
✓ Manual log frequency? Option C (anomaly + Saturday batch + streak-aware).
✓ Sticky vs fresh nudges? Fresh daily; anchor sticky weekly target ≥80.

10. Definition of Done

M1 done — 1 weekly digest sample relevance ≥3.5/5 ✓
M2 done — 1 week of live nudges, weekly avg sleep score ≥77, 0 nudge fatigue ✓ (code shipped; live validation ongoing)
M3 done — 4 weeks live usage, weekly avg sleep score ≥80, follow ≥3 actions/week (verified via data proxy + manual log)