Lifestream auto-event sources — three streams (GitHub webhooks + Wakatime daily + CF Web Analytics daily)
Decision
The oriz-me JSONL canonical store
is fed by three auto-tracked event sources, and ONLY these three.
No manual entry, no IDE-heartbeat raw stream, no per-pageview hit
capture. Each source streams day-grain (or event-grain for git) JSONL
lines into chirag127/oriz-me-data/events-<YYYY>.jsonl.
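As a minimal sketch of the file layout described above: the helper below derives the yearly file name from an event's `ts` and serializes one event as a single line. Both function names are hypothetical, and choosing the year from the UTC timestamp (rather than from a local-time date) is an assumption, not something the decision specifies.

```python
import json
from datetime import datetime, timezone


def events_file_for(ts: str) -> str:
    """Return the yearly JSONL file name for an ISO-8601 timestamp.

    Assumption: the year is read from the timestamp in UTC.
    """
    moment = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return f"events-{moment.astimezone(timezone.utc).year}.jsonl"


def to_jsonl_line(event: dict) -> str:
    """Serialize one event as a single line with no embedded newline."""
    return json.dumps(event, ensure_ascii=False)
```

An event stamped `2026-06-20T11:42:13Z` would land in `events-2026.jsonl` under this rule.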
| # | Source | Trigger | Grain | JSONL kind |
|---|---|---|---|---|
| 1 | GitHub webhooks | Real-time per-event | per-event | git |
| 2 | Wakatime daily summary | Daily cron 01:00 IST | per-day | coding |
| 3 | CF Web Analytics daily summary | Daily cron 01:00 IST | per-day per-site | visitors |
This locks in the auto-only-tracking posture for the lifestream itself: every event in oriz-me arrives without a human pressing "log this".
Source 1: GitHub webhooks
GitHub repo webhook → Hookdeck
ingress (retries + replay + dead-letter) → CF Worker route at
api.oriz.in/lifestream/git → JSONL append.
Subscribed events:
- push (any branch, though the family is main-only per one-branch-only)
- pull_request opened
- release published
- workflow_run completed (only success / failure terminal states)
JSONL line shape:
{"ts": "2026-06-20T11:42:13Z", "kind": "git", "repo": "chirag127/blog-site", "sha": "abc1234", "message": "feat: ship", "author": "chirag127", "url": "https://github.com/chirag127/blog-site/commit/abc1234"}
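One way the mapping from a GitHub `push` delivery to this line shape might look is sketched below. The payload fields used (`repository.full_name`, `head_commit.id`, `head_commit.message`, `head_commit.timestamp`, `head_commit.author.username`, `head_commit.url`) are standard parts of GitHub's push payload. The function name, the 7-character SHA truncation, keeping only the first message line, and taking `ts` from the commit timestamp are assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def push_to_event(payload: dict) -> Optional[dict]:
    """Map a GitHub push webhook payload to one `git` lifestream event.

    Returns None when there is no head commit (for example a branch deletion).
    """
    commit = payload.get("head_commit")
    if commit is None:
        return None
    # GitHub sends the commit time with a UTC offset; normalize it to "Z".
    committed = datetime.fromisoformat(commit["timestamp"].replace("Z", "+00:00"))
    first_line = (commit["message"].splitlines() or [""])[0]
    return {
        "ts": committed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "kind": "git",
        "repo": payload["repository"]["full_name"],
        "sha": commit["id"][:7],  # assumption: short SHA, as in the sample line
        "message": first_line,
        "author": commit.get("author", {}).get("username"),
        "url": commit["url"],
    }
```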
Idempotent on (repo, sha) for push, (repo, pr_number) for PR,
(repo, tag) for release, (repo, run_id) for workflow_run. Hookdeck
replay-safe.
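The per-event keys above could be derived roughly as follows. This is a sketch, not the Worker's actual code: the function names and the in-memory `seen` set are hypothetical, and prefixing each key with the event name is an added assumption so that keys from different event types cannot collide. The payload fields (`after`, `number`, `release.tag_name`, `workflow_run.id`) come from GitHub's webhook payloads, and the event name is taken from the `X-GitHub-Event` header.

```python
def idempotency_key(event_type: str, payload: dict) -> tuple:
    """Build the dedupe key for one GitHub webhook delivery."""
    repo = payload["repository"]["full_name"]
    if event_type == "push":
        return (event_type, repo, payload["after"])  # head SHA after the push
    if event_type == "pull_request":
        return (event_type, repo, payload["number"])
    if event_type == "release":
        return (event_type, repo, payload["release"]["tag_name"])
    if event_type == "workflow_run":
        return (event_type, repo, payload["workflow_run"]["id"])
    raise ValueError(f"unsubscribed event type: {event_type}")


def accept_once(seen: set, event_type: str, payload: dict) -> bool:
    """Return True the first time a key is seen, False on a replay."""
    key = idempotency_key(event_type, payload)
    if key in seen:
        return False
    seen.add(key)
    return True
```

With this shape, a Hookdeck replay of an already-processed delivery returns `False` and appends nothing.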
Source 2: Wakatime daily-summary cron
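GitHub Actions schedule targeting 01:00 IST (19:30 UTC the previous day) fetches
https://wakatime.com/api/v1/users/current/summaries?start=YYYY-MM-DD&end=YYYY-MM-DD,
maps to one JSONL line per day, appends.
GitHub Actions evaluates cron expressions in UTC, so 01:00 IST is written as 30 19 * * *; the literal 0 1 * * * would fire at 01:00 UTC (06:30 IST).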
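Because GitHub Actions interprets `schedule` cron expressions in UTC, the IST target time has to be converted before it goes into the workflow file. The helper below is a small illustrative check of that conversion; the function name is hypothetical and the fixed +05:30 offset is used because IST has no daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# IST is a fixed UTC+05:30 offset with no daylight saving time.
IST = timezone(timedelta(hours=5, minutes=30))


def ist_to_utc_cron(hour: int, minute: int = 0) -> str:
    """Return the daily UTC cron expression for a given IST wall-clock time."""
    local = datetime(2026, 6, 21, hour, minute, tzinfo=IST)  # any date works
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour} * * *"


print(ist_to_utc_cron(1))  # 30 19 * * *
```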
Stored only at day-grain, not minute-grain — keeps PII low and sidesteps Wakatime's rolling 2-week free history (per the wakatime service file) by exporting every day before it ages out.
JSONL line shape:
{"ts": "2026-06-20T18:30:00Z", "kind": "coding", "date": "2026-06-20", "total_seconds": 14823, "languages": [{"name": "TypeScript", "seconds": 9120}, {"name": "Markdown", "seconds": 3201}], "projects": [{"name": "oriz-blog-site", "seconds": 6044}, {"name": "oriz", "seconds": 8779}]}
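A sketch of how one day of the summaries response might be reduced to this shape follows. It assumes each item in the response's `data` array carries `range.date`, `grand_total.total_seconds`, and `languages` / `projects` entries with `name` and `total_seconds`, which matches the Wakatime summaries response as commonly documented; verify against the live API before relying on it. The function name and the truncation of fractional seconds to integers are assumptions.

```python
def summary_to_event(day: dict, ts: str) -> dict:
    """Reduce one Wakatime daily summary item to a day-grain `coding` event."""

    def slim(items: list) -> list:
        # Keep only name + whole seconds; drop percentages, text, and so on.
        return [{"name": i["name"], "seconds": int(i["total_seconds"])} for i in items]

    return {
        "ts": ts,  # time the cron wrote the line, supplied by the caller
        "kind": "coding",
        "date": day["range"]["date"],
        "total_seconds": int(day["grand_total"]["total_seconds"]),
        "languages": slim(day.get("languages", [])),
        "projects": slim(day.get("projects", [])),
    }
```

Only aggregate totals survive the mapping, which is what keeps the stored line at day-grain.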
Wakatime API token in Doppler → GH Secrets per secrets-management-doppler.
Source 3: CF Web Analytics daily-summary cron
GitHub Actions schedule at 01:00 IST (30 19 * * * in the UTC that GitHub Actions evaluates, not 0 1 * * *) fetches Cloudflare's GraphQL
Analytics API (https://api.cloudflare.com/client/v4/graphql) for
each site zone, maps to one JSONL line per (date, site), appends.
JSONL line shape:
{"ts": "2026-06-20T18:30:00Z", "kind": "visitors", "date": "2026-06-20", "site": "blog.oriz.in", "pageviews": 1142, "unique": 318, "top_paths": [{"path": "/post/foo", "pv": 412}, {"path": "/", "pv": 287}]}
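The sketch below shows how per-path counts for one site and day could be folded into this line shape. It deliberately starts from already-fetched numbers: the input format (a `path -> pageviews` mapping plus a separately reported unique-visitor count), the function name, and the `top_n` cutoff are all hypothetical, and the GraphQL query itself is out of scope here.

```python
def visitors_event(date: str, site: str, path_counts: dict, unique: int,
                   ts: str, top_n: int = 2) -> dict:
    """Build one `visitors` event for a (date, site) pair.

    `unique` is passed in as reported upstream, because unique visitors
    cannot be derived by summing per-path rows.
    """
    # Highest pageviews first; ties broken by path for a stable order.
    ranked = sorted(path_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {
        "ts": ts,
        "kind": "visitors",
        "date": date,
        "site": site,
        "pageviews": sum(path_counts.values()),
        "unique": unique,
        "top_paths": [{"path": p, "pv": n} for p, n in ranked[:top_n]],
    }
```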
Eleven sites × one line per day = 11 JSONL events / day. Negligible file growth at the JSONL canonical "~10 MB/year" envelope.
CF API token (read-only Analytics scope) in Doppler.
Why these three only
The user direction was: "ALL THREE — GitHub webhooks + Wakatime daily-summary + CF Web Analytics summary all stream to JSONL" — and nothing else. Each source covers a distinct surface:
- Git = what code changed and when (the family's primary durable artefact)
- Coding time = how long was spent coding, in what language, on what project (the activity behind the commits)
- Visitors = who read the output (the receiving end of the work)
Together they answer "what was I doing yesterday?" without any tool that requires a human to press "start timer" or "log this". Manual non-coding time tracking lives separately in Toggl Track; that source is explicitly NOT wired into the auto-only lifestream JSONL because it requires a human action.
Implications
- Hookdeck connection for GitHub webhooks rides on the existing free 50K events/mo tier — current family commit volume is ~thousands/mo, well inside envelope.
- Two daily GH Actions cron jobs (Wakatime + CF Analytics) at 01:00 IST. Idempotent on (date, source): re-runs replace, not duplicate.
- Health-check coverage: both daily crons ping healthchecks.io on success per health-check-cron-plus-uptime; miss → alert.
- No card required across all three sources (GitHub webhooks free, Wakatime free tier, Cloudflare Web Analytics free).
- JSONL idempotency keys documented per source above so replay is safe.
- Schema lives in chirag127/oriz-me-data/schema.json; the JSONL canonical decision covers the validation flow.
- Forward refs: ingest workers under api.oriz.in/lifestream/* belong to the umbrella Hono Worker by default; if quota mitigation per the CF Worker quota mitigation playbook warrants splitting api.oriz.in/lifestream/* into its own Worker, do it then.
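The "re-runs replace, not duplicate" rule for the two daily crons can be illustrated with a small upsert over the lines of a yearly file. This is a sketch under assumptions: the function names are hypothetical, `kind` stands in for the source, and `site` is included in the key because visitor lines are per (date, site). It also implies rewriting the file on a re-run rather than doing a pure append.

```python
import json


def daily_key(event: dict) -> tuple:
    """Key for day-grain events; `site` is None for `coding` lines."""
    return (event["date"], event["kind"], event.get("site"))


def upsert_daily(lines: list, event: dict) -> list:
    """Return the file's lines with any earlier line for the same key replaced."""
    target = daily_key(event)
    kept = []
    for line in lines:
        existing = json.loads(line)
        # Event-grain `git` lines carry no `date` and are always kept.
        if "date" in existing and daily_key(existing) == target:
            continue
        kept.append(line)
    kept.append(json.dumps(event, ensure_ascii=False))
    return kept
```

Running the Wakatime job twice for the same date leaves exactly one `coding` line for that date, carrying the values from the later run.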
Cross-refs
- Lifestream JSONL canonical decision
- Lifestream federation (AT Protocol + ActivityPub mirrors)
- Cron split — CF Cron vs GH Actions
- Hookdeck (queue ingress)
- Wakatime service
- Cloudflare Web Analytics service
- healthchecks.io — heartbeat coverage
- Auto-only-tracking rule (forward ref — being added in parallel)
- Auto-tracking everywhere decision (forward ref — being added in parallel)
- Toggl Track: the manual stream is intentionally NOT a lifestream source