Analytics — tool evaluation
Hugo needs both website analytics for the marketing/landing surface and granular product analytics inside the apps (fund setup, side-letter, MFN, Word add-in). This page compares the two self-hostable candidates we are currently considering: Umami and PostHog.
Requirements
- Self-hosted only: no third-party SaaS dashboards, no data egress out of the EU.
- Privacy-first defaults (cookie-free or first-party-cookie, GDPR-friendly).
- Marketing-site analytics for hugo.nordiclawfirm.com, mvp.hugo…, full.hugo…, review.hugo… and dev.hugo….
- In-app event tracking with custom properties and a stable user identity (logged-in users have a user id).
- Operable by a small team (single-DB preferred, ClickHouse only if we get real value from it).
Candidates at a glance
| | Umami | PostHog (self-hosted) |
|---|---|---|
| License | MIT | MIT / PostHog license (source-available, free to self-host unless resold as a service) |
| Stack | Next.js + PostgreSQL (single DB) | Django + Node + PostgreSQL + ClickHouse + Kafka + Redis + MinIO/S3 |
| Ops footprint | One container, one Postgres | Real cluster (Helm chart, several stateful services) |
| Pageviews / sessions | Yes | Yes |
| Custom events + props | Yes (event name ≤ 50 chars) | Yes (no practical limit) |
| Stable user id | umami.identify(userId, traits) | posthog.identify(distinctId, props) |
| Funnels / retention / cohorts | Yes (built in) | Yes |
| Session replay (rrweb) | No (only event timelines) | Yes |
| Feature flags / experiments | No | Yes |
| Heatmaps | No | Yes (via toolbar) |
| Error tracking | No | Yes |
| Server-side ingest | POST /api/send, no key | POST /capture, project API key |
| Per-event prop cap | No documented hard cap | No practical cap |
| Object storage | n/a | Required for recordings (R2 fits well) |
Umami — what it does well
- One Postgres + one Node app. Trivial to run alongside other Cloudflare-adjacent services on a small VPS or on Fly.io / Railway / a single VM.
- MIT license, including the tracker — no AGPL contagion concerns for anything we wrap around it.
- Has the product-analytics primitives we actually want at this stage: custom events with arbitrary properties, funnels, retention, journeys, segments, UTM attribution, identified users.
- Lightweight tracker (~2 KB), no cookies by default.
What Umami does NOT do
- No rrweb-style session replay (the "Replays" view is an event timeline for one session, not a DOM recording).
- No feature flags, no A/B test framework, no heatmaps, no error tracking.
PostHog — what it does well
- Genuine product-analytics suite: events, funnels, retention, paths, cohorts, dashboards.
- Session replay via rrweb. DOM-level playback, console + network capture, masking controls (ph-no-capture CSS class, default input masking, custom maskers).
- Feature flags, multivariate experiments, surveys, error tracking: replaces 3–4 separate tools.
- Strong masking story is important for Hugo: fund names, investor names and document contents are sensitive. We would default-mask all text and selectively unmask chrome.
What PostHog costs us
- Self-host is a real cluster: ClickHouse + Kafka + Postgres + Redis + MinIO/S3 (or R2). Helm chart exists but the operational tax is real.
- Recording volume drives storage cost — sampling, minimum session duration, and trigger-based recording must be turned on day one.
- Heavier client SDK and more in-page network activity than Umami.
Recommendation
Other tools we considered and skipped
- Plausible CE — AGPL server, Elixir + Postgres + ClickHouse, funnels gated behind cloud-only paywall in CE; no first-class user identity.
- Matomo — featureful but dated, PHP/MySQL, many "modern" features are paid plugins.
- Pirsch — Go + Postgres, AGPL; smaller community, no replay.
- Rybbit — too young to bet on (2025).
- GoatCounter / Shynet — too minimal for in-app product analytics.
- DIY on Cloudflare Workers Analytics Engine — viable for a fixed set of well-defined events, but no dashboard; not a substitute for a real analytics product.
Deployment plan — Umami
Where it runs
On the existing shared VPS (the box that already hosts the Huginn bridge, Playwright MCP, and the cloudflared tunnel that fronts ssh.axhl.io, heimdall.axhl.io, and thor.axhl.io). The VPS is firewalled to deny all inbound except SSH on :22; every other service binds to 127.0.0.1 and is published to the internet only via the named Cloudflare Tunnel. Umami fits this pattern exactly.
Containers
Docker Compose under /srv/umami/, owned root:docker mode 2775 so both axel and morten can drive it once morten is added to the docker group. Two services:
- umami from ghcr.io/umami-software/umami:postgresql-<pinned-tag>, bound to 127.0.0.1:3000. restart: unless-stopped. Healthcheck on /api/heartbeat.
- postgres:16 with a named volume umami-pg-data. No host port. Only reachable from the umami service on the compose-internal network.
Secrets (DATABASE_URL, APP_SECRET, HASH_SALT) live in /srv/umami/.env mode 640 owned root:docker. TRACKER_SCRIPT_NAME=h.js renames the tracker so generic ad-block lists don't match.
Ingress
One new entry in /etc/cloudflared/config.yml:
- hostname: stats.hugo.nordiclawfirm.com
  service: http://localhost:3000
Then sudo systemctl reload cloudflared and add the DNS record + tunnel route in the Cloudflare dashboard. No port opens on the VPS, no TLS termination on the VPS, no Caddy/nginx.
Access policy
One Cloudflare Access app on stats.hugo.nordiclawfirm.com with path-scoped rules:
- /script.js, /h.js, /api/send, /api/collect → bypass (public).
- Everything else (dashboards, /login, settings) → same email policy as dev.hugo.nordiclawfirm.com, plus the existing service-token bypass for automation.
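The path split above can be written down as an expectation for smoke tests. This is only a sketch: the Access rules themselves are configured in Cloudflare, `expectedAccess` is a hypothetical helper name, and exact-path matching is an assumption about how the bypass rule should behave.

```typescript
// Public data-plane paths, copied from the bypass rule above.
const PUBLIC_PATHS: string[] = ["/script.js", "/h.js", "/api/send", "/api/collect"];

type AccessDecision = "bypass" | "require-login";

// Hypothetical helper: states what a smoke test should expect from Access.
// Assumption: only exact path matches are public; everything else needs login.
function expectedAccess(pathname: string): AccessDecision {
  return PUBLIC_PATHS.indexOf(pathname) !== -1 ? "bypass" : "require-login";
}
```

A deploy check could request each path without credentials and compare the outcome against `expectedAccess`.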
Backups
Nightly pg_dump via a tiny cron job inside the compose stack, writing gzipped dumps to /srv/umami/backups/ with 14-day rotation. Off-host copies can come later (R2 sync) if we decide retention matters.
Data plane — Worker proxy (recommended)
End-user browsers do not talk to stats.hugo.nordiclawfirm.com directly. Each Hugo deployment's existing Cloudflare Worker gains a small /_a/* route group that fronts Umami's data plane. The browser sees same-origin requests; only the Worker talks to the VPS.
Why
- First-party tracker URL. Hugo's users sit inside law-firm corporate networks with aggressive URL filters. Requests to stats.* hostnames and well-known analytics script names get blocked silently. Same-origin requests to mvp.hugo.nordiclawfirm.com/_a/h.js are indistinguishable from any other Hugo asset and survive those filters.
- No CORS preflight per pageview: same-origin POSTs skip the OPTIONS round-trip.
- Edge caching of the tracker via caches.default with a 24h TTL; the VPS only sees a cache-fill request roughly once per CF colo per day.
- Fire-and-forget ingest. The Worker accepts the event POST, returns 202 to the browser immediately, and uses ctx.waitUntil(fetch(...)) to forward to Umami. Browser latency is independent of VPS/tunnel state.
- Server-side enrichment. The Worker attaches CF-Connecting-IP, CF-IPCountry, and (post-auth) the logged-in user id from the session cookie before forwarding. Cookies are stripped from the upstream request so Umami never sees Hugo's session.
- Kill switch. A Worker env flag flips analytics into pass-through-empty without redeploying Hugo or touching the script tag.
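The enrichment and cookie-stripping point is easiest to get right with an allow-list rather than a deny-list. The sketch below assumes headers arrive as a plain object; `buildUpstreamHeaders` and the inclusion of content-type are illustrative assumptions, not part of the plan above. How the logged-in user id is attached is not specified here, so the sketch leaves it out.

```typescript
// Plain header map keeps the sketch independent of any runtime's Headers class.
type HeaderMap = Record<string, string>;

// Headers allowed to reach Umami. content-type is an assumption (JSON POST body).
const FORWARDED_HEADERS: string[] = [
  "user-agent",
  "cf-connecting-ip",
  "cf-ipcountry",
  "content-type",
];

// Hypothetical helper: copy only allow-listed headers, matched case-insensitively.
// Cookie and Authorization are never copied, so Umami never sees Hugo's session.
function buildUpstreamHeaders(incoming: HeaderMap): HeaderMap {
  const normalized: HeaderMap = {};
  for (const key of Object.keys(incoming)) {
    normalized[key.toLowerCase()] = incoming[key];
  }
  const upstream: HeaderMap = {};
  for (const header of FORWARDED_HEADERS) {
    const value = normalized[header];
    if (value !== undefined) {
      upstream[header] = value;
    }
  }
  return upstream;
}
```

With an allow-list, a new cookie or auth header added to Hugo later cannot leak upstream by accident.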
Routes added to each Hugo Worker
- GET /_a/h.js → fetch script.js from Umami (cached), rewrite /api/send to /_a/api/send, serve with Cache-Control: public, max-age=86400, immutable.
- POST /_a/api/send → strip cookies, attach User-Agent + CF-Connecting-IP, ctx.waitUntil the upstream POST, respond 202.
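The core of the two routes can be sketched without any Worker runtime types. Everything named here is an assumption for illustration: `WaitUntilContext` stands in for the Worker execution context, `Forwarder` for a function wrapping the upstream `fetch()`, and `ANALYTICS_DISABLED` for whatever the kill-switch flag ends up being called.

```typescript
// Minimal stand-in for the Worker execution context.
interface WaitUntilContext {
  waitUntil(promise: Promise<unknown>): void;
}

interface ProxyEnv {
  ANALYTICS_DISABLED?: string; // hypothetical name for the kill-switch flag
}

// Hypothetical forwarder; in a real Worker this would wrap fetch() to Umami.
type Forwarder = (body: string) => Promise<unknown>;

// GET /_a/h.js: point the tracker at the same-origin ingest route.
// Assumes the upstream script contains the literal "/api/send".
function rewriteTrackerScript(source: string): string {
  return source.split("/api/send").join("/_a/api/send");
}

// POST /_a/api/send: answer 202 right away and forward in the background.
function handleSend(
  env: ProxyEnv,
  ctx: WaitUntilContext,
  forward: Forwarder,
  body: string,
): { status: number } {
  if (env.ANALYTICS_DISABLED === "1") {
    return { status: 202 }; // kill switch: accept the event and discard it
  }
  // Swallow upstream errors so a VPS or tunnel outage never reaches the browser.
  ctx.waitUntil(forward(body).catch(() => undefined));
  return { status: 202 };
}
```

Because the response does not wait on `forward`, the browser sees the same status whether Umami is up or not; the cost is that a failed forward is silently lost.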
HTML in the Hugo shell
<script defer src="/_a/h.js"
data-website-id="${env.UMAMI_WEBSITE_ID}"></script>
Relative path. Each Hugo deployment carries its own UMAMI_WEBSITE_ID in wrangler.toml; Umami holds one website entry per deployment.
Latency budget
Tracker script: edge-cached, ~0 ms after first warm-up. Event POST: same-origin (no preflight) + waitUntil means the browser sees ~5–15 ms regardless of how the tunnel is doing. Worker fetch to the VPS happens off the browser's critical path.
Failure modes
- VPS down → Worker can't fill cache; tracker keeps serving from edge cache for the TTL window; in-flight events held by waitUntil are dropped if the upstream call fails. The window.umami?. guards turn app-code calls into no-ops. Hugo unaffected.
- Worker bug → flip the env flag to disable the proxy; the script tag still loads (the Worker can return an empty 200) and app code no-ops via the optional chain.
- Ad-blocker still blocks the tracker → server-side trackServer() from Workers covers the business-critical events that must not be missed (fund-creation, ISL upload, MFN evaluation).
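A server-side fallback along these lines might look like the sketch below. The body shape follows Umami's documented /api/send format as I understand it and should be checked against the pinned version; `ServerTrackDeps`, `buildServerEvent`, the fixed `url: "/"`, and the explicit dependency object (instead of the `c` context) are all assumptions made to keep the example self-contained.

```typescript
type EventProps = Record<string, string | number | boolean>;

// Assumed request body for Umami's POST /api/send.
interface UmamiSendBody {
  type: "event";
  payload: {
    website: string;
    hostname: string;
    url: string;
    name: string;
    data?: EventProps;
  };
}

const MAX_EVENT_NAME_LENGTH = 50; // limit quoted in the comparison table

function buildServerEvent(
  websiteId: string,
  hostname: string,
  name: string,
  props?: EventProps,
): UmamiSendBody {
  if (name.length === 0 || name.length > MAX_EVENT_NAME_LENGTH) {
    throw new Error("event name must be 1-" + MAX_EVENT_NAME_LENGTH + " characters");
  }
  const body: UmamiSendBody = {
    type: "event",
    // url "/" is a placeholder: server-side events have no page of their own.
    payload: { website: websiteId, hostname: hostname, url: "/", name: name },
  };
  if (props !== undefined) {
    body.payload.data = props;
  }
  return body;
}

// Hypothetical dependencies; post() would wrap fetch() in a real Worker.
interface ServerTrackDeps {
  websiteId: string;
  hostname: string;
  waitUntil(promise: Promise<unknown>): void;
  post(path: string, json: string): Promise<unknown>;
}

function trackServer(deps: ServerTrackDeps, name: string, props?: EventProps): void {
  const json = JSON.stringify(buildServerEvent(deps.websiteId, deps.hostname, name, props));
  // Background send; failures are swallowed so request handling is unaffected.
  deps.waitUntil(deps.post("/_a/api/send", json).catch(() => undefined));
}
```

Validating the name before sending keeps an over-long event name from being rejected or truncated unnoticed on the Umami side.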
Surfaces to track
- hugo.nordiclawfirm.com (landing)
- mvp.hugo.nordiclawfirm.com (production app)
- full.hugo.nordiclawfirm.com (pre-slim feature surface)
- review.hugo.nordiclawfirm.com (lawyer review)
- dev.hugo.nordiclawfirm.com (this site)
Code that lands in Hugo
- One UMAMI_WEBSITE_ID env var per wrangler.toml.
- Two route handlers on each Worker for /_a/h.js and /_a/api/send (~40 LOC of shared TS).
- One <script defer src="/_a/h.js" ...> tag in each shell layout.
- src/lib/analytics.ts: browser trackEvent(name, props) + identify(userId, traits) wrappers around window.umami.
- src/lib/analytics-server.ts: Worker-side trackServer(c, name, props) using ctx.waitUntil(fetch(...)) against /_a/api/send on the same hostname.
- Initial set of ~12 instrumented events (fund.created, isl.uploaded, mfn.benefit.evaluated, …); grow from there.
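The browser wrappers in src/lib/analytics.ts could be as small as the sketch below. The `UmamiTracker` interface is an assumption based on the `track` and `identify` calls referenced on this page, and `getTracker` is a hypothetical helper; the tracker is read from `globalThis` (which is `window` in a browser) so the example also runs outside one.

```typescript
// Assumed surface of the object the tracker script installs as window.umami.
interface UmamiTracker {
  track(name: string, data?: Record<string, unknown>): void;
  identify(userId: string, traits?: Record<string, unknown>): void;
}

// Look the tracker up on every call: it may load late or be blocked entirely.
function getTracker(): UmamiTracker | undefined {
  return (globalThis as unknown as { umami?: UmamiTracker }).umami;
}

// No-op when the tracker is missing, so app code never has to check.
function trackEvent(name: string, props?: Record<string, unknown>): void {
  getTracker()?.track(name, props);
}

function identify(userId: string, traits?: Record<string, unknown>): void {
  getTracker()?.identify(userId, traits);
}
```

Call sites then stay one line each, for example `trackEvent("fund.created", { fundId })`, and keep working when the script is blocked or the kill switch serves an empty file.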
Open questions
- Retention policy and Postgres TTL/partitioning inside Umami.
- Whether to also forward selected events from the Worker into Cloudflare Analytics Engine as a redundant store (cheap; gives us SQL access independent of the VPS).
- How to share the seeded Umami admin password between axel and morten (1Password vault vs ad-hoc).
- Whether the landing worker should track separately or share UMAMI_WEBSITE_ID with mvp.