diff --git a/README.md b/README.md index ce5a28c..d5c9111 100644 --- a/README.md +++ b/README.md @@ -1,26 +1,291 @@ # Watchdog -Watchdog is a privacy-preserving web analytics platform similar to Plausible, -where data visualisation is deferred to a Prometheus-compatible dashboard such -as Grafana. It is designed to be as simple and lightweight as possible, without -any additional web components to worry about. You, in turn, get to use your -existing monitoring stack to also collect information about your web -applications tracked by Watchdog. +Watchdog is a lightweight, privacy-first analytics system with fully declarative +configuration that aggregates web traffic data and exports it as Prometheus +metrics. Unlike traditional analytics platforms, Watchdog stores no raw events, +uses no persistent user identifiers, and enforces bounded cardinality by design. -## Design +## Features -- No raw event storage -- No persistent identifiers -- Bounded cardinality by design -- Aggregate at ingestion -- Prometheus-native export +Watchdog is **privacy-first** in design. There are no cookies, no `localStorage` +or browser fingerprinting. Daily salt rotation prevents cross-day visitor +correlation, and there is no raw event storage. We only aggregate metrics. Other +noteworthy features: + +- Multi-site analytics with optional domain tracking +- Bounded cardinality prevents metric explosion +- Rate limiting and DoS protection +- Graceful shutdown with state persistence +- IPv6 support with proper proxy header handling + +See the [privacy section](#privacy) for more details on what "guarantees" +Watchdog has to offer. ## Quick Start -```bash -# Build the prıject -$ go build -o watchdog ./cmd/watchdog +### NixOS -# Start the Watchdog daemon with your own config -$ ./watchdog --config config.yaml +The recommended way of using Watchdog is consuming the NixOS module. It handles +everything from dependencies to environment setup and Systemd configuration. 
+Simply import the NixOS module provided by this flake, and enable +`services.watchdog`: + +```nix +{inputs, ...}: { + imports = [inputs.watchdog.nixosModules.default]; + + services.watchdog = { + enable = true; + settings = { + site.domains = ["example.com"]; + server.listen_addr = "127.0.0.1:8080"; + }; + }; +} +``` + +The `settings` option is freeform, meaning it'll be serialized to the YAML +configuration automatically. + +### Systemd + +On non-NixOS distributions, you may build Watchdog with `go build` and copy it +somewhere that's in your `PATH`. Usually this is `/usr/local/bin` for system +installations: + +```bash +# Build +$ go build -o /usr/local/bin/watchdog . + +# Install service (unit files should not be executable, hence mode 644) +$ sudo install -Dm644 contrib/systemd/watchdog.service /etc/systemd/system/ +$ sudo systemctl daemon-reload +$ sudo systemctl enable --now watchdog +``` + +### Binary + +You may also run the binary from any path by simply building and running it, but +it is generally not very advisable. You may consider something like `screen` to +background the Watchdog process. Though, Systemd is the only supported +installation mechanism. + +```bash +# Build +$ go build -o watchdog . + +# Run +$ ./watchdog -config config.yaml +``` + +## Configuration + +Create `config.yaml`: + +```yaml +site: + # Single-site analytics + domains: + - "example.com" + + # Or multi-site analytics + # domains: + # - "example.com" + # - "blog.example.com" + # - "shop.example.com" + + salt_rotation: "daily" + + collect: + pageviews: true + device: true + referrer: "domain" + domain: false # Set to true for multi-site analytics + +limits: + max_paths: 10000 + max_sources: 500 + max_custom_events: 100 + max_events_per_minute: 10000 + +server: + listen_addr: "127.0.0.1:8080" +``` + +See [config.example.yaml](config.example.yaml) for all options. + +## Usage + +### JavaScript Beacon + +Similar to Plausible, Watchdog uses a JavaScript beacon to track events.
In the +most basic case, you must add it to your site in a ` +``` + +The script beacon also supports a _variety_ of configuration options via data +attributes, which you might adjust to your own needs. Some of them are +described below: + +```html + + + + + + + + + + + + + + + + + + + + + + + +``` + +You can also track custom events as follows: + +```javascript +// Simple event +window.watchdog.track("signup"); + +// Event with custom referrer +window.watchdog.track("purchase", { referrer: "email-campaign" }); + +// Manual pageview (when data-manual is set) +window.watchdog.trackPageview(); + +// Force pageview (bypass duplicate detection) +window.watchdog.trackPageview({ force: true }); +``` + +### Metrics + +[observability documentation]: docs/observability.md + +Unlike most common solutions, Watchdog does not do any data visualisation. It +collects and aggregates metrics in a Prometheus-compatible manner at the +`/metrics` endpoint. To scrape and store the time-series data, you will need +**Prometheus**. Grafana is a common solution for _visualizing_ said data. + +See the [observability documentation] for the full setup guide. It is, however, +recommended to be somewhat knowledgeable in those areas before attempting to +deploy. There exists a Grafana dashboard with support for multi-host and +multi-site deployments provided in the [contrib directory](contrib/grafana/).
+ +While not final, some of the metrics collected are as follows: + +**Traffic metrics:** + +- `web_pageviews_total{path,device,referrer,domain}` - Total pageviews + - `domain` label only present if `site.collect.domain: true` +- `web_custom_events_total{event}` - Custom event counts +- `web_daily_unique_visitors` - Estimated unique visitors (HyperLogLog) + +**Health metrics:** + +- `web_path_overflow_total` - Paths rejected due to cardinality limit +- `web_referrer_overflow_total` - Referrers rejected due to limit +- `web_event_overflow_total` - Custom events rejected due to limit + +## Privacy + +Privacy is a fundamental design constraint for Watchdog, and not a feature. All +personally identifiable information is discarded at ingestion. We only keep +aggregate counts. Some features worth noting: + +**No Persistent Identifiers:** + +- No cookies, `localStorage`, or fingerprinting +- IP + User-Agent hashed with daily rotating salt +- Hash discarded after HLL insertion + +**Data Minimization:** + +- Only collects: domain, path, referrer, screen width +- No raw event storage +- All data aggregated at ingestion + +**Bounded Cardinality:** + +- Configurable limits on paths, referrers, events +- Prevents unbounded metric growth +- Overflow tracked separately + +**GDPR/CCPA Compliance:** + +- No personal data stored +- Daily salt rotation prevents cross-day correlation +- Aggregate-only analytics + +## Contributing + +Watchdog was built in a very short duration to address a very specific need: +replace Plausible and replace it as soon as possible. While I've given the +appropriate amount of care to the codebase, there may be unintended bugs or +missing features that you'd like to see supported. In this case, you are very welcome +to create issues or PRs. + +### Building + +The recommended way of developing Watchdog is using the Nix flake, for a +reproducible toolchain.
The default shell already provides everything you need, +so you can simply use `nix develop` or use `direnv allow` if you use Direnv. + +Once you have the dependencies, the workflow is relatively simple. Build and +test with Go, and then with Nix to ensure packaging is correct. + +```bash +# Build binary +$ go build -o watchdog . + +# Run tests +$ go test ./... + +# Run integration tests +$ go test -tags=integration ./test/... + +# Build with Nix +$ nix build +``` + +### Testing + +There's a non-negligible test suite for Watchdog that I'm quite proud of. It +tests anything from configuration validation to path normalization and other +security features that we'd _rather not regress_. If working on Watchdog, it is +advisable that you add test cases and run the tests before submitting your +changes. + +```bash +# Unit tests +$ go test ./... + +# Integration tests (requires build tag) +$ go test -tags=integration ./test/... + +# Benchmarks +$ go test -bench=. ./test/... + +# Coverage +$ go test -cover ./... ```
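
A note on the Privacy section added in this diff: the "IP + User-Agent hashed with daily rotating salt" step could look roughly like the Go sketch below. It is not taken from Watchdog's source. `newSalt` and `visitorHash` are hypothetical names, and the use of HMAC-SHA256 with a random, in-memory salt is an assumption made purely for illustration.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// newSalt returns 32 random bytes. In this sketch a fresh salt would be
// generated once per day and the previous one thrown away (hypothetical
// scheme, not Watchdog's actual implementation).
func newSalt() ([]byte, error) {
	salt := make([]byte, 32)
	if _, err := rand.Read(salt); err != nil {
		return nil, err
	}
	return salt, nil
}

// visitorHash combines IP and User-Agent under the given salt. The zero
// byte separator keeps ("ab", "c") and ("a", "bc") from colliding.
func visitorHash(salt []byte, ip, userAgent string) string {
	mac := hmac.New(sha256.New, salt)
	mac.Write([]byte(ip))
	mac.Write([]byte{0})
	mac.Write([]byte(userAgent))
	return hex.EncodeToString(mac.Sum(nil))
}

func main() {
	today, err := newSalt()
	if err != nil {
		panic(err)
	}
	tomorrow, err := newSalt()
	if err != nil {
		panic(err)
	}

	a := visitorHash(today, "203.0.113.7", "Mozilla/5.0")
	b := visitorHash(today, "203.0.113.7", "Mozilla/5.0")
	c := visitorHash(tomorrow, "203.0.113.7", "Mozilla/5.0")

	fmt.Println(a == b) // prints true: same salt, same visitor
	fmt.Println(a == c) // prints false: rotated salt, hashes are unrelated
}
```

Within one day the hash is stable, so it can feed a unique-visitor estimator such as HyperLogLog. Once the salt is replaced and the old one discarded, the same visitor maps to an unrelated value, which is the property the README relies on for preventing cross-day correlation.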