Watchdog
Watchdog is a lightweight, privacy-first analytics system with fully declarative configuration that aggregates web traffic data and exports it as Prometheus metrics. Unlike traditional analytics platforms, Watchdog stores no raw events, uses no persistent user identifiers, and enforces bounded cardinality by design.
Features
Watchdog is privacy-first by design. There are no cookies, no localStorage,
and no browser fingerprinting. Daily salt rotation prevents cross-day visitor
correlation, and no raw events are stored; only aggregate metrics are kept.
Other noteworthy features:
- Multi-site analytics with optional domain tracking
- Bounded cardinality prevents metric explosion
- Rate limiting and DoS protection
- Graceful shutdown with state persistence
- IPv6 support with proper proxy header handling
See the privacy section for more details on what "guarantees" Watchdog has to offer.
Quick Start
NixOS
The recommended way of using Watchdog is consuming the NixOS module. It handles
everything from dependencies to environment setup and Systemd configuration.
Simply import the NixOS module provided by this flake, and enable
services.watchdog:
{inputs, ...}: {
  imports = [inputs.watchdog.nixosModules.default];
  services.watchdog = {
    enable = true;
    settings = {
      site.domains = ["example.com"];
      server.listen_addr = "127.0.0.1:8080";
    };
  };
}
The settings option is freeform, meaning it'll be serialized to the YAML
configuration automatically.
Systemd
On non-NixOS distributions, you may build Watchdog with go build and copy it
somewhere that's in your PATH. Usually this is /usr/local/bin for system
installations:
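# Build
$ go build -o watchdog .
# Install binary (writing to /usr/local/bin usually requires root)
$ sudo install -Dm755 watchdog /usr/local/bin/watchdog
# Install service (unit files should be world-readable and not executable)
$ sudo install -Dm644 contrib/systemd/watchdog.service /etc/systemd/system/watchdog.service
$ sudo systemctl daemon-reload
$ sudo systemctl enable --now watchdog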
Binary
You may also build the binary and run it from any path, though this is
generally not advisable. You may consider something like screen to keep the
Watchdog process running in the background. Note, however, that Systemd is the
only supported installation mechanism.
# Build
$ go build -o watchdog .
# Run
$ ./watchdog -config config.yaml
Configuration
Watchdog currently supports configuration via a YAML file, environment variables, or command-line flags. You may find a more complete reference in the configuration reference document.
Quick Start
Create config.yaml:
site:
  # Single-site analytics
  domains:
    - "example.com"
  # Or multi-site analytics
  # domains:
  #   - "example.com"
  #   - "blog.example.com"
  #   - "shop.example.com"
  salt_rotation: "daily"
  collect:
    pageviews: true
    device: true
    referrer: "domain"
    domain: false # Set to true for multi-site analytics
limits:
  max_paths: 10000
  max_sources: 500
  max_custom_events: 100
  max_events_per_minute: 10000
server:
  listen_addr: "127.0.0.1:8080"
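The `referrer: "domain"` setting suggests that referrers are reduced to their domain before being counted. The sketch below shows one way such a reduction could look in Go. It is an illustration under that assumption, not Watchdog's actual code: the function name `referrerDomain` and the `"direct"` and `"unknown"` fallback labels are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// referrerDomain reduces a full referrer URL to its lowercase hostname,
// dropping the path, query string, and port. The "direct" and "unknown"
// labels are hypothetical placeholders, not values taken from Watchdog.
func referrerDomain(raw string) string {
	if raw == "" {
		return "direct"
	}
	u, err := url.Parse(raw)
	if err != nil || u.Hostname() == "" {
		return "unknown"
	}
	return strings.ToLower(u.Hostname())
}

func main() {
	fmt.Println(referrerDomain("https://News.Example.org/item?id=42")) // news.example.org
	fmt.Println(referrerDomain(""))                                    // direct
}
```

Keeping only the hostname means query strings and paths, which can carry identifying details, never become label values.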
Usage
JavaScript Beacon
Similar to Plausible, Watchdog uses a JavaScript beacon to track events. In the
most basic case, you add it to your site in a <script> tag to begin
collecting metrics:
<script src="https://analytics.example.com/web/beacon.js" defer></script>
The beacon script also supports a variety of configuration options via data attributes, which you can adjust to your own needs. Some of them are described below:
<!-- Custom API endpoint -->
<script src="/web/beacon.js" data-api="/custom/endpoint" defer></script>
<!-- Track specific domain (multi-site) -->
<script src="/web/beacon.js" data-domain="example.com" defer></script>
<!-- Hash-based routing (for SPAs) -->
<script src="/web/beacon.js" data-hash-mode defer></script>
<!-- Track outbound links -->
<script src="/web/beacon.js" data-outbound-links defer></script>
<!-- Track file downloads (.pdf, .zip, .doc, etc.) -->
<script src="/web/beacon.js" data-file-downloads defer></script>
<!-- Exclude paths (comma-separated) -->
<script src="/web/beacon.js" data-exclude="/admin,/dashboard" defer></script>
<!-- Manual pageview tracking -->
<script src="/web/beacon.js" data-manual defer></script>
<!-- Combine multiple options -->
<script
  src="/web/beacon.js"
  data-hash-mode
  data-outbound-links
  data-file-downloads
  defer
></script>
You can also track custom events as follows:
// Simple event
window.watchdog.track("signup");
// Event with custom referrer
window.watchdog.track("purchase", { referrer: "email-campaign" });
// Manual pageview (when data-manual is set)
window.watchdog.trackPageview();
// Force pageview (bypass duplicate detection)
window.watchdog.trackPageview({ force: true });
Metrics
Unlike most common solutions, Watchdog does not do any data visualization. It
collects and aggregates metrics and exposes them in a Prometheus-compatible
format at the /metrics endpoint. To scrape and store the time-series data, you
will need Prometheus. Grafana is a common choice for visualizing that data.
See the observability documentation for a full setup guide. Some familiarity with Prometheus and Grafana is recommended before attempting to deploy. A Grafana dashboard with support for multi-host and multi-site deployments is provided in the contrib directory.
While not final, some of the metrics collected are as follows:
Traffic metrics:
- web_pageviews_total{path,device,referrer,domain} - Total pageviews (the domain label is only present if site.collect.domain: true)
- web_custom_events_total{event} - Custom event counts
- web_daily_unique_visitors - Estimated unique visitors (HyperLogLog)
Health metrics:
- web_path_overflow_total - Paths rejected due to cardinality limit
- web_referrer_overflow_total - Referrers rejected due to limit
- web_event_overflow_total - Custom events rejected due to limit
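For a quick sanity check without a full Prometheus setup, the text served at /metrics can be read directly. The following Go sketch sums every sample of one metric family in Prometheus text exposition format. The sample label values (such as `desktop` or `direct`) are made up for illustration, and the parser assumes sample lines carry no trailing timestamp.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sumMetric adds up all samples of one metric family in Prometheus text
// exposition format. Comment lines and other metrics are skipped.
// Assumption: sample lines end with the value (no trailing timestamp).
func sumMetric(exposition, name string) float64 {
	var total float64
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// A sample is either `name{labels} value` or `name value`.
		if !strings.HasPrefix(line, name+"{") && !strings.HasPrefix(line, name+" ") {
			continue
		}
		value := line[strings.LastIndex(line, " ")+1:]
		if v, err := strconv.ParseFloat(value, 64); err == nil {
			total += v
		}
	}
	return total
}

func main() {
	// Hypothetical scrape output; label values are illustrative only.
	sample := `# TYPE web_pageviews_total counter
web_pageviews_total{path="/",device="desktop",referrer="direct"} 12
web_pageviews_total{path="/about",device="mobile",referrer="example.org"} 3
web_custom_events_total{event="signup"} 5
`
	fmt.Println(sumMetric(sample, "web_pageviews_total")) // 15
}
```

In a real deployment the same aggregation is better expressed as a PromQL query such as `sum(web_pageviews_total)`.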
Privacy
Privacy is a fundamental design constraint for Watchdog, not a feature. All personally identifiable information is discarded at ingestion. We only keep aggregate counts. Some properties worth noting:
No Persistent Identifiers:
- No cookies, localStorage, or fingerprinting
- IP + User-Agent hashed with daily rotating salt
- Hash discarded after HLL insertion
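The hashing scheme in the list above can be sketched in Go as follows. This is a minimal illustration, not Watchdog's implementation: the helper names `dailySalt` and `visitorHash`, the use of SHA-256, and the derivation of the salt from a secret plus the date are all assumptions made to keep the example reproducible. A salt that is generated randomly each day and then thrown away would give a stronger guarantee.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"time"
)

// dailySalt derives a salt from a secret and a calendar date (YYYY-MM-DD).
// Hypothetical: derived deterministically so the example is reproducible.
func dailySalt(secret, date string) string {
	sum := sha256.Sum256([]byte(secret + "|" + date))
	return hex.EncodeToString(sum[:])
}

// visitorHash combines the salt with IP and User-Agent into one opaque
// digest. In an HLL-based design, only this digest is fed to the sketch
// and then dropped.
func visitorHash(salt, ip, userAgent string) string {
	sum := sha256.Sum256([]byte(salt + "|" + ip + "|" + userAgent))
	return hex.EncodeToString(sum[:])
}

func main() {
	const layout = "2006-01-02"
	today := time.Now().UTC()
	tomorrow := today.Add(24 * time.Hour)

	h1 := visitorHash(dailySalt("example-secret", today.Format(layout)), "203.0.113.7", "Mozilla/5.0")
	h2 := visitorHash(dailySalt("example-secret", tomorrow.Format(layout)), "203.0.113.7", "Mozilla/5.0")

	// The same visitor yields a different digest on each day.
	fmt.Println(h1 == h2) // false
}
```

Because the salt changes every day, a digest from one day cannot be matched against a digest from another, which is what prevents cross-day correlation.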
Data Minimization:
- Only collects: domain, path, referrer, screen width
- No raw event storage
- All data aggregated at ingestion
Bounded Cardinality:
- Configurable limits on paths, referrers, events
- Prevents unbounded metric growth
- Overflow tracked separately
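The bounded-cardinality idea above can be illustrated with a small Go sketch: a counter that admits at most a fixed number of distinct labels and counts everything else as overflow. The type `boundedCounter` and its fields are hypothetical names chosen for this example and do not mirror Watchdog's internals.

```go
package main

import (
	"fmt"
	"sync"
)

// boundedCounter counts hits per label but admits at most max distinct
// labels. Hits on labels beyond the limit only increment overflow.
type boundedCounter struct {
	mu       sync.Mutex
	max      int
	counts   map[string]uint64
	overflow uint64
}

func newBoundedCounter(max int) *boundedCounter {
	return &boundedCounter{max: max, counts: make(map[string]uint64)}
}

// Inc records one hit and reports whether the label was accepted.
func (b *boundedCounter) Inc(label string) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if _, known := b.counts[label]; !known && len(b.counts) >= b.max {
		b.overflow++
		return false
	}
	b.counts[label]++
	return true
}

func main() {
	paths := newBoundedCounter(2) // stands in for a limit like max_paths
	for _, p := range []string{"/", "/about", "/", "/pricing"} {
		paths.Inc(p)
	}
	fmt.Println(paths.counts["/"], len(paths.counts), paths.overflow) // 2 2 1
}
```

Known labels keep counting after the limit is reached; only new labels are rejected, so memory use stays bounded no matter what paths clients send.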
GDPR/CCPA Compliance:
- No personal data stored
- Daily salt rotation prevents cross-day correlation
- Aggregate-only analytics
Contributing
Watchdog was built in a very short time to address a very specific need: replacing Plausible as soon as possible. While I've given the codebase an appropriate amount of care, there may be unintended bugs or missing features that you'd like to see supported. If so, you are very welcome to create issues or PRs.
Building
The recommended way of developing Watchdog is using the Nix flake, for a
reproducible toolchain. The default shell already provides everything you need,
so you can simply run nix develop, or direnv allow if you use Direnv.
Once you have the dependencies, the workflow is relatively simple. Build and test with Go, and then build with Nix to ensure the packaging is correct.
# Build binary
$ go build -o watchdog .
# Run tests
$ go test ./...
# Run integration tests
$ go test -tags=integration ./test/...
# Build with Nix
$ nix build
Testing
There's a non-negligible test suite for Watchdog that I'm quite proud of. It tests everything from configuration validation to path normalization and other security features that we'd rather not regress. If you are working on Watchdog, please add test cases and run the tests before submitting your changes.
# Unit tests
$ go test ./...
# Integration tests (requires build tag)
$ go test -tags=integration ./test/...
# Benchmarks
$ go test -bench=. ./test/...
# Coverage
$ go test -cover ./...