docs: finalize project README
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ib2e222bfc5e5433423186584b52d53cb6a6a6964

parent 4fba5e5ea3
commit 430219ee4c
1 changed file with 281 additions and 16 deletions
299
README.md
@@ -1,26 +1,291 @@
|
||||||
# Watchdog
|
||||||
|
|
||||||
Watchdog is a privacy-preserving web analytics platform similar to Plausible,
|
Watchdog is a lightweight, privacy-first analytics system with fully declarative
|
||||||
where data visualisation is deferred to a Prometheus-compatible dashboard such
|
configuration that aggregates web traffic data and exports it as Prometheus
|
||||||
as Grafana. It is designed to be as simple and lightweight as possible, without
|
metrics. Unlike traditional analytics platforms, Watchdog stores no raw events,
|
||||||
any additional web components to worry about. You, in turn, get to use your
|
uses no persistent user identifiers, and enforces bounded cardinality by design.
|
||||||
existing monitoring stack to also collect information about your web
|
|
||||||
applications tracked by Watchdog.
|
|
||||||
|
|
||||||
## Design
|
## Features
|
||||||
|
|
||||||
- No raw event storage
|
Watchdog is **privacy-first** in design. There are no cookies, no `localStorage`
|
||||||
- No persistent identifiers
|
or browser fingerprinting. Daily salt rotation prevents cross-day visitor
|
||||||
- Bounded cardinality by design
|
correlation, and there is no raw event storage. We only aggregate metrics. Other
|
||||||
- Aggregate at ingestion
|
noteworthy features:
|
||||||
- Prometheus-native export
|
|
||||||
|
- Multi-site analytics with optional domain tracking
|
||||||
|
- Bounded cardinality prevents metric explosion
|
||||||
|
- Rate limiting and DoS protection
|
||||||
|
- Graceful shutdown with state persistence
|
||||||
|
- IPv6 support with proper proxy header handling
|
||||||
|
|
||||||
|
See the [privacy section](#privacy) for more details on what "guarantees"
|
||||||
|
Watchdog has to offer.
|
||||||
|
|
||||||
## Quick Start
|
||||||
|
|
||||||
```bash
|
### NixOS
|
||||||
# Build the project
|
|
||||||
$ go build -o watchdog ./cmd/watchdog
|
|
||||||
|
|
||||||
# Start the Watchdog daemon with your own config
|
The recommended way of using Watchdog is consuming the NixOS module. It handles
|
||||||
$ ./watchdog --config config.yaml
|
everything from dependencies to environment setup and Systemd configuration.
|
||||||
|
Simply import the NixOS module provided by this flake, and enable
|
||||||
|
`services.watchdog`:
|
||||||
|
|
||||||
|
```nix
|
||||||
|
{inputs, ...}: {
|
||||||
|
imports = [inputs.watchdog.nixosModules.default];
|
||||||
|
|
||||||
|
services.watchdog = {
|
||||||
|
enable = true;
|
||||||
|
settings = {
|
||||||
|
site.domains = ["example.com"];
|
||||||
|
server.listen_addr = "127.0.0.1:8080";
|
||||||
|
};
|
||||||
|
};
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `settings` option is freeform, meaning it'll be serialized to the YAML
|
||||||
|
configuration automatically.
|
||||||
|
|
||||||
|
### Systemd
|
||||||
|
|
||||||
|
On non-NixOS distributions, you may build Watchdog with `go build` and copy it
|
||||||
|
somewhere that's in your `PATH`. Usually this is `/usr/local/bin` for system
|
||||||
|
installations:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Build
|
||||||
|
$ go build -o /usr/local/bin/watchdog .
|
||||||
|
|
||||||
|
# Install service
|
||||||
|
$ sudo install -Dm644 contrib/systemd/watchdog.service /etc/systemd/system/watchdog.service
|
||||||
|
$ sudo systemctl daemon-reload
|
||||||
|
$ sudo systemctl enable --now watchdog
|
||||||
|
```
|
||||||
|
|
||||||
|
### Binary
|
||||||
|
|
||||||
|
You may also run the binary from any path by simply building and running it, but
|
||||||
|
this is generally not advisable. You may consider something like `screen` to
|
||||||
|
background the Watchdog process. Note, however, that Systemd is the only supported
|
||||||
|
installation mechanism.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Build
|
||||||
|
$ go build -o watchdog .
|
||||||
|
|
||||||
|
# Run
|
||||||
|
$ ./watchdog -config config.yaml
|
||||||
|
```
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
Create `config.yaml`:
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
site:
|
||||||
|
# Single-site analytics
|
||||||
|
domains:
|
||||||
|
- "example.com"
|
||||||
|
|
||||||
|
# Or multi-site analytics
|
||||||
|
# domains:
|
||||||
|
# - "example.com"
|
||||||
|
# - "blog.example.com"
|
||||||
|
# - "shop.example.com"
|
||||||
|
|
||||||
|
salt_rotation: "daily"
|
||||||
|
|
||||||
|
collect:
|
||||||
|
pageviews: true
|
||||||
|
device: true
|
||||||
|
referrer: "domain"
|
||||||
|
domain: false # Set to true for multi-site analytics
|
||||||
|
|
||||||
|
limits:
|
||||||
|
max_paths: 10000
|
||||||
|
max_sources: 500
|
||||||
|
max_custom_events: 100
|
||||||
|
max_events_per_minute: 10000
|
||||||
|
|
||||||
|
server:
|
||||||
|
listen_addr: "127.0.0.1:8080"
|
||||||
|
```
|
||||||
|
|
||||||
|
See [config.example.yaml](config.example.yaml) for all options.
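
To give a feel for what a cap such as `max_events_per_minute` implies, here is a minimal fixed-window counter. This is only a sketch in JavaScript: Watchdog itself is built with Go, and its real limiter may use a different strategy. The class name `FixedWindowLimiter` and the injectable clock are hypothetical.

```javascript
// Hypothetical fixed-window limiter; not Watchdog's actual implementation.
class FixedWindowLimiter {
  constructor(maxPerMinute, now = () => Date.now()) {
    this.max = maxPerMinute;
    this.now = now; // injectable clock (milliseconds), handy for testing
    this.windowStart = this.now();
    this.count = 0;
  }

  // Returns true if the event is accepted, false if the current window is full.
  allow() {
    const t = this.now();
    if (t - this.windowStart >= 60000) {
      // A new one-minute window begins: reset the counter.
      this.windowStart = t;
      this.count = 0;
    }
    if (this.count >= this.max) {
      return false;
    }
    this.count += 1;
    return true;
  }
}
```

Events rejected by `allow()` would simply be dropped, which keeps memory and CPU usage bounded during traffic spikes.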
|
||||||
|
|
||||||
|
## Usage
|
||||||
|
|
||||||
|
### JavaScript Beacon
|
||||||
|
|
||||||
|
Similar to Plausible, Watchdog uses a JavaScript beacon to track events. In the
|
||||||
|
most basic case, you must add it to your site in a `<script>` tag to begin
|
||||||
|
collecting metrics:
|
||||||
|
|
||||||
|
```html
|
||||||
|
<script src="https://analytics.example.com/web/beacon.js" defer></script>
|
||||||
|
```
|
||||||
|
|
||||||
|
The script beacon also supports a _variety_ of configuration options via data
|
||||||
|
attributes, which you can adjust to your own needs. Some of them are
|
||||||
|
described below:
|
||||||
|
|
||||||
|
```html
|
||||||
|
<!-- Custom API endpoint -->
|
||||||
|
<script src="/web/beacon.js" data-api="/custom/endpoint" defer></script>
|
||||||
|
|
||||||
|
<!-- Track specific domain (multi-site) -->
|
||||||
|
<script src="/web/beacon.js" data-domain="example.com" defer></script>
|
||||||
|
|
||||||
|
<!-- Hash-based routing (for SPAs) -->
|
||||||
|
<script src="/web/beacon.js" data-hash-mode defer></script>
|
||||||
|
|
||||||
|
<!-- Track outbound links -->
|
||||||
|
<script src="/web/beacon.js" data-outbound-links defer></script>
|
||||||
|
|
||||||
|
<!-- Track file downloads (.pdf, .zip, .doc, etc.) -->
|
||||||
|
<script src="/web/beacon.js" data-file-downloads defer></script>
|
||||||
|
|
||||||
|
<!-- Exclude paths (comma-separated) -->
|
||||||
|
<script src="/web/beacon.js" data-exclude="/admin,/dashboard" defer></script>
|
||||||
|
|
||||||
|
<!-- Manual pageview tracking -->
|
||||||
|
<script src="/web/beacon.js" data-manual defer></script>
|
||||||
|
|
||||||
|
<!-- Combine multiple options -->
|
||||||
|
<script
|
||||||
|
src="/web/beacon.js"
|
||||||
|
data-hash-mode
|
||||||
|
data-outbound-links
|
||||||
|
data-file-downloads
|
||||||
|
defer
|
||||||
|
></script>
|
||||||
|
```
|
||||||
|
|
||||||
|
You can also track custom events as follows:
|
||||||
|
|
||||||
|
```javascript
|
||||||
|
// Simple event
|
||||||
|
window.watchdog.track("signup");
|
||||||
|
|
||||||
|
// Event with custom referrer
|
||||||
|
window.watchdog.track("purchase", { referrer: "email-campaign" });
|
||||||
|
|
||||||
|
// Manual pageview (when data-manual is set)
|
||||||
|
window.watchdog.trackPageview();
|
||||||
|
|
||||||
|
// Force pageview (bypass duplicate detection)
|
||||||
|
window.watchdog.trackPageview({ force: true });
|
||||||
|
```
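
Since the beacon is loaded with `defer`, and content blockers may prevent it from loading at all, `window.watchdog` is not guaranteed to exist when your own code runs. A small defensive wrapper avoids runtime errors in that case. This is a sketch: `trackSafely` is a hypothetical helper and not part of the beacon.

```javascript
// Hypothetical helper: forwards to watchdog.track only if the beacon is loaded.
// `win` is passed in explicitly so the function can be exercised outside a browser.
function trackSafely(win, eventName, options) {
  const wd = win && win.watchdog;
  if (!wd || typeof wd.track !== "function") {
    return false; // beacon missing or blocked; skip silently
  }
  if (options === undefined) {
    wd.track(eventName);
  } else {
    wd.track(eventName, options);
  }
  return true;
}

// In a page you would call: trackSafely(window, "signup");
```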
|
||||||
|
|
||||||
|
### Metrics
|
||||||
|
|
||||||
|
[observability documentation]: docs/observability.md
|
||||||
|
|
||||||
|
Unlike most common solutions, Watchdog does not do any data visualisation. It
|
||||||
|
collects and aggregates metrics in a Prometheus-compatible manner at the
|
||||||
|
`/metrics` endpoint. To scrape and store the time-series data, you will need
|
||||||
|
**Prometheus**. Grafana is a common choice for _visualizing_ said data.
|
||||||
|
|
||||||
|
See the [observability documentation] for a full setup guide. It is, however,
|
||||||
|
recommended to be somewhat knowledgeable in those areas before attempting to
|
||||||
|
deploy. There exists a Grafana dashboard with support for multi-host and
|
||||||
|
multi-site deployments provided in the [contrib directory](contrib/grafana/).
|
||||||
|
|
||||||
|
While not final, some of the metrics collected are as follows:
|
||||||
|
|
||||||
|
**Traffic metrics:**
|
||||||
|
|
||||||
|
- `web_pageviews_total{path,device,referrer,domain}` - Total pageviews
|
||||||
|
- `domain` label only present if `site.collect.domain: true`
|
||||||
|
- `web_custom_events_total{event}` - Custom event counts
|
||||||
|
- `web_daily_unique_visitors` - Estimated unique visitors (HyperLogLog)
|
||||||
|
|
||||||
|
**Health metrics:**
|
||||||
|
|
||||||
|
- `web_path_overflow_total` - Paths rejected due to cardinality limit
|
||||||
|
- `web_referrer_overflow_total` - Referrers rejected due to limit
|
||||||
|
- `web_event_overflow_total` - Custom events rejected due to limit
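
For orientation, a scrape of `/metrics` returns series like the ones above in the Prometheus text exposition format. The sketch below renders a single sample line; the label values (`/blog`, `desktop`, `example.org`) and the count are invented for illustration, and the helper functions are not Watchdog code.

```javascript
// Escape a label value for the Prometheus text format:
// backslash, double quote and newline must be escaped.
function escapeLabelValue(value) {
  return String(value)
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n");
}

// Render one sample line of the form: name{key="value",...} value
function formatSample(name, labels, value) {
  const pairs = Object.entries(labels).map(
    ([key, val]) => `${key}="${escapeLabelValue(val)}"`
  );
  const labelPart = pairs.length > 0 ? `{${pairs.join(",")}}` : "";
  return `${name}${labelPart} ${value}`;
}

console.log(
  formatSample(
    "web_pageviews_total",
    { path: "/blog", device: "desktop", referrer: "example.org" },
    42
  )
);
// web_pageviews_total{path="/blog",device="desktop",referrer="example.org"} 42
```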
|
||||||
|
|
||||||
|
## Privacy
|
||||||
|
|
||||||
|
Privacy is a fundamental design constraint for Watchdog, and not a feature. All
|
||||||
|
personally identifiable information is discarded at ingestion. We only keep
|
||||||
|
aggregate counts. Some features worth noting:
|
||||||
|
|
||||||
|
**No Persistent Identifiers:**
|
||||||
|
|
||||||
|
- No cookies, `localStorage`, or fingerprinting
|
||||||
|
- IP + User-Agent hashed with daily rotating salt
|
||||||
|
- Hash discarded after HLL insertion
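
The idea behind the rotating salt can be sketched as follows. This is an illustration in JavaScript (Node.js), not Watchdog's Go implementation: the choice of SHA-256, the 32-byte random salt, the UTC day boundary, and the names `DailySalt` and `visitorHash` are all assumptions made for the example.

```javascript
const nodeCrypto = require("crypto");

// Hypothetical salt holder: a fresh random salt per UTC day, kept in memory only.
class DailySalt {
  constructor(now = () => new Date()) {
    this.now = now; // injectable clock, handy for testing
    this.day = null;
    this.salt = null;
  }

  current() {
    const today = this.now().toISOString().slice(0, 10); // "YYYY-MM-DD" in UTC
    if (today !== this.day) {
      // The day changed: drop the old salt so yesterday's hashes
      // can no longer be reproduced or matched against today's.
      this.day = today;
      this.salt = nodeCrypto.randomBytes(32);
    }
    return this.salt;
  }
}

// Hash IP + User-Agent with the given salt. The result would only be fed
// into the unique-visitor estimator and then discarded.
function visitorHash(salt, ip, userAgent) {
  return nodeCrypto
    .createHash("sha256")
    .update(salt)
    .update(ip)
    .update("\0") // separator so ("ab", "c") and ("a", "bc") differ
    .update(userAgent)
    .digest("hex");
}
```

Within one day the same visitor maps to the same hash, which is what allows a unique count. Once the salt rotates, the same IP and User-Agent produce an unrelated hash.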
|
||||||
|
|
||||||
|
**Data Minimization:**
|
||||||
|
|
||||||
|
- Only collects: domain, path, referrer, screen width
|
||||||
|
- No raw event storage
|
||||||
|
- All data aggregated at ingestion
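
Aggregating at ingestion means raw values are reduced to coarse labels before anything is counted. A minimal sketch of two such reductions follows, assuming hypothetical breakpoints and label names (`mobile`, `tablet`, `desktop`, `direct`, `unknown`); the values Watchdog actually uses may differ.

```javascript
// Map a screen width in CSS pixels to a coarse device class.
// The breakpoints are assumptions for this sketch.
function deviceClass(screenWidth) {
  if (!Number.isFinite(screenWidth) || screenWidth <= 0) return "unknown";
  if (screenWidth < 768) return "mobile";
  if (screenWidth < 1024) return "tablet";
  return "desktop";
}

// Reduce a full referrer URL to its hostname, dropping path and query string.
// This mirrors the spirit of the `referrer: "domain"` setting.
function referrerDomain(referrer) {
  if (!referrer) return "direct";
  try {
    return new URL(referrer).hostname;
  } catch {
    return "unknown"; // not a parseable absolute URL
  }
}
```

Only the reduced label, never the original width or full URL, would then be attached to a counter.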
|
||||||
|
|
||||||
|
**Bounded Cardinality:**
|
||||||
|
|
||||||
|
- Configurable limits on paths, referrers, events
|
||||||
|
- Prevents unbounded metric growth
|
||||||
|
- Overflow tracked separately
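
Bounded cardinality can be pictured as a set with a hard size limit plus an overflow counter. The sketch below is illustrative only; `BoundedSet` is a hypothetical name, and the real registry behind `max_paths`, `max_sources`, and `max_custom_events` may behave differently.

```javascript
// Hypothetical bounded label registry: admits at most `max` distinct values.
class BoundedSet {
  constructor(max) {
    this.max = max;
    this.seen = new Set();
    this.overflow = 0; // plays the role of a *_overflow_total counter
  }

  // Returns true if `value` may be used as a label, false if it was rejected.
  admit(value) {
    if (this.seen.has(value)) return true; // already known, always allowed
    if (this.seen.size >= this.max) {
      this.overflow += 1; // limit reached: count the rejection instead
      return false;
    }
    this.seen.add(value);
    return true;
  }
}

const paths = new BoundedSet(2);
paths.admit("/");      // true
paths.admit("/about"); // true
paths.admit("/blog");  // false, limit reached
console.log(paths.overflow); // 1
```

Known values keep being counted after the limit is hit; only new, previously unseen values are turned away.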
|
||||||
|
|
||||||
|
**GDPR/CCPA Compliance:**
|
||||||
|
|
||||||
|
- No personal data stored
|
||||||
|
- Daily salt rotation prevents cross-day correlation
|
||||||
|
- Aggregate-only analytics
|
||||||
|
|
||||||
|
## Contributing
|
||||||
|
|
||||||
|
Watchdog was built in a very short duration to address a very specific need:
|
||||||
|
to replace Plausible, and to do so as soon as possible. While I've given the
|
||||||
|
appropriate amount of care to the codebase, there may be unintended bugs or
|
||||||
|
missing features that you'd like to see supported. In that case, you are very welcome
|
||||||
|
to create issues or PRs.
|
||||||
|
|
||||||
|
### Building
|
||||||
|
|
||||||
|
The recommended way of developing Watchdog is using the Nix flake, for a
|
||||||
|
reproducible toolchain. The default shell already provides everything you need,
|
||||||
|
so you can simply use `nix develop` or use `direnv allow` if you use Direnv.
|
||||||
|
|
||||||
|
Once you have the dependencies, the workflow is relatively simple. Build and
|
||||||
|
test with Go, and then with Nix to ensure packaging is correct.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Build binary
|
||||||
|
$ go build -o watchdog .
|
||||||
|
|
||||||
|
# Run tests
|
||||||
|
$ go test ./...
|
||||||
|
|
||||||
|
# Run integration tests
|
||||||
|
$ go test -tags=integration ./test/...
|
||||||
|
|
||||||
|
# Build with Nix
|
||||||
|
$ nix build
|
||||||
|
```
|
||||||
|
|
||||||
|
### Testing
|
||||||
|
|
||||||
|
There's a non-negligible test suite for Watchdog that I'm quite proud of. It
|
||||||
|
tests anything from configuration validation to path normalization and other
|
||||||
|
security features that we'd _rather not regress_. If working on Watchdog, it is
|
||||||
|
advisable that you add test cases and run the tests before submitting your
|
||||||
|
changes.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Unit tests
|
||||||
|
$ go test ./...
|
||||||
|
|
||||||
|
# Integration tests (requires build tag)
|
||||||
|
$ go test -tags=integration ./test/...
|
||||||
|
|
||||||
|
# Benchmarks
|
||||||
|
$ go test -bench=. ./test/...
|
||||||
|
|
||||||
|
# Coverage
|
||||||
|
$ go test -cover ./...
|
||||||
```