docs: finalize project README
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ib2e222bfc5e5433423186584b52d53cb6a6a6964
parent
4fba5e5ea3
commit
430219ee4c
1 changed files with 281 additions and 16 deletions
299
README.md
|
|
@ -1,26 +1,291 @@
|
|||
# Watchdog
|
||||
|
||||
Watchdog is a privacy-preserving web analytics platform similar to Plausible,
|
||||
where data visualisation is deferred to a Prometheus-compatible dashboard such
|
||||
as Grafana. It is designed to be as simple and lightweight as possible, without
|
||||
any additional web components to worry about. You, in turn, get to use your
|
||||
existing monitoring stack to also collect information about your web
|
||||
applications tracked by Watchdog.
|
||||
Watchdog is a lightweight, privacy-first analytics system with fully declarative
|
||||
configuration that aggregates web traffic data and exports it as Prometheus
|
||||
metrics. Unlike traditional analytics platforms, Watchdog stores no raw events,
|
||||
uses no persistent user identifiers, and enforces bounded cardinality by design.
|
||||
|
||||
## Design
|
||||
## Features
|
||||
|
||||
- No raw event storage
|
||||
- No persistent identifiers
|
||||
- Bounded cardinality by design
|
||||
- Aggregate at ingestion
|
||||
- Prometheus-native export
|
||||
Watchdog is **privacy-first** in design. There are no cookies, no
`localStorage`, and no browser fingerprinting. Daily salt rotation prevents
cross-day visitor correlation, and no raw events are stored; only aggregate
metrics are kept. Other noteworthy features:
|
||||
|
||||
- Multi-site analytics with optional domain tracking
|
||||
- Bounded cardinality prevents metric explosion
|
||||
- Rate limiting and DoS protection
|
||||
- Graceful shutdown with state persistence
|
||||
- IPv6 support with proper proxy header handling
|
||||
|
||||
See the [privacy section](#privacy) for more details on what "guarantees"
|
||||
Watchdog has to offer.
|
||||
|
||||
## Quick Start
|
||||
|
||||
```bash
|
||||
# Build the project
|
||||
$ go build -o watchdog ./cmd/watchdog
|
||||
### NixOS
|
||||
|
||||
# Start the Watchdog daemon with your own config
|
||||
$ ./watchdog --config config.yaml
|
||||
The recommended way to use Watchdog is through the NixOS module. It handles
everything from dependencies to environment setup and Systemd configuration.
Simply import the NixOS module provided by this flake, and enable
`services.watchdog`:
|
||||
|
||||
```nix
|
||||
{inputs, ...}: {
  imports = [inputs.watchdog.nixosModules.default];

  services.watchdog = {
    enable = true;
    settings = {
      site.domains = ["example.com"];
      server.listen_addr = "127.0.0.1:8080";
    };
  };
}
|
||||
```
|
||||
|
||||
The `settings` option is freeform, meaning it'll be serialized to the YAML
|
||||
configuration automatically.
|
||||
|
||||
### Systemd
|
||||
|
||||
On non-NixOS distributions, you may build Watchdog with `go build` and copy it
|
||||
somewhere that's in your `PATH`. Usually this is `/usr/local/bin` for system
|
||||
installations:
|
||||
|
||||
```bash
|
||||
# Build
|
||||
$ go build -o /usr/local/bin/watchdog .
|
||||
|
||||
# Install service
|
||||
$ sudo install -Dm644 contrib/systemd/watchdog.service /etc/systemd/system/
|
||||
$ sudo systemctl daemon-reload
|
||||
$ sudo systemctl enable --now watchdog
|
||||
```
|
||||
|
||||
### Binary
|
||||
|
||||
You may also build the binary and run it directly from any path, though this is
generally not advisable. Consider something like `screen` to keep the Watchdog
process running in the background. Note that Systemd is the only supported
installation mechanism.
|
||||
|
||||
```bash
|
||||
# Build
|
||||
$ go build -o watchdog .
|
||||
|
||||
# Run
|
||||
$ ./watchdog -config config.yaml
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
Create `config.yaml`:
|
||||
|
||||
```yaml
|
||||
site:
|
||||
# Single-site analytics
|
||||
domains:
|
||||
- "example.com"
|
||||
|
||||
# Or multi-site analytics
|
||||
# domains:
|
||||
# - "example.com"
|
||||
# - "blog.example.com"
|
||||
# - "shop.example.com"
|
||||
|
||||
salt_rotation: "daily"
|
||||
|
||||
collect:
|
||||
pageviews: true
|
||||
device: true
|
||||
referrer: "domain"
|
||||
domain: false # Set to true for multi-site analytics
|
||||
|
||||
limits:
|
||||
max_paths: 10000
|
||||
max_sources: 500
|
||||
max_custom_events: 100
|
||||
max_events_per_minute: 10000
|
||||
|
||||
server:
|
||||
listen_addr: "127.0.0.1:8080"
|
||||
```
|
||||
|
||||
See [config.example.yaml](config.example.yaml) for all options.
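
As an illustration of how a per-minute cap such as `max_events_per_minute` could
be enforced, here is a minimal Go sketch of a fixed-window limiter. The
`minuteLimiter` type and its `Allow` method are hypothetical names, and the
fixed-window strategy is an assumption; Watchdog's actual limiter may work
differently.

```go
package main

import "fmt"

// minuteLimiter is a hypothetical fixed-window limiter: it admits at most
// `limit` events per wall-clock minute and rejects the rest.
type minuteLimiter struct {
	limit  int
	window int64 // minute index of the current window (unix seconds / 60)
	count  int   // events admitted in the current window
}

// Allow reports whether an event arriving at unixSeconds may be ingested.
func (l *minuteLimiter) Allow(unixSeconds int64) bool {
	window := unixSeconds / 60
	if window != l.window {
		// A new minute has started, so reset the counter.
		l.window, l.count = window, 0
	}
	if l.count >= l.limit {
		return false
	}
	l.count++
	return true
}

func main() {
	l := &minuteLimiter{limit: 2}
	now := int64(1700000000) // arbitrary fixed timestamp for the example

	fmt.Println(l.Allow(now), l.Allow(now), l.Allow(now)) // true true false
	fmt.Println(l.Allow(now + 60))                        // true (next minute)
}
```

This sketch is not safe for concurrent use; a real server would guard the
counter with a mutex or atomics.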
|
||||
|
||||
## Usage
|
||||
|
||||
### JavaScript Beacon
|
||||
|
||||
Similar to Plausible, Watchdog uses a JavaScript beacon to track events. In the
most basic case, add it to your site in a `<script>` tag to begin collecting
metrics:
|
||||
|
||||
```html
|
||||
<script src="https://analytics.example.com/web/beacon.js" defer></script>
|
||||
```
|
||||
|
||||
The script beacon also supports a _variety_ of configuration options via data
attributes, which you can adjust to your own needs. Some of them are shown
below:
|
||||
|
||||
```html
|
||||
<!-- Custom API endpoint -->
|
||||
<script src="/web/beacon.js" data-api="/custom/endpoint" defer></script>
|
||||
|
||||
<!-- Track specific domain (multi-site) -->
|
||||
<script src="/web/beacon.js" data-domain="example.com" defer></script>
|
||||
|
||||
<!-- Hash-based routing (for SPAs) -->
|
||||
<script src="/web/beacon.js" data-hash-mode defer></script>
|
||||
|
||||
<!-- Track outbound links -->
|
||||
<script src="/web/beacon.js" data-outbound-links defer></script>
|
||||
|
||||
<!-- Track file downloads (.pdf, .zip, .doc, etc.) -->
|
||||
<script src="/web/beacon.js" data-file-downloads defer></script>
|
||||
|
||||
<!-- Exclude paths (comma-separated) -->
|
||||
<script src="/web/beacon.js" data-exclude="/admin,/dashboard" defer></script>
|
||||
|
||||
<!-- Manual pageview tracking -->
|
||||
<script src="/web/beacon.js" data-manual defer></script>
|
||||
|
||||
<!-- Combine multiple options -->
|
||||
<script
|
||||
src="/web/beacon.js"
|
||||
data-hash-mode
|
||||
data-outbound-links
|
||||
data-file-downloads
|
||||
defer
|
||||
></script>
|
||||
```
|
||||
|
||||
You can also track custom events as follows:
|
||||
|
||||
```javascript
|
||||
// Simple event
|
||||
window.watchdog.track("signup");
|
||||
|
||||
// Event with custom referrer
|
||||
window.watchdog.track("purchase", { referrer: "email-campaign" });
|
||||
|
||||
// Manual pageview (when data-manual is set)
|
||||
window.watchdog.trackPageview();
|
||||
|
||||
// Force pageview (bypass duplicate detection)
|
||||
window.watchdog.trackPageview({ force: true });
|
||||
```
|
||||
|
||||
### Metrics
|
||||
|
||||
[observability documentation]: docs/observability.md
|
||||
|
||||
Unlike most common solutions, Watchdog does not do any data visualisation. It
collects and aggregates metrics and exposes them in a Prometheus-compatible
format at the `/metrics` endpoint. To scrape and store the time-series data, you
will need **Prometheus**. Grafana is a common choice for _visualising_ said data.
|
||||
|
||||
See the [observability documentation] for a full setup guide. It is, however,
recommended to be somewhat knowledgeable in those areas before attempting to
deploy. A Grafana dashboard with support for multi-host and multi-site
deployments is provided in the [contrib directory](contrib/grafana/).
|
||||
|
||||
While not final, some of the metrics collected are as follows:
|
||||
|
||||
**Traffic metrics:**
|
||||
|
||||
- `web_pageviews_total{path,device,referrer,domain}` - Total pageviews
  - `domain` label only present if `site.collect.domain: true`
|
||||
- `web_custom_events_total{event}` - Custom event counts
|
||||
- `web_daily_unique_visitors` - Estimated unique visitors (HyperLogLog)
|
||||
|
||||
**Health metrics:**
|
||||
|
||||
- `web_path_overflow_total` - Paths rejected due to cardinality limit
|
||||
- `web_referrer_overflow_total` - Referrers rejected due to limit
|
||||
- `web_event_overflow_total` - Custom events rejected due to limit
|
||||
|
||||
## Privacy
|
||||
|
||||
Privacy is a fundamental design constraint for Watchdog, not a feature. All
personally identifiable information is discarded at ingestion, and only
aggregate counts are kept. Some properties worth noting:
|
||||
|
||||
**No Persistent Identifiers:**
|
||||
|
||||
- No cookies, `localStorage`, or fingerprinting
|
||||
- IP + User-Agent hashed with daily rotating salt
|
||||
- Hash discarded after HLL insertion
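
The salted-hash idea above can be sketched in Go as follows. This is only an
illustration under stated assumptions: SHA-256, the `newSalt` and `visitorHash`
helpers, and the 32-byte random salt are all hypothetical choices, not
necessarily what Watchdog does internally.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// newSalt returns a fresh random salt. In this sketch a new salt would be
// generated once per day and the previous one thrown away.
func newSalt() ([]byte, error) {
	salt := make([]byte, 32)
	if _, err := rand.Read(salt); err != nil {
		return nil, err
	}
	return salt, nil
}

// visitorHash mixes the salt, IP, and User-Agent into an opaque digest.
// The zero bytes separate the fields so that ("ab", "c") and ("a", "bc")
// do not produce the same input.
func visitorHash(salt []byte, ip, userAgent string) string {
	h := sha256.New()
	h.Write(salt)
	h.Write([]byte{0})
	h.Write([]byte(ip))
	h.Write([]byte{0})
	h.Write([]byte(userAgent))
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	today, err := newSalt()
	if err != nil {
		panic(err)
	}
	tomorrow, err := newSalt()
	if err != nil {
		panic(err)
	}

	a := visitorHash(today, "203.0.113.7", "Mozilla/5.0")
	b := visitorHash(today, "203.0.113.7", "Mozilla/5.0")
	c := visitorHash(tomorrow, "203.0.113.7", "Mozilla/5.0")

	fmt.Println("same day, same visitor matches:", a == b)
	fmt.Println("next day, same visitor matches:", a == c)
}
```

Because the salt is discarded on rotation, a digest from one day cannot be
recomputed or matched on another, which is what prevents cross-day correlation
in this sketch.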
|
||||
|
||||
**Data Minimization:**
|
||||
|
||||
- Only collects: domain, path, referrer, screen width
|
||||
- No raw event storage
|
||||
- All data aggregated at ingestion
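
To make the data-minimization point concrete, here is a small Go sketch of
reducing a full referrer URL to its domain, in the spirit of the
`referrer: "domain"` setting. The `referrerDomain` helper and the `direct` and
`unknown` fallback labels are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// referrerDomain keeps only the lowercased host of a referrer URL and drops
// the path, query string, and fragment, which may carry personal data.
func referrerDomain(raw string) string {
	if raw == "" {
		return "direct" // hypothetical label for visits without a referrer
	}
	u, err := url.Parse(raw)
	if err != nil || u.Hostname() == "" {
		return "unknown" // hypothetical label for unparseable referrers
	}
	return strings.ToLower(u.Hostname())
}

func main() {
	fmt.Println(referrerDomain("https://News.Example.com/item?id=42#comments"))
	fmt.Println(referrerDomain(""))
	fmt.Println(referrerDomain("not a url"))
}
```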
|
||||
|
||||
**Bounded Cardinality:**
|
||||
|
||||
- Configurable limits on paths, referrers, events
|
||||
- Prevents unbounded metric growth
|
||||
- Overflow tracked separately
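
A bounded label set of this kind can be sketched in Go as below. The
`boundedSet` type is a hypothetical illustration of the idea (admit a value
only while the limit is not reached, and count rejections separately), not
Watchdog's actual implementation.

```go
package main

import "fmt"

// boundedSet admits at most `limit` distinct label values. Values seen
// before the limit was reached stay valid; new ones are counted as overflow.
type boundedSet struct {
	limit    int
	seen     map[string]struct{}
	overflow int
}

func newBoundedSet(limit int) *boundedSet {
	return &boundedSet{limit: limit, seen: make(map[string]struct{})}
}

// Admit reports whether value may be used as a metric label.
func (b *boundedSet) Admit(value string) bool {
	if _, ok := b.seen[value]; ok {
		return true
	}
	if len(b.seen) >= b.limit {
		b.overflow++ // would feed a counter such as web_path_overflow_total
		return false
	}
	b.seen[value] = struct{}{}
	return true
}

func main() {
	paths := newBoundedSet(2)
	for _, p := range []string{"/", "/about", "/", "/pricing"} {
		fmt.Println(p, paths.Admit(p))
	}
	fmt.Println("overflow:", paths.overflow)
}
```

With a limit of 2, `/` and `/about` are admitted, the repeated `/` is still
accepted, and `/pricing` is rejected and counted once as overflow. The sketch is
not safe for concurrent use without additional locking.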
|
||||
|
||||
**GDPR/CCPA Compliance:**
|
||||
|
||||
- No personal data stored
|
||||
- Daily salt rotation prevents cross-day correlation
|
||||
- Aggregate-only analytics
|
||||
|
||||
## Contributing
|
||||
|
||||
Watchdog was built in a very short time to address a very specific need:
replace Plausible, and replace it as soon as possible. While I've given the
codebase an appropriate amount of care, there may be unintended bugs or missing
features that you'd like to see supported. In that case, you are very welcome to
create issues or PRs.
|
||||
|
||||
### Building
|
||||
|
||||
The recommended way of developing Watchdog is using the Nix flake, for a
reproducible toolchain. The default shell already provides everything you need,
so you can simply run `nix develop`, or `direnv allow` if you use Direnv.
|
||||
|
||||
Once you have the dependencies, the workflow is relatively simple. Build and
|
||||
test with Go, and then with Nix to ensure packaging is correct.
|
||||
|
||||
```bash
|
||||
# Build binary
|
||||
$ go build -o watchdog .
|
||||
|
||||
# Run tests
|
||||
$ go test ./...
|
||||
|
||||
# Run integration tests
|
||||
$ go test -tags=integration ./test/...
|
||||
|
||||
# Build with Nix
|
||||
$ nix build
|
||||
```
|
||||
|
||||
### Testing
|
||||
|
||||
There's a non-negligible test suite for Watchdog that I'm quite proud of. It
tests everything from configuration validation to path normalization and other
security features that we'd _rather not regress_. If you are working on
Watchdog, please add test cases and run the tests before submitting your
changes.
|
||||
|
||||
```bash
|
||||
# Unit tests
|
||||
$ go test ./...
|
||||
|
||||
# Integration tests (requires build tag)
|
||||
$ go test -tags=integration ./test/...
|
||||
|
||||
# Benchmarks
|
||||
$ go test -bench=. ./test/...
|
||||
|
||||
# Coverage
|
||||
$ go test -cover ./...
|
||||
```
|
||||
|