docs: add project README
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I4aa3f5ba0d5f4967e52ce96514312e776a6a6964
This commit is contained in:
parent
f8db097ba9
commit
978cddb862
1 changed files with 464 additions and 0 deletions
464
README.md
Normal file
# Troutbot

Troutbot is the ultimate solution to protecting the trout population. It's
environmental protection incarnate!

Well, in reality, it's a GitHub webhook bot that analyzes issues and pull
requests using real signals such as CI check results, diff quality, and body
structure, and then posts trout-themed comments about the findings. Now you know
whether your changes hurt or help the trout population.

## Quick Start

```bash
# Install dependencies
npm install

# Populate the environment config
cp .env.example .env

# Set up application config
cp config.example.ts config.ts

# Edit .env and config.ts, then build and start:
npm run build && npm start
```

## How It Works

Troutbot has three analysis backends that run against each incoming webhook
event. They are the primary decision-making logic behind whether your changes
affect the trout population negatively or positively.

### `checks`

Queries the GitHub Checks API for the PR's head commit. Looks at check run
conclusions (ESLint, Clippy, Jest, cargo test, GitHub Actions, etc.) and scores
based on pass/fail ratio. Any CI failure is a negative signal. Requires a
`GITHUB_TOKEN`.

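As a rough illustration of pass/fail-ratio scoring, here is a minimal sketch. The `CheckRun` and `BackendResult` shapes, the `scoreChecks` name, and the confidence formula are assumptions made for this example, not Troutbot's actual implementation.

```typescript
type Impact = "positive" | "negative" | "neutral";

// Hypothetical, trimmed-down view of a check run.
interface CheckRun {
  name: string;
  // null means the run has not finished yet.
  conclusion: "success" | "failure" | "neutral" | "skipped" | null;
}

interface BackendResult {
  impact: Impact;
  confidence: number; // 0..1
}

function scoreChecks(runs: CheckRun[]): BackendResult {
  // Only runs that clearly passed or failed count toward the ratio.
  const finished = runs.filter(
    (r) => r.conclusion === "success" || r.conclusion === "failure",
  );
  if (finished.length === 0) {
    // Nothing to judge yet: zero confidence.
    return { impact: "neutral", confidence: 0 };
  }
  const failed = finished.filter((r) => r.conclusion === "failure").length;
  if (failed > 0) {
    // Any failure is negative; confidence grows with the failure ratio
    // (assumed formula).
    return {
      impact: "negative",
      confidence: 0.5 + 0.5 * (failed / finished.length),
    };
  }
  return { impact: "positive", confidence: 1 };
}
```

In this sketch, a PR with no finished check runs yields zero confidence, which matches the case where CI has not reported yet.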
### `diff`

Fetches the PR's changed files via the GitHub API. Evaluates:

- **Size**: Small PRs (< 200 lines) are positive; large PRs (above `maxChanges`)
  are negative
- **Focus**: Few files changed is positive; 30+ files is negative
- **Tests**: Presence of test file changes is positive; absence when
  `requireTests` is set is negative
- **Net deletion**: Removing more code than you add is positive. Less code is
  more good.

Requires a `GITHUB_TOKEN`.

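The four signals above could be tallied roughly as follows. This is only a sketch: the `ChangedFile` shape, the `FEW_FILES` cutoff, the test-file pattern, and the one-point-per-signal tally are all assumptions, since the README does not spell them out.

```typescript
// Hypothetical, trimmed-down view of a changed file in a PR.
interface ChangedFile {
  filename: string;
  additions: number;
  deletions: number;
}

interface DiffOptions {
  maxChanges: number;
  requireTests: boolean;
}

// Assumed cutoff for "few files"; the README only fixes the 30+ bound.
const FEW_FILES = 5;
// Assumed heuristic for recognizing test files.
const TEST_FILE = /(^|\/)(test|tests|__tests__)\/|\.(test|spec)\.[cm]?[jt]sx?$/;

function diffSignals(files: ChangedFile[], opts: DiffOptions): number {
  const additions = files.reduce((n, f) => n + f.additions, 0);
  const deletions = files.reduce((n, f) => n + f.deletions, 0);
  const total = additions + deletions;
  let score = 0;

  // Size: small is good, above maxChanges is bad.
  if (total < 200) score += 1;
  else if (total > opts.maxChanges) score -= 1;

  // Focus: few files is good, 30 or more is bad.
  if (files.length <= FEW_FILES) score += 1;
  else if (files.length >= 30) score -= 1;

  // Tests: touching tests is good; missing tests only hurt when required.
  if (files.some((f) => TEST_FILE.test(f.filename))) score += 1;
  else if (opts.requireTests) score -= 1;

  // Net deletion: removing more than you add is good.
  if (deletions > additions) score += 1;

  return score;
}
```

A positive total would map to a positive impact and a negative total to a negative one; how the real backend turns these signals into a confidence value is not described here.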
### `quality`

Pure text analysis of the issue/PR description. No API calls needed. Checks for:

- **Issues**: Adequate description length, code blocks, reproduction steps,
  expected/actual behavior sections, environment info
- **PRs**: Description length, linked issues (`Fixes #123`), test plan sections,
  code blocks
- **Both**: Markdown structure/headers, references to other issues, screenshots

Empty or minimal descriptions are flagged as negative.

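A few of these text checks can be approximated with plain string and regex tests. The sketch below is illustrative only: the `analyzeBody` name, the `QualitySignals` shape, and the exact patterns are assumptions, and the default of 50 mirrors the `minBodyLength` value in the example config.

```typescript
interface QualitySignals {
  longEnough: boolean;
  hasCodeBlock: boolean;
  linksIssue: boolean;
  hasHeaders: boolean;
}

function analyzeBody(body: string, minBodyLength = 50): QualitySignals {
  const text = body.trim();
  return {
    // Empty or minimal descriptions fail this check.
    longEnough: text.length >= minBodyLength,
    // A fenced code block marker anywhere in the body.
    hasCodeBlock: text.includes("```"),
    // Closing keywords such as "Fixes #123" or "closes #7".
    linksIssue: /\b(fix(es|ed)?|close[sd]?|resolve[sd]?)\s+#\d+/i.test(text),
    // At least one Markdown ATX header line.
    hasHeaders: /^#{1,6}\s+\S/m.test(text),
  };
}
```

Each flag would then feed into the positive or negative tally for the description.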
### Combining Results

Each backend returns an impact (`positive` / `negative` / `neutral`) and a
confidence score. The engine combines them using configurable weights (default:
checks 0.4, diff 0.3, quality 0.3). Backends that return zero confidence (e.g.,
no CI checks found yet) are excluded from the average. If combined confidence
falls below `confidenceThreshold`, the result is forced to neutral.

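One way to read the combination rule is as a weighted average over the backends that reported non-zero confidence. The sketch below follows that reading; the `combine` function, its types, and the signed-score approach are assumptions for illustration, not the engine's actual code.

```typescript
type Impact = "positive" | "negative" | "neutral";

interface BackendResult {
  impact: Impact;
  confidence: number; // 0..1
}

function combine(
  results: Record<string, BackendResult>,
  weights: Record<string, number>,
  confidenceThreshold: number,
): BackendResult {
  let weightSum = 0;
  let signedSum = 0;
  let confidenceSum = 0;

  for (const [name, result] of Object.entries(results)) {
    // Zero-confidence backends are excluded from the average.
    if (result.confidence === 0) continue;
    const weight = weights[name] ?? 0;
    const sign =
      result.impact === "positive" ? 1 : result.impact === "negative" ? -1 : 0;
    weightSum += weight;
    signedSum += weight * sign * result.confidence;
    confidenceSum += weight * result.confidence;
  }

  if (weightSum === 0) return { impact: "neutral", confidence: 0 };

  const score = signedSum / weightSum;
  const confidence = confidenceSum / weightSum;

  // Low combined confidence (or a perfect tie) is forced to neutral.
  if (confidence < confidenceThreshold || score === 0) {
    return { impact: "neutral", confidence };
  }
  return { impact: score > 0 ? "positive" : "negative", confidence };
}
```

With the default weights, a `checks` result of zero confidence, a positive `diff` at 0.8, and a negative `quality` at 0.4 would come out positive with a combined confidence of about 0.6 under this sketch, because `checks` drops out and the remaining weights are renormalized.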
## GitHub Account & Token Setup

Troutbot is designed to run as a dedicated bot account on GitHub. Create a
separate GitHub account for the bot (e.g., `troutbot`) so that comments are
clearly attributed to it rather than to a personal account.

### 1. Create the bot account

Sign up for a new GitHub account at <https://github.com/signup>. Use a dedicated
email address for the bot. Give it a recognizable username and avatar.

### 2. Grant repository access

The bot account needs access to every repository it will comment on:

- **For organization repos**: Invite the bot account as a collaborator with
  **Write** access, or add it to a team with write permissions.
- **For personal repos**: Add the bot account as a collaborator under
  **Settings > Collaborators**.

The bot needs write access to post comments. Read access alone is not enough.

### 3. Generate a Personal Access Token

Log in as the bot account and create a fine-grained PAT:

1. Go to **Settings > Developer settings > Personal access tokens > Fine-grained
   tokens**
2. Click **Generate new token**
3. Set a descriptive name (e.g., `troutbot-webhook`)
4. Set **Expiration** - pick a long-lived duration or no expiration, since this
   runs unattended
5. Under **Repository access**, select the specific repositories the bot will
   operate on (or **All repositories** if it should cover everything the account
   can see)
6. Under **Permissions > Repository permissions**, grant:
   - **Checks**: Read (for the `checks` backend to query CI results)
   - **Contents**: Read (for the `diff` backend to fetch changed files)
   - **Issues**: Read and Write (to read issue bodies and post comments)
   - **Pull requests**: Read and Write (to read PR bodies and post comments)
   - **Metadata**: Read (required by all fine-grained tokens)
7. Click **Generate token** and copy the value

Set this as the `GITHUB_TOKEN` environment variable.

> **Classic tokens**: If you prefer a classic PAT instead, create one with the
> `repo` scope. Fine-grained tokens are recommended because they follow the
> principle of least privilege.

### 4. Generate a webhook secret

Generate a random secret to verify webhook payloads:

```bash
openssl rand -hex 32
```

Set this as the `WEBHOOK_SECRET` environment variable, and use the same value
when configuring the webhook in GitHub (see
[GitHub Webhook Setup](#github-webhook-setup)).

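For context on what the secret is used for: GitHub signs each delivery with an HMAC-SHA256 of the raw request body and sends it in the `X-Hub-Signature-256` header as `sha256=<hex>`. A receiver can check it along the lines of the sketch below. The `verifySignature` helper is hypothetical and is not taken from Troutbot's source.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: returns true only when the header matches the
// HMAC-SHA256 of the raw body computed with the shared secret.
function verifySignature(
  secret: string,
  rawBody: string,
  signatureHeader: string | undefined,
): boolean {
  if (!signatureHeader) return false;

  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody).digest("hex");

  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(signatureHeader, "utf8");

  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

The comparison must run against the raw body bytes as received, before any JSON parsing and re-serialization, otherwise the digest will not match.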
## Configuration

### Environment Variables

<!--markdownlint-disable MD013 -->

| Variable         | Description                                           | Required                     |
| ---------------- | ----------------------------------------------------- | ---------------------------- |
| `GITHUB_TOKEN`   | Fine-grained PAT from the bot account (see above)     | No (dry-run without it)      |
| `WEBHOOK_SECRET` | Secret for verifying webhook signatures               | No (skips verification)      |
| `PORT`           | Server port (overrides `server.port` in config)       | No                           |
| `CONFIG_PATH`    | Path to config file                                   | No (defaults to `config.ts`) |
| `LOG_LEVEL`      | Log level override (`debug`, `info`, `warn`, `error`) | No                           |

<!--markdownlint-enable MD013 -->

### Config File

Copy `config.example.ts` to `config.ts`. The config is a TypeScript module that
default-exports a `Config` object, so you get full type checking and
autocompletion in your editor.

```typescript
import type { Config } from "./src/types";

const config: Config = {
  server: { port: 3000 },
  engine: {
    backends: {
      checks: { enabled: true },
      diff: { enabled: true, maxChanges: 1000, requireTests: false },
      quality: { enabled: true, minBodyLength: 50 },
    },
    weights: { checks: 0.4, diff: 0.3, quality: 0.3 },
    confidenceThreshold: 0.1,
  },
  // ...
};

export default config;
```

The config is loaded at runtime via [jiti](https://github.com/unjs/jiti) - no
pre-compilation needed.

See `config.example.ts` for the full annotated reference.

## GitHub Webhook Setup

1. Go to your repository's **Settings > Webhooks > Add webhook**
2. **Payload URL**: `https://your-host/webhook`
3. **Content type**: `application/json`
4. **Secret**: Must match your `WEBHOOK_SECRET` env var
5. **Events**: Select **Issues**, **Pull requests**, and optionally **Check
   suites** (for re-analysis when CI finishes)

If you enable **Check suites** and set `response.allowUpdates: true` in your
config, troutbot will update its comment on a PR once CI results are available.

## Production Configuration

When deploying troutbot to production, keep the following in mind:

- **`WEBHOOK_SECRET` is strongly recommended.** Without it, anyone who can reach
  the `/webhook` endpoint can trigger analysis and post comments. Always set a
  secret and configure the same value in your GitHub webhook settings.
- **Use a reverse proxy with TLS.** GitHub sends webhook payloads over HTTPS.
  Put nginx, Caddy, or a cloud load balancer in front of troutbot and terminate
  TLS there.
- **Set `NODE_ENV=production`.** This is set automatically in the Docker image.
  For standalone deployments, export it in your environment. Express uses this
  to enable performance optimizations.
- **Rate limiting** is enabled by default at 120 requests/minute on the
  `/webhook` endpoint. Override via `server.rateLimit` in your config file.
- **Request body size** is capped at 1 MB. GitHub webhook payloads are well
  under this limit.
- **Graceful shutdown** is built in. The server handles `SIGTERM` and `SIGINT`,
  stops accepting new connections, and waits up to 10 seconds for in-flight
  requests to finish before exiting.
- **Dashboard access control.** The `/dashboard` and `/api/*` endpoints have no
  built-in authentication. Restrict access via reverse proxy rules, firewall, or
  binding to localhost. See [Securing the Dashboard](#securing-the-dashboard).

## Deployment

<details>
<summary>Standalone (Node.js)</summary>

```bash
npm ci
npm run build
export NODE_ENV=production
export GITHUB_TOKEN="ghp_..."
export WEBHOOK_SECRET="your-secret"
npm start
```

</details>

<details>
<summary>Docker</summary>

```bash
docker build -t troutbot .
docker run -d \
  --name troutbot \
  -p 127.0.0.1:3000:3000 \
  -e GITHUB_TOKEN="ghp_..." \
  -e WEBHOOK_SECRET="your-secret" \
  -v "$(pwd)/config.ts:/app/config.ts:ro" \
  --restart unless-stopped \
  troutbot
```

The image uses a multi-stage build, runs as a non-root user, and includes a
built-in health check and `STOPSIGNAL SIGTERM`.

</details>

<details>
<summary>Docker Compose</summary>

```yaml
services:
  troutbot:
    build: .
    ports:
      - "127.0.0.1:3000:3000"
    env_file: .env
    volumes:
      - ./config.ts:/app/config.ts:ro
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 256M
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
```

</details>

<details>
<summary>systemd</summary>

Create `/etc/systemd/system/troutbot.service`:

```ini
[Unit]
Description=Troutbot GitHub Webhook Bot
After=network.target

[Service]
Type=simple
User=troutbot
WorkingDirectory=/opt/troutbot
ExecStart=/usr/bin/node dist/index.js
EnvironmentFile=/opt/troutbot/.env
Restart=on-failure
RestartSec=5
TimeoutStopSec=15
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/opt/troutbot
PrivateTmp=true

[Install]
WantedBy=multi-user.target
```

```bash
sudo systemctl daemon-reload
sudo systemctl enable --now troutbot
```

</details>

<details>
<summary>Reverse Proxy (nginx)</summary>

```nginx
server {
  listen 443 ssl;
  server_name troutbot.example.com;

  ssl_certificate /etc/letsencrypt/live/troutbot.example.com/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/troutbot.example.com/privkey.pem;

  client_max_body_size 1m;
  proxy_read_timeout 60s;

  location / {
    proxy_pass http://127.0.0.1:3000;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
  }

  # Optional: nginx-level rate limiting
  # Note: limit_req_zone must be declared in the http context, outside this
  # server block; only the location block below belongs here.
  # limit_req_zone $binary_remote_addr zone=webhook:10m rate=10r/s;
  # location /webhook {
  #   limit_req zone=webhook burst=20 nodelay;
  #   proxy_pass http://127.0.0.1:3000;
  # }
}
```

</details>

## API Endpoints

| Method   | Path          | Description                                                                              |
| -------- | ------------- | ---------------------------------------------------------------------------------------- |
| `GET`    | `/health`     | Health check - returns `status`, `uptime` (seconds), `version`, `dryRun`, and `backends` |
| `POST`   | `/webhook`    | GitHub webhook receiver (rate limited)                                                   |
| `GET`    | `/dashboard`  | Web UI dashboard with status, events, and config editor                                  |
| `GET`    | `/api/status` | JSON status: uptime, version, dry-run, backends, repo count                              |
| `GET`    | `/api/events` | Recent webhook events from the in-memory ring buffer                                     |
| `DELETE` | `/api/events` | Clear the event ring buffer                                                              |
| `GET`    | `/api/config` | Current runtime configuration as JSON                                                    |
| `PUT`    | `/api/config` | Partial config update: deep-merges, validates, and applies in-place                      |

## Dashboard & Runtime API

Troutbot ships with a built-in web dashboard and JSON API for monitoring and
runtime configuration. No separate frontend build is required.

### Web Dashboard

Navigate to `http://localhost:3000/dashboard` (or wherever your instance is
running). The dashboard provides:

- **Status card** - uptime, version, dry-run state, active backends, and repo
  count. Auto-refreshes every 30 seconds.
- **Event log** - table of recent webhook events showing repo, PR/issue number,
  action, impact rating, and confidence score. Keeps the last 100 events in
  memory.
- **Config editor** - read-only JSON view of the current runtime config with an
  "Edit" toggle that lets you modify and save changes without restarting.

The dashboard is a single HTML page with inline CSS and vanilla JS - no
frameworks, no build step, no external assets.

### Runtime Config API

You can inspect and modify the running configuration via the REST API. Changes
are applied in-place without restarting the server. The update endpoint
deep-merges your partial config onto the current one and validates before
applying.

```bash
# Read current config
curl http://localhost:3000/api/config

# Update a single setting (partial merge)
curl -X PUT http://localhost:3000/api/config \
  -H 'Content-Type: application/json' \
  -d '{"response": {"allowUpdates": true}}'

# Change engine weights at runtime
curl -X PUT http://localhost:3000/api/config \
  -H 'Content-Type: application/json' \
  -d '{"engine": {"weights": {"checks": 0.5, "diff": 0.25, "quality": 0.25}}}'
```

Invalid configs are rejected with a 400 status and an error message. The
original config remains unchanged if validation fails.

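To make "deep-merges your partial config" concrete, here is a minimal sketch of a non-mutating deep merge. The `deepMerge` and `isPlainObject` helpers are assumptions for illustration; Troutbot's own merge and validation logic may differ (for example in how it treats arrays).

```typescript
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns a new object; neither `base` nor `patch` is modified, so a
// failed validation afterwards can simply discard the result.
function deepMerge<T extends Record<string, unknown>>(
  base: T,
  patch: Record<string, unknown>,
): T {
  const out: Record<string, unknown> = { ...base };
  for (const [key, value] of Object.entries(patch)) {
    const current = out[key];
    // Recurse into nested objects; anything else (including arrays)
    // replaces the existing value wholesale.
    out[key] =
      isPlainObject(current) && isPlainObject(value)
        ? deepMerge(current, value)
        : value;
  }
  return out as T;
}
```

Because the merge builds a fresh object, the pattern "merge, validate, then swap in the result" leaves the running config untouched whenever validation rejects the candidate.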
### Event Buffer API

The event buffer stores the last 100 processed webhook events in memory. Events
are lost on restart.

```bash
# List recent events
curl http://localhost:3000/api/events

# Clear the buffer
curl -X DELETE http://localhost:3000/api/events
```

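A fixed-capacity buffer like this can be sketched in a few lines. The `RingBuffer` class below is a simplified illustration (it uses an array with `shift()` rather than a true circular index) and is not Troutbot's actual implementation.

```typescript
class RingBuffer<T> {
  private items: T[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Append an item, evicting the oldest one once capacity is exceeded.
  push(item: T): void {
    this.items.push(item);
    if (this.items.length > this.capacity) {
      this.items.shift();
    }
  }

  // Snapshot of the contents, oldest first.
  toArray(): T[] {
    return [...this.items];
  }

  clear(): void {
    this.items = [];
  }
}
```

A buffer created with `new RingBuffer<unknown>(100)` would then hold at most the 100 most recent events, and `clear()` corresponds to the `DELETE` call above.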
### Securing the Dashboard

The dashboard and API endpoints have no authentication by default. In
production, restrict access using one of:

- **Reverse proxy rules** - limit `/dashboard` and `/api/*` to internal IPs or
  require basic auth at the nginx/Caddy layer
- **Firewall rules** - only expose port 3000 to trusted networks
- **Bind to localhost** - make the port set in `server.port` reachable only on
  `127.0.0.1` (the Docker examples already do this), then access via SSH tunnel
  or VPN

Do not expose the dashboard to the public internet without authentication, as
the config API allows modifying runtime behavior.

## Dry-Run Mode

Without a `GITHUB_TOKEN`, the bot runs in dry-run mode. The quality backend
still works (text analysis), but the checks and diff backends return neutral
(they need API access). Comments are logged instead of posted.

## Customizing Messages

Edit `response.messages` in your config. Each impact category takes an array of
strings. One is picked randomly per event.

```typescript
messages: {
  positive: [
    "The trout approve of this {type}!",
    "Upstream looks clear for this {type}.",
  ],
  negative: [
    "The trout are worried about this {type}.",
  ],
  neutral: [
    "The trout have no opinion on this {type}.",
  ],
},
```

Placeholders:

- `{type}` - `issue` or `pull request`
- `{impact}` - `positive`, `negative`, or `neutral`

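Message selection and placeholder substitution might look roughly like the sketch below. The `renderMessage` function and its injectable `random` parameter are assumptions for illustration; only the `{type}` and `{impact}` placeholders come from the list above.

```typescript
type Impact = "positive" | "negative" | "neutral";
type Messages = Record<Impact, string[]>;

function renderMessage(
  messages: Messages,
  impact: Impact,
  type: "issue" | "pull request",
  // Injectable so the random pick can be made deterministic in tests.
  random: () => number = Math.random,
): string {
  const pool = messages[impact];
  // Pick one template at random; fall back to an empty string if none exist.
  const template = pool[Math.floor(random() * pool.length)] ?? "";
  return template
    .replace(/\{type\}/g, type)
    .replace(/\{impact\}/g, impact);
}
```

Passing a fixed `random` function, such as `() => 0`, always selects the first template, which is handy when checking a new message set.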