API Documentation
Everything you need to integrate with labwatch. The API uses JSON for request/response bodies and standard HTTP status codes.
Overview
The labwatch API is organized around REST. All endpoints accept and return JSON (except where noted). The base URL for all API requests is your labwatch server instance.
Rate limits: the signup endpoint is limited to 5 requests per hour per IP. Agent ingest frequency is configured in the agent (default: 60 seconds).
Running your own server? See Self-Hosted labwatch for the docker-compose quickstart, environment reference, and agent config pointing at a local instance.
Authentication
labwatch uses three authentication methods, depending on the endpoint:
Agent Bearer Token
Agents authenticate with a Bearer token received during registration. Include it in the Authorization header:
Authorization: Bearer <token>
Admin Secret
Admin endpoints require the X-Admin-Secret header. This is set during server setup and gives full access to all data and configuration. You can also pass it as a ?secret= query parameter on GET endpoints where convenient.
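The two header-based schemes above can be sketched in Python as follows. This is a minimal illustration, not labwatch client code: the function names are hypothetical, and only the header names and the ?secret= parameter come from this page.

```python
from urllib.parse import urlencode


def agent_headers(token: str) -> dict:
    """Headers for agent endpoints: Bearer token received during registration."""
    return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}


def admin_headers(secret: str) -> dict:
    """Headers for admin endpoints: the X-Admin-Secret set during server setup."""
    return {"X-Admin-Secret": secret}


def admin_get_url(base_url: str, path: str, secret: str, **params) -> str:
    """Build a GET URL that carries the admin secret as ?secret= instead of a header."""
    query = urlencode({**params, "secret": secret})
    return f"{base_url.rstrip('/')}{path}?{query}"
```

Secrets placed in URLs tend to end up in access logs and shell history, so the header form is usually the safer default when both are available.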
Session Cookie
Endpoints under /api/v1/my/* are used by the dashboard UI and authenticate with a signed session cookie set by POST /login. They are not intended for server-to-server use — prefer the admin-secret endpoints for integrations.
MCP Token (roadmap)
MCP integration is on the roadmap. See the MCP section for details.
Quick Start
Get monitoring running in under 2 minutes. The signup response includes your API token and a config snippet — paste them into /etc/labwatch/config.yaml and start the service.
1. Sign up for a free account
curl -X POST https://labwatch.zazastation.duckdns.org/api/v1/signup \
-H 'Content-Type: application/json' \
-d '{"email": "you@example.com", "hostname": "my-server", "accept_terms": true}'
The response includes a lab_id, a bearer token, and a ready-to-run install_command.
2. Install the agent
Copy the install_command from the signup response and run it on the machine you want to monitor:
curl -fsSL https://labwatch.zazastation.duckdns.org/install.sh | sudo bash
Then paste the config_snippet into /etc/labwatch/config.yaml and start the service:
sudo systemctl enable --now labwatch
3. That's it
Metrics start flowing within 60 seconds. Open your dashboard at https://labwatch.zazastation.duckdns.org/my/dashboard to see them.
To uninstall, run:
curl -fsSL https://labwatch.zazastation.duckdns.org/install.sh | sudo bash -s uninstall
This removes the agent, service unit, and config directory.
Signup
Create a new free-tier account. Returns credentials for the admin dashboard and a pre-registered first node.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| email | string | required | Your email address |
| hostname | string | optional | Name for your first node (default: "my-server") |
| password | string | optional | Set a password to enable web login (min 8 chars). Without it you can still query via the token. |
| accept_terms | boolean | required | Must be true to accept the terms of service. |
Example
curl -X POST https://your-server/api/v1/signup \
-H 'Content-Type: application/json' \
-d '{"email": "admin@lab.local", "hostname": "proxmox-01", "accept_terms": true}'
Response 200
{
"lab_id": "31738bc8-df1e-46f9-9c83-d33f0b16eda9",
"token": "6ddf02b7418585cb97b6829d368dfcd6aca04bcc...",
"install_command": "curl -fsSL https://your-server/install.sh | sudo bash",
"config_snippet": "api_endpoint: \"https://your-server/api/v1\"\ntoken: \"...\"\nlab_id: \"...\"\ninterval: 60s\ndocker:\n enabled: true\n socket: /var/run/docker.sock",
"next_steps": [
"Run: curl -fsSL https://your-server/install.sh | sudo bash",
"Save config to /etc/labwatch/config.yaml",
"Run: sudo systemctl enable --now labwatch",
"View your dashboard at https://your-server/my/dashboard"
]
}
Errors
| Code | Meaning |
|---|---|
| 400 | Invalid email, short password, or free-tier cap reached (3 nodes) |
| 429 | Too many signups from this IP — rate limited |
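A hedged Python sketch of the signup call, built from the request fields and error codes listed in this section. The function names are hypothetical; the request is only constructed here, not sent, so nothing below depends on a live server.

```python
import json
import urllib.request

# Meanings copied from the error table above.
SIGNUP_ERRORS = {
    400: "invalid email, short password, or free-tier cap reached",
    429: "rate limited: too many signups from this IP",
}


def build_signup_request(base_url, email, hostname="my-server", password=None):
    """Build (but do not send) a POST /api/v1/signup request."""
    body = {"email": email, "hostname": hostname, "accept_terms": True}
    if password is not None:
        # The docs state a minimum of 8 characters; check before hitting the server.
        if len(password) < 8:
            raise ValueError("password must be at least 8 characters")
        body["password"] = password
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v1/signup",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def describe_signup_error(status):
    """Map an HTTP status from the signup endpoint to a readable message."""
    return SIGNUP_ERRORS.get(status, f"unexpected HTTP status {status}")
```

To send it, pass the request to urllib.request.urlopen and catch urllib.error.HTTPError; the exception's code attribute can be fed to describe_signup_error. Given the limit of 5 signups per hour per IP, retrying a 429 immediately is unlikely to help.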
Register Agent
Register a new node for monitoring. The agent calls this on first startup to get its credentials.
Headers
| Header | Value |
|---|---|
| X-Admin-Secret | Your admin secret |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| hostname | string | required | Hostname of the node |
| os | string | optional | Operating system (e.g., "linux") |
| arch | string | optional | Architecture (e.g., "amd64", "arm64") |
| agent_version | string | optional | Agent version string |
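As an illustration of how an agent might fill in these fields, the sketch below derives them from the local machine with Python's standard library. The helper name and the architecture alias table are assumptions; the registration path is not shown in this section, so only the body is built.

```python
import platform
import socket

# Assumed mapping from platform.machine() values to the arch strings used above.
ARCH_ALIASES = {"x86_64": "amd64", "AMD64": "amd64", "aarch64": "arm64", "arm64": "arm64"}


def build_register_payload(agent_version=None, hostname=None):
    """Build a registration body; only hostname is required by the API."""
    machine = platform.machine()
    payload = {
        "hostname": hostname or socket.gethostname(),
        "os": platform.system().lower(),  # e.g. "linux"
        "arch": ARCH_ALIASES.get(machine, machine),
    }
    if agent_version:
        payload["agent_version"] = agent_version
    return payload
```

The payload would be sent as JSON together with the X-Admin-Secret header listed above.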
Response 200
{
"lab_id": "lab_proxmox-01_x7k2",
"token": "lw_...",
"message": "Registered successfully"
}
Ingest Metrics
Submit a metrics snapshot from a monitoring agent. The agent sends this every 60 seconds (configurable). Metrics are stored and processed for alerts, digests, and queries.
Headers
| Header | Value |
|---|---|
| Authorization | Bearer <token> |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| lab_id | string | required | Your lab identifier |
| timestamp | string | optional | ISO 8601 timestamp (defaults to server time) |
| collectors | object | required | Collector data keyed by type |
Collector Types
The collectors object can contain the following keys:
| Key | Description |
|---|---|
| system | CPU, memory, disk, network, disk I/O (read/write bytes, IOPS), load, uptime, temperatures |
| docker | Container list with health, restarts, CPU/memory per container |
| services | HTTP/TCP health check results with response times |
| gpu | GPU utilization, memory, temperature per device |
| smart | S.M.A.R.T. disk health, temperature, reallocated sectors, power-on hours |
| zfs | ZFS pool health, capacity, fragmentation, scrub status, error counts |
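The sketch below shows one way to assemble an ingest body in Python from the fields and collector keys listed above. The function name is hypothetical, and rejecting unknown collector keys is a client-side assumption; the server may be more lenient.

```python
from datetime import datetime, timezone

# Collector keys listed in the table above.
KNOWN_COLLECTORS = {"system", "docker", "services", "gpu", "smart", "zfs"}


def build_ingest_payload(lab_id, collectors, when=None):
    """Build an ingest body; `when` must be a timezone-aware datetime if given."""
    unknown = set(collectors) - KNOWN_COLLECTORS
    if unknown:
        raise ValueError(f"unknown collector types: {sorted(unknown)}")
    payload = {
        "lab_id": lab_id,
        # Each collector's readings are wrapped under a "data" key.
        "collectors": {name: {"data": data} for name, data in collectors.items()},
    }
    if when is not None:
        # Optional: the server falls back to its own clock when this is omitted.
        payload["timestamp"] = when.astimezone(timezone.utc).isoformat()
    return payload
```

Serialize the result with json.dumps and send it with the agent's Bearer token.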
Example
curl -X POST https://your-server/api/v1/ingest \
-H 'Authorization: Bearer lw_abc123...' \
-H 'Content-Type: application/json' \
-d '{
"lab_id": "lab_proxmox-01_x7k2",
"collectors": {
"system": {
"data": {
"cpu": {"total_percent": 23.5, "count": 8},
"memory": {"used_percent": 62.1, "total_bytes": 34359738368},
"disk": [{"mountpoint": "/", "used_percent": 45.2}],
"load_average": {"load1": 1.2, "load5": 0.8, "load15": 0.6},
"uptime_seconds": 864000
}
},
"docker": {
"data": {
"containers": [
{"name": "caddy", "state": "running", "health": "healthy",
"cpu_percent": 0.5, "memory_mb": 32}
]
}
}
}
}'
Response 200
{
"status": "accepted",
"lab_id": "lab_proxmox-01_x7k2",
"stored_types": ["system", "docker"],
"alerts_generated": 0,
"alerts": []
}
Ingest Logs
Submit log entries collected by the agent (journald, Docker, file tails). Max 100 entries per request. Level filter and batching are handled agent-side.
Headers
| Header | Value |
|---|---|
| Authorization | Bearer <token> |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| entries | array | required | Array of log entry objects (max 100) |
Log Entry Fields
| Field | Type | Required | Description |
|---|---|---|---|
| ts | string | required | ISO 8601 timestamp |
| source | string | required | Log source: "journald", "docker:container_name", "file:/path" |
| level | string | required | debug, info, warn, or error |
| message | string | required | Log message (max 4096 chars) |
| unit | string | optional | systemd unit name (journald only) |
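Since level filtering and batching happen on the agent side, a client has to respect the 100-entry and 4096-character limits itself. The following Python sketch shows one possible approach. The function name is hypothetical, the level ordering (debug < info < warn < error) is the conventional one, and truncating over-long messages rather than dropping them is an assumption.

```python
MAX_ENTRIES = 100          # max entries per request
MAX_MESSAGE_CHARS = 4096   # max message length
LEVELS = ["debug", "info", "warn", "error"]  # assumed ascending severity


def batch_log_entries(entries, min_level="warn"):
    """Drop entries below min_level, truncate messages, split into request bodies."""
    threshold = LEVELS.index(min_level)
    kept = []
    for entry in entries:
        if LEVELS.index(entry["level"]) < threshold:
            continue
        kept.append({**entry, "message": entry["message"][:MAX_MESSAGE_CHARS]})
    # One {"entries": [...]} body per chunk of at most MAX_ENTRIES.
    return [{"entries": kept[i:i + MAX_ENTRIES]} for i in range(0, len(kept), MAX_ENTRIES)]
```

Each returned dict is one request body; 250 qualifying entries would yield three requests of 100, 100, and 50 entries.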
Example
curl -X POST https://your-server/api/v1/lab/lab_abc123/logs \
-H 'Authorization: Bearer lw_abc123...' \
-H 'Content-Type: application/json' \
-d '{
"entries": [
{"ts": "2026-04-29T16:00:00Z", "source": "journald", "level": "error",
"message": "OOM killer invoked", "unit": "nginx.service"},
{"ts": "2026-04-29T16:00:01Z", "source": "docker:postgres", "level": "warn",
"message": "checkpoint request too frequent"}
]
}'
Response
{"status": "accepted", "lab_id": "lab_abc123", "stored": 2}
Agent Configuration
Enable log collection in /etc/labwatch/config.yaml:
logs:
  enabled: true
  journald: true
  docker: true
  level_filter: warn # minimum level to ship
  max_lines_per_push: 100
Retention
| Plan | Retention |
|---|---|
| Free | 24 hours |
| Pro | 7 days |
| Business | 30 days |
Agent Status
Returns the current status of a monitored node, including latest metrics and active alerts.
Path Parameters
| Param | Type | Description |
|---|---|---|
| lab_id | string | The lab identifier |
Response 200
{
"lab_id": "lab_proxmox-01_x7k2",
"hostname": "proxmox-01",
"last_seen": "2026-03-19T22:30:00",
"cpu_percent": 23.5,
"memory_percent": 62.1,
"disk_percent": 45.2,
"container_count": 12,
"alerts": []
}
List Labs
Returns a list of all registered labs with their current online status and latest metrics summary.
Response 200
{
"labs": [
{
"lab_id": "lab_proxmox-01_x7k2",
"hostname": "proxmox-01",
"last_seen": "2026-03-19T22:30:00",
"online": true,
"cpu_percent": 23.5,
"memory_percent": 62.1,
"disk_percent": 45.2
}
]
}
Returns all labs with full agent metadata (OS, architecture, agent version, registration timestamp) in addition to the current metrics summary. Use this when you need provisioning-level detail; use /api/v1/admin/labs for a leaner response.
Response 200
{
"labs": [
{
"lab_id": "lab_proxmox-01_x7k2",
"hostname": "proxmox-01",
"os": "linux",
"arch": "amd64",
"agent_version": "0.4.2",
"registered_at": "2026-02-14T09:12:03",
"last_seen": "2026-03-19T22:30:00",
"online": true,
"cpu_percent": 23.5,
"memory_percent": 62.1,
"disk_percent": 45.2
}
],
"total": 1
}
Admin Dashboard
Returns enriched per-lab data plus a fleet-wide alert total — the snapshot used by the admin dashboard auto-refresh. Every lab includes its system/docker/GPU summary, online flag, and alert counts in a single call, so you don't have to loop over /api/v1/labs + /api/v1/status/{lab_id}.
Response 200
{
"labs": [
{
"id": "lab_proxmox-01_x7k2",
"hostname": "proxmox-01",
"online": true,
"cpu_percent": 23.5,
"memory_percent": 62.1,
"disk_percent": 45.2,
"container_count": 12,
"alert_count": 1,
"critical_count": 0
}
],
"total": 1,
"total_alerts": 1
}
Uptime Timeline
Returns online/offline segments for every lab the current session can see, over the requested window. Used to draw the uptime timeline widget on the dashboard. When called without a session it returns segments for all labs.
Query Parameters
| Param | Type | Default | Description |
|---|---|---|---|
| hours | int | 24 | Size of the timeline window |
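A client can reduce the segments this endpoint returns to a single uptime percentage. The Python sketch below is one way to do it. The function name is hypothetical, and two choices are assumptions: no-data segments are excluded from the denominator, and stale counts as not online.

```python
from datetime import datetime


def uptime_percent(segments):
    """Share of observed time spent online, or None if nothing was observed."""
    online = total = 0.0
    for seg in segments:
        if seg["status"] == "no-data":
            continue  # the lab did not exist yet, so this time is not counted
        start = datetime.fromisoformat(seg["start"])
        end = datetime.fromisoformat(seg["end"])
        seconds = (end - start).total_seconds()
        total += seconds
        if seg["status"] == "online":
            online += seconds
    return 100.0 * online / total if total else None
```

For a 24-hour window with a single 2-minute outage, this gives 1438 / 1440, or roughly 99.86%.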
Response 200
Keyed by lab_id. Each entry carries the display hostname plus a list of segments. Segment status is one of online, offline, stale, or no-data (the period before the lab was registered).
{
"lab_proxmox-01_x7k2": {
"hostname": "proxmox-01",
"real_hostname": "proxmox-01",
"segments": [
{"start": "2026-03-18T22:00:00", "end": "2026-03-19T03:12:00", "status": "online"},
{"start": "2026-03-19T03:12:00", "end": "2026-03-19T03:14:00", "status": "offline"},
{"start": "2026-03-19T03:14:00", "end": "2026-03-19T22:00:00", "status": "online"}
]
}
}
Alert Feed
Returns the most recent alerts across all labs the current session owns. Powers the alerts rollup widget on the dashboard.
Query Parameters
| Param | Type | Default | Description |
|---|---|---|---|
| limit | int | 20 | Max number of alerts to return |
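As a small illustration of consuming the feed, the Python sketch below counts unresolved alerts per severity. The function name is hypothetical, and treating a null resolved_at as "still active" is an inference from the response shape rather than documented behavior.

```python
from collections import Counter


def summarize_alerts(alerts):
    """Count alerts with no resolved_at timestamp, grouped by severity."""
    active = [a for a in alerts if a.get("resolved_at") is None]
    return Counter(a["severity"] for a in active)
```

A rollup widget could show these counts directly, for example as "2 warning, 0 critical".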
Response 200
[
{
"id": 42,
"lab_id": "lab_proxmox-01_x7k2",
"hostname": "proxmox-01",
"severity": "warning",
"alert_type": "disk_high",
"message": "/ is at 82%",
"data": {},
"created_at": "2026-03-19T22:11:04",
"resolved_at": null
}
]
Metric Sparkline
Returns a compact time series for a single lab and metric — just enough points to draw an inline sparkline.
Path Parameters
| Param | Type | Description |
|---|---|---|
| lab_id | string | Lab identifier |
| metric | string | One of cpu, memory, disk |
Query Parameters
| Param | Type | Default | Description |
|---|---|---|---|
| hours | int | 1 | Window size |
Response 200
Returns a downsampled list of at most ~30 points across the window.
[
{"timestamp": "2026-03-19T21:00:00", "value": 12.4},
{"timestamp": "2026-03-19T21:02:00", "value": 13.1},
{"timestamp": "2026-03-19T21:04:00", "value": 12.8}
]
Errors
| Code | Meaning |
|---|---|
| 400 | Metric is not one of cpu/memory/disk |
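The points returned by this endpoint are meant for drawing, so here is a hedged Python sketch that renders them as a one-line text sparkline. It is a client-side illustration only: the function name is hypothetical and the dashboard's own rendering is not described here.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def render_sparkline(points):
    """Map each point's value onto one of eight block characters."""
    values = [p["value"] for p in points]
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return BLOCKS[0] * len(values)  # flat series: draw a flat line
    # Scale to 0..7 and round to the nearest block index.
    return "".join(BLOCKS[round((v - lo) / span * (len(BLOCKS) - 1))] for v in values)
```

The scale is relative to the minimum and maximum within the window, so a small wobble in an otherwise steady metric will still fill the full height.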
Lab History
Returns time-series metrics for a specific lab, suitable for charting. Default window is 24 hours.
Query Parameters
| Param | Type | Default | Description |
|---|---|---|---|
| hours | int | 24 | Number of hours of history to return |
| secret | string | — | Admin secret (alternative to header) |
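The history response is column-oriented: one array per metric, with a shared timestamps array. Charting libraries often want one record per timestamp instead, so the Python sketch below transposes it. The function name is hypothetical, and it assumes the arrays are index-aligned with timestamps. The GPU series carry their own gpu_timestamps and are left out.

```python
def history_rows(history, fields=("cpu", "memory", "disk", "load", "net_rx", "net_tx")):
    """Zip the parallel metric arrays into one dict per timestamp."""
    columns = [history[f] for f in fields]
    return [
        {"timestamp": ts, **dict(zip(fields, values))}
        for ts, *values in zip(history["timestamps"], *columns)
    ]
```

Because zip stops at the shortest input, a truncated array shortens the output rather than raising an error.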
Response 200
{
"timestamps": ["2026-03-19T20:00:00", "2026-03-19T21:00:00"],
"cpu": [23.5, 18.2],
"memory": [62.1, 58.7],
"disk": [45.2, 45.2],
"load": [1.2, 0.8],
"net_rx": [12.5, 8.3],
"net_tx": [4.2, 2.1],
"gpu_timestamps": ["..."],
"gpu_utilization": ["..."],
"gpu_memory": ["..."],
"gpu_temperature": ["..."]
}
Export Data
Exports all stored metrics for a lab as a downloadable JSON file. Useful for backups, migrations, or external analysis.
Response 200
Returns a JSON file download with all metrics, alerts, and configuration for the specified lab.
Prometheus Export
Returns all lab metrics in Prometheus text format. Point your Prometheus scraper at this endpoint to visualize labwatch data in Grafana.
Accepts authentication via X-Admin-Secret header or standard Authorization: Bearer <secret> (for Prometheus scraper compatibility).
Exported Metrics
labwatch_cpu_percent: CPU usage per node
labwatch_memory_percent: memory usage per node
labwatch_disk_percent: disk usage per node
labwatch_uptime_seconds: node uptime
labwatch_containers_total: total containers per node
labwatch_containers_running: running containers per node
labwatch_alerts_active: active alert count per node
labwatch_node_online: 1 if online, 0 if offline
All metrics are labeled with lab_id and hostname.
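For quick checks without a Prometheus server, the export can be read directly. The Python sketch below is a simplified parser for the text exposition format. It is an assumption-laden illustration: the function name is hypothetical, it ignores optional timestamps and does not unescape label values, and the exact lines the server emits are not shown on this page.

```python
import re

# metric_name{label="value",...} value
SAMPLE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return (name, labels, value) tuples, skipping comments and blank lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE comments and blank lines carry no samples
        match = SAMPLE_RE.match(line)
        if match:
            labels = dict(LABEL_RE.findall(match.group("labels") or ""))
            samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples
```

Filtering the result on labels["lab_id"] gives a per-node view of the exported metrics.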
Delete Lab
Permanently removes a lab and all its stored metrics, alerts, and digests. This cannot be undone.
Response 200
{
"status": "deleted",
"lab_id": "lab_proxmox-01_x7k2"
}
Natural Language Query
Ask questions about your infrastructure in natural language. The engine analyzes your metrics data and returns a plain-English answer with supporting data. Queries in German, French, Spanish, and Ukrainian are also supported.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| question | string | required | Your question in natural language (English, German, French, Spanish, or Ukrainian) |
Example Questions
"How's my lab?"
"Why is pve-storage slow?"
"Which server uses the most CPU?"
"Am I running out of disk?"
"Disk I/O throughput"
"Any disk failures?"
"How are my ZFS pools?"
"GPU status"
"Show me recent error logs"
"What happened last night?"
"Wie geht es der Flotte?" (German)
Example
curl -X POST https://your-server/api/v1/query \
-H 'X-Admin-Secret: your-secret' \
-H 'Content-Type: application/json' \
-d '{"question": "Why is pve-storage slow?"}'
Response 200
{
"answer": "I/O pressure — load average 38.1 with only 2% CPU usage suggests a disk or network bottleneck, not a compute issue. Memory is fine at 62%. Check for heavy NFS traffic or failing drives.",
"query_type": "load_analysis",
"confidence": 0.85,
"sources": ["system_metrics", "load_analysis"]
}
Lab Digest
Generates a plain-English intelligence digest for a specific lab, analyzing trends, anomalies, and health over the specified period.
Query Parameters
| Param | Type | Default | Description |
|---|---|---|---|
| hours | int | 168 | Number of hours to analyze (default 7 days) |
Response 200
{
"id": 12,
"hostname": "proxmox-01",
"grade": "B+",
"summary": "proxmox-01 has been running steadily over the last 7 days...",
"concerns": ["Disk at 78% — approaching warning threshold"],
"highlights": ["Zero alerts — ran clean the entire period"],
"data": {
"cpu": {"avg": 15.2, "max": 72.1, "min": 1.0, "current": 8.3},
"memory": {"avg": 62.0, "max": 68.5, "min": 55.2, "current": 63.1},
"disk": {"avg": 78.0, "max": 78.3, "min": 77.8, "current": 78.3},
"sample_count": 10080
}
}
Returns the most recently generated digest for a lab without regenerating it.
Fleet Digest
Generates a comprehensive digest across all monitored nodes. Includes per-node grades, fleet-wide trends, anomalies, and recommendations.
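With per-node grades in hand, a common next step is to surface the weakest nodes first. The Python sketch below does that. The function name is hypothetical and the grade scale is an assumption: only letter grades such as "A", "B+", and "C" appear on this page, and the full set the server can emit is not documented.

```python
# Assumed scale, best to worst.
GRADE_ORDER = ["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D", "D-", "F"]


def worst_first(nodes):
    """Sort digest nodes so the lowest grades come first."""
    rank = {grade: i for i, grade in enumerate(GRADE_ORDER)}
    # Unrecognized grades sort as worst so they are not silently hidden.
    return sorted(nodes, key=lambda n: rank.get(n["grade"], len(GRADE_ORDER)), reverse=True)
```

Applied to the nodes array of a fleet digest, this puts a "C" node ahead of "B+" and "A" nodes.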
Response 200
{
"summary": "# Fleet Intelligence Digest\n\n**5 nodes** monitored...",
"nodes": [
{"id": 12, "hostname": "proxmox-01", "grade": "A", "summary": "...", "concerns": [], "highlights": [...], "data": {...}},
{"id": 13, "hostname": "pve-storage", "grade": "C", "summary": "...", "concerns": [...], "highlights": [], "data": {...}}
],
"node_count": 5,
"concerns_count": 3
}
NLQ Stats & Misses (roadmap)
These endpoints are on the roadmap for v1.1.
Returns matched/total counts, miss rate, per-intent hit counts, and the top repeated misses within the window set by the hours parameter. Useful for spotting coverage gaps.
Query Parameters
| Param | Type | Default | Description |
|---|---|---|---|
| hours | int | 168 | Rolling window (default 7 days) |
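The miss rate appears to be the share of queries that matched no intent. The Python sketch below reproduces it from the total and matched counts. The formula and the three-decimal rounding are inferred from the sample figures in this section (1842 total, 1795 matched, miss rate 0.026) rather than documented, and the function name is hypothetical.

```python
def miss_rate(total, matched):
    """Fraction of queries that did not match an intent, rounded to 3 decimals."""
    if total == 0:
        return 0.0
    return round((total - matched) / total, 3)
```

With the sample figures, 47 of 1842 queries missed, which rounds to 0.026.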
Response 200
{
"hours": 168,
"total": 1842,
"matched": 1795,
"miss_rate": 0.026,
"by_type": [
{"query_type": "status", "n": 812},
{"query_type": "disk_pressure", "n": 204}
],
"top_misses": [
{"question": "show me gpu temps", "n": 4},
{"question": "any flapping services", "n": 2}
]
}
Returns the most recent queries that fell through to the fallback handler, newest first. Use this to harvest new intents to add to the NLQ engine.
Query Parameters
| Param | Type | Default | Description |
|---|---|---|---|
| limit | int | 100 | Max misses to return |
Response 200
{
"misses": [
{
"id": 4821,
"asked_at": "2026-03-19T22:10:00",
"email": "admin@lab.local",
"question": "show me gpu temps",
"query_type": "fallback",
"confidence": 0.0
}
]
}
List Notification Channels
Returns all configured notification channels.
Response 200
{
"channels": [
{
"id": 1,
"channel_type": "ntfy",
"name": "Phone alerts",
"config": {"topic": "labwatch-alerts"},
"enabled": true
}
]
}
Create Notification Channel
Create a new notification channel. Supported types: webhook, ntfy, discord, slack, telegram, gotify, pushover, apprise.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| channel_type | string | required | "webhook", "ntfy", "discord", "slack", "telegram", "gotify", "pushover", or "apprise" |
| name | string | required | Display name for the channel |
| config | object | optional | Channel-specific configuration (defaults to {}) |
| min_severity | string | optional | "info", "warning" (default), or "critical" |
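A hedged Python sketch of building and validating this request body on the client before sending it. The function name is hypothetical; the allowed channel types, severities, and defaults are taken from the table above.

```python
CHANNEL_TYPES = {"webhook", "ntfy", "discord", "slack", "telegram", "gotify", "pushover", "apprise"}
SEVERITIES = ("info", "warning", "critical")


def build_channel_payload(channel_type, name, config=None, min_severity="warning"):
    """Validate the fields and return a create-channel request body."""
    if channel_type not in CHANNEL_TYPES:
        raise ValueError(f"unsupported channel_type: {channel_type!r}")
    if min_severity not in SEVERITIES:
        raise ValueError(f"unsupported min_severity: {min_severity!r}")
    if not name:
        raise ValueError("name is required")
    return {
        "channel_type": channel_type,
        "name": name,
        "config": config or {},  # the API defaults config to {}
        "min_severity": min_severity,
    }
```

Checking locally avoids a round trip for typos such as "email" in place of a supported type; the server remains the final authority on what it accepts.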
Webhook Config
{"url": "https://hooks.slack.com/services/..."}
ntfy Config
{"topic": "labwatch-alerts", "server": "https://ntfy.sh"}
Response 200
{
"id": 1,
"status": "created"
}
Update Notification Channel
Update an existing notification channel's name, config, or enabled status.
Delete Notification Channel
Permanently removes a notification channel.
Test Notification Channel
Sends a test message through the specified channel to verify it's configured correctly.
Response 200
{
"success": true,
"channel_id": 1,
"message": "Test notification sent successfully"
}
MCP Integration (roadmap)
MCP (Model Context Protocol) support is on the roadmap. It will allow LLM tools like Claude Desktop to query your fleet metrics directly. This section will be updated when the feature ships.
Dashboard (Session) API
These endpoints power the authenticated dashboard UI and use the signed session cookie set by POST /login. They are documented here for completeness; for server-to-server automation, prefer the admin-secret endpoints in the sections above.
Your Fleet
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/my/dashboard | Enriched labs for the auto-refresh feed (metrics, alerts, heartbeats, pins) |
| POST | /api/v1/my/add-node | Register a new node under the current user and return token + install command |
Layout & Pins
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/my/dashboard-layout | Get the saved widget layout (roadmap) |
| PUT | /api/v1/my/dashboard-layout | Save a new widget layout (roadmap) |
| DELETE | /api/v1/my/dashboard-layout | Reset layout to the default (roadmap) |
| GET | /api/v1/my/pins | List pinned labs |
| POST | /api/v1/my/pin/{lab_id} | Pin a lab |
| DELETE | /api/v1/my/pin/{lab_id} | Unpin a lab |
Alert Thresholds
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/my/thresholds | List all per-lab thresholds for the user |
| GET | /api/v1/my/thresholds/{lab_id} | Get thresholds for one lab |
| PUT | /api/v1/my/thresholds/{lab_id} | Set thresholds for one lab |
| DELETE | /api/v1/my/thresholds/{lab_id} | Reset thresholds to the default |
| POST | /api/v1/my/alerts/generate | Ask the NLQ engine to suggest thresholds from recent metrics |
| POST | /api/v1/my/alerts/apply | Apply a generated threshold set |
Per-Lab Controls
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/my/lab/{lab_id}/public-status | Get the public status page config (token, enabled) (roadmap) |
| POST | /api/v1/my/lab/{lab_id}/public-status | Enable a public status page (returns /s/{public_id}) (roadmap) |
| DELETE | /api/v1/my/lab/{lab_id}/public-status | Disable the public status page (roadmap) |
| GET | /api/v1/my/lab/{lab_id}/maintenance | Check if the lab is in a maintenance window (roadmap) |
| POST | /api/v1/my/lab/{lab_id}/maintenance | Start a maintenance window (alerts suppressed) (roadmap) |
| DELETE | /api/v1/my/lab/{lab_id}/maintenance | End the active maintenance window (roadmap) |
| POST | /api/v1/my/lab/{lab_id}/display-name | Set a custom display name for a lab (roadmap) |
Notifications
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/my/notification-prefs | Get personal notification preferences |
| PUT | /api/v1/my/notification-prefs | Update preferences (quiet hours, severities) |
| GET | /api/v1/my/notifications | List personal notification channels |
| POST | /api/v1/my/notifications | Create a new channel |
| DELETE | /api/v1/my/notifications/{channel_id} | Delete a channel |
| POST | /api/v1/my/notifications/{channel_id}/test | Send a test notification |
| POST | /api/v1/my/notifications/test-inline | Test a channel config without saving it |