MISSION_CONTROL_RUNBOOK.md

# Mission Control — Local Deployment Runbook

Local-first FastAPI dashboard for OpenClaw/Fakker Mission Control.
Binds to `127.0.0.1:8787`. **No public exposure.** The owner accesses it
from the Mac via an SSH tunnel only.

---

## 1. Workspace

- Host: VPS (`openclaw` user; runtime user `ubuntu`).
- Project root: `/home/openclaw/.openclaw/workspace`
- DB: `data/mission-control.sqlite3`
- Service file: `deploy/systemd/openclaw-mission-control.service`
- App entry: `mission_control.api:app`

## 2. Manual run (foreground)

Use this for quick checks or debugging. The server stops when you press Ctrl-C.

```bash
cd /home/openclaw/.openclaw/workspace
python3 -m mission_control.cli init        # idempotent; creates DB if missing
python3 -m uvicorn mission_control.api:app --host 127.0.0.1 --port 8787
```

Optional auth gate, to be enabled before exposing the service beyond a private loopback or tunnel:

```bash
export MISSION_CONTROL_AUTH_TOKEN='<long-random-token>'
python3 -m uvicorn mission_control.api:app --host 127.0.0.1 --port 8787
```

When `MISSION_CONTROL_AUTH_TOKEN` is set, browser users must log in at
`/login`. JSON clients can send the token either as a bearer token:

```bash
curl -H "Authorization: Bearer $MISSION_CONTROL_AUTH_TOKEN" \
  http://127.0.0.1:8787/tasks
```

or in the `X-Mission-Control-Token` header:

```bash
curl -H "X-Mission-Control-Token: $MISSION_CONTROL_AUTH_TOKEN" \
  http://127.0.0.1:8787/tasks
```
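For repeated checks it can help to wrap the header logic in a small shell function. The sketch below is illustrative only: `mc_curl` and `MC_BASE_URL` are hypothetical names, not part of Mission Control. It sends the bearer header only when `MISSION_CONTROL_AUTH_TOKEN` is set, so the same call works with and without the auth gate.

```bash
# Hypothetical helper: call a Mission Control path, adding the bearer
# header only when MISSION_CONTROL_AUTH_TOKEN is set in the environment.
mc_curl() {
  local base="${MC_BASE_URL:-http://127.0.0.1:8787}"   # MC_BASE_URL is an assumed override
  local path="$1"
  shift
  if [ -n "${MISSION_CONTROL_AUTH_TOKEN:-}" ]; then
    curl -s -H "Authorization: Bearer ${MISSION_CONTROL_AUTH_TOKEN}" "$@" "${base}${path}"
  else
    curl -s "$@" "${base}${path}"
  fi
}
```

Usage: `mc_curl /tasks` or `mc_curl /health`. Extra curl options can follow the path, for example `mc_curl /ui -I`.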

Check health while the server is running:

```bash
curl -s http://127.0.0.1:8787/health
curl -s http://127.0.0.1:8787/ui | head -20
```
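Right after a start or restart the server may need a moment before it answers. A minimal polling sketch follows; it assumes `/health` returns a 2xx status once the app is ready (the runbook does not state the exact response), and `wait_for_health` is a hypothetical helper name.

```bash
# Hypothetical helper: poll the health endpoint until it answers with a
# non-error HTTP status, or give up after a fixed number of attempts.
wait_for_health() {
  local url="${1:-http://127.0.0.1:8787/health}"
  local attempts="${2:-30}"
  local interval="${3:-1}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP status >= 400
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "no healthy response from $url after $attempts attempts" >&2
  return 1
}
```

Usage: `wait_for_health && echo "Mission Control is up"`.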

## 3. Systemd service (recommended)

Template lives at `deploy/systemd/openclaw-mission-control.service`.
**It is not enabled automatically.** Install only when ready.

### 3a. Install

```bash
sudo cp /home/openclaw/.openclaw/workspace/deploy/systemd/openclaw-mission-control.service \
        /etc/systemd/system/openclaw-mission-control.service
sudo systemctl daemon-reload
```

### 3b. Start / status / stop

```bash
sudo systemctl start openclaw-mission-control
sudo systemctl status openclaw-mission-control
sudo systemctl stop openclaw-mission-control
```

### 3c. Enable on boot (only when stable)

```bash
sudo systemctl enable openclaw-mission-control
# undo
sudo systemctl disable openclaw-mission-control
```

### 3d. Uninstall

```bash
sudo systemctl stop openclaw-mission-control
sudo systemctl disable openclaw-mission-control
sudo rm /etc/systemd/system/openclaw-mission-control.service
sudo systemctl daemon-reload
```

## 4. Access from the Mac (SSH tunnel)

`<vps>` = the OpenClaw VPS hostname/alias in your `~/.ssh/config`.

Foreground tunnel (closes when you press Ctrl-C):

```bash
ssh -N -L 8787:127.0.0.1:8787 <vps>
```

Background tunnel:

```bash
ssh -fN -L 8787:127.0.0.1:8787 <vps>
# kill
pkill -f 'ssh -fN -L 8787:127.0.0.1:8787'
```
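As an alternative to `pkill`, OpenSSH can manage a background tunnel through a control socket, which lets you check and close exactly that connection. This is a sketch, not part of the original procedure: the socket path and the `tunnel_*` function names are assumptions.

```bash
# Assumed socket location; any path writable by your Mac user works.
MC_TUNNEL_SOCKET="${MC_TUNNEL_SOCKET:-$HOME/.ssh/mission-control-tunnel.sock}"

# Open the tunnel in the background as a control master (-M) on the socket (-S).
tunnel_up() {
  ssh -fN -M -S "$MC_TUNNEL_SOCKET" -L 8787:127.0.0.1:8787 "$1"
}

# Ask the master process whether it is still running.
tunnel_status() {
  ssh -S "$MC_TUNNEL_SOCKET" -O check "$1"
}

# Tell the master process to exit, which closes the forwarded port.
tunnel_down() {
  ssh -S "$MC_TUNNEL_SOCKET" -O exit "$1"
}
```

Usage: `tunnel_up <vps>`, then `tunnel_status <vps>`, and `tunnel_down <vps>` when finished.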

Then in the browser on the Mac:

- Dashboard:        http://127.0.0.1:8787/ui
- Board:            http://127.0.0.1:8787/ui/board
- Approvals:        http://127.0.0.1:8787/ui/approvals
- Task detail:      http://127.0.0.1:8787/ui/tasks/<task_id>
- Read-only board:  http://127.0.0.1:8787/dashboard.html
- Health:           http://127.0.0.1:8787/health
- JSON dashboard:   http://127.0.0.1:8787/dashboard
- Task list:        http://127.0.0.1:8787/tasks
- Workers:          http://127.0.0.1:8787/workers
- Events:           http://127.0.0.1:8787/events

## 4a. Using the Web UI

The dashboard at `http://127.0.0.1:8787/ui` is the daily operator surface.
It is read-mostly with a few audit-logged write actions; it never spawns
workers or makes routing decisions.

### Overview

The page is organized top-down:

- **Header**: title, generated timestamp, health badge, refresh button.
- **Top nav**: `Overview / Workers / Actions / Tasks / Events` anchors.
- **Overview cards**: Health, Active focus, Total tasks, Blocked, Review,
  Live workers, Stale workers, Planned workers.
- **Focus & rollup**: active project, schema state, pending approvals.
- **Agent activity**: static roster snapshot.
- **Workers**: split into live / stale / planned / offline buckets.
- **Operator actions**: create task, assign worker, move status.
- **Board**: kanban-style lanes for backlog, assigned, doing, blocked, and review.
- **Approvals**: request, approve, or reject task-level approval gates.
- **Task board**: every lifecycle lane with status badges and worker/project chips.
- **Recent events**: newest first, type-coded.

### Create a task

`Actions → Create a task`. Title and Expected output are required. Domain
defaults to `general`. Project is optional. Submit returns a green
`Done. Created task <id>` banner.

### Assign a worker

`Actions → Assign a worker`. Pick the task and any agent from the roster.
Optional `Owner label override` overrides the visible label without
changing the agent reference. Submit moves the task to `assigned`.

### Move status

`Actions → Move a task`. Choose the task and the new lifecycle status.
`Verification notes` is required when moving to `done`; the service layer
rejects the transition otherwise and surfaces a red error banner.
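The same rule applies when moving tasks from the CLI. The wrapper below is a client-side sketch that refuses a `done` transition without notes before the CLI is even called; `mc_move_task` is a hypothetical name, and the service layer remains the authority on which transitions are valid.

```bash
# Hypothetical wrapper around the documented move-task command.
# Fails fast (exit code 2) when moving to "done" without verification notes.
mc_move_task() {
  local task_id="$1"
  local new_status="$2"
  local notes="${3:-}"
  if [ "$new_status" = "done" ]; then
    if [ -z "$notes" ]; then
      echo "verification notes are required when moving to done" >&2
      return 2
    fi
    python3 -m mission_control.cli move-task "$task_id" done \
      --verification-notes "$notes" --source runbook
  else
    python3 -m mission_control.cli move-task "$task_id" "$new_status" --source runbook
  fi
}
```

Usage: `mc_move_task <id> review`, then `mc_move_task <id> done "checked output against expected"`.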

### Open task detail

Every task ID in the dashboard is a clickable link to
`/ui/tasks/<task_id>`. The detail page shows:

- Title, status badge, worker chip, project chip in the header.
- Expected output, Verification notes, Description.
- All raw fields as a key/value list.
- **Task actions**: assign worker and move status, scoped to the task.
- **History**: full append-only event timeline, newest first.

### Command Tree (Org Chart)

The `Command Tree` section at `#org-chart` shows the reporting topology:

- **Fakker** always appears as the root orchestrator node in a prominently
  bordered card. It is always the root regardless of liveness.
- **Child worker nodes** appear only for agents that are actually registered
  (non-retired, non-Fakker). Each card shows name, liveness badge, health,
  current task, and active/blocked task counts.
- **Empty state**: if no child workers exist, the section displays:
  "No registered workers yet. Fakker is currently the only orchestrator."
  This is the correct default when only Fakker is registered.

**Agents appear only after registration or heartbeat.** Fantasy/placeholder
agents (Backend Agent, Frontend Agent, Infra Agent, QA Agent, Backoffice Agent,
Codex Agent) are not seeded by default and must not appear in this view.
Use `admin retire-placeholder-agents` to remove any that exist from early seeds.

**Paperclip is a design reference only.** The org-chart layout is inspired by
Paperclip's governance-style agent console. Paperclip is not a dependency,
not imported, not a framework used here. Fakker remains the orchestrator brain;
Mission Control remains the state board.

### Agent Workload

The `Agent Workload` section at `#workload` shows a per-agent task count table:

- Columns: Assigned / In Progress / Blocked / Review / Done / Total / Current Task.
- Only tasks with an `owner_agent_id` that matches a registered agent are counted.
- Tasks without an assigned agent are excluded from this view.

### Understand worker buckets

Workers are bucketed by liveness; the buckets are visually distinct:

- **Live**: heartbeat within the stale threshold. Treated as actually working.
- **Stale**: heartbeat older than the stale threshold. Treat as offline.
- **Planned / roster (not live yet)**: defined in the agent roster but
  has never sent a heartbeat. Future roles, italicized.
- **Offline / retired**: explicit `offline` or `retired` status.

The header shows the current stale threshold (default 300s).
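The bucketing rules above can be sketched as a small decision function. This is an illustration of the described behavior, not the dashboard's actual code. It assumes that an explicit `offline` or `retired` status wins over any heartbeat, that an empty age means "never sent a heartbeat", and that an age exactly equal to the threshold still counts as live; the status value `active` in the usage example is also an assumption.

```bash
# Illustrative only: classify a worker from its status and heartbeat age.
#   $1 = status string, $2 = seconds since last heartbeat ("" if none),
#   $3 = stale threshold in seconds (defaults to 300)
worker_bucket() {
  local state="$1"
  local age="${2:-}"
  local threshold="${3:-300}"
  case "$state" in
    offline|retired)
      echo "offline"
      return 0
      ;;
  esac
  if [ -z "$age" ]; then
    echo "planned"
  elif [ "$age" -le "$threshold" ]; then
    echo "live"
  else
    echo "stale"
  fi
}
```

For example, `worker_bucket active 42` prints `live`, `worker_bucket active 301` prints `stale`, and `worker_bucket active ""` prints `planned`.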

## 4b. Test task SOP

Use this end-to-end check whenever the dashboard or service deployment
changes. Every step records an audit event.

1. **Create a test task.**

   Via UI: open `/ui`, in *Operator Actions → Create a task* enter:
   - Title: `runbook smoke <date>`
   - Expected output: `Task visible in /ui and detail page`
   - Domain: `qa`
   - Priority: `low`
   - Project: leave blank

   Or via CLI:
   ```bash
   cd /home/openclaw/.openclaw/workspace
   python3 -m mission_control.cli create-task \
     --title "runbook smoke" \
     --domain qa \
     --expected-output "Task visible in /ui and detail page" \
     --priority low \
     --source runbook
   ```
   Note the returned `id` (e.g. `task_abc123`).

2. **Verify it appeared.**
   - Dashboard `/ui` shows it in the *Planned* lane.
   - Detail page loads at `/ui/tasks/<id>`.
   - JSON: `curl -s http://127.0.0.1:8787/tasks/<id> | head`.

3. **Assign a worker.**

   Via UI: *Assign a task to a worker*, pick the task + a roster agent.

   Or via CLI:
   ```bash
   # agent_fakker is the only agent in the default roster; substitute any registered agent ID
   python3 -m mission_control.cli assign-task <id> --agent agent_fakker --source runbook
   ```
   Task moves to *Assigned*; agent shows in the assignment.

4. **Move the task through statuses.**

   Via UI: *Move a task to a new status*, pick `in_progress`. Repeat
   for `review`. For `done`, include verification notes
   (e.g. `runbook smoke validated`).

   Or via CLI:
   ```bash
   python3 -m mission_control.cli move-task <id> in_progress --source runbook
   python3 -m mission_control.cli move-task <id> review --source runbook
   python3 -m mission_control.cli move-task <id> done \
     --verification-notes "runbook smoke validated" --source runbook
   ```

5. **Verify history.**
   - Detail page lists every transition under *History*.
   - JSON: `curl -s "http://127.0.0.1:8787/tasks/<id>/history" | head -40`.

6. **Cancel a test task (preferred over deletion).**

   The schema does not expose a destructive delete on tasks; cancel
   instead, which is the supported terminal state:

   Via UI: *Move a task to a new status* → `cancelled`. No verification
   notes required.

   Or via CLI:
   ```bash
   python3 -m mission_control.cli move-task <id> cancelled --source runbook
   ```

   Cancelled and done tasks stay in the DB as audit history. To remove
   the row entirely, run `python3 -m mission_control.cli backup` first,
   then open the SQLite DB manually and run
   `DELETE FROM tasks WHERE id = '<id>'`. **Destructive SQL requires
   explicit owner approval per CLAUDE.md.**
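The status moves in step 4 can be chained so the run stops at the first rejected transition. A minimal sketch, assuming the task has already been created and assigned (steps 1 to 3) and that the CLI exits non-zero when a transition is rejected; `run_smoke_transitions` is a hypothetical name.

```bash
# Hypothetical helper: walk a task through in_progress -> review -> done
# using the documented move-task command, stopping on the first failure.
run_smoke_transitions() {
  local task_id="$1"
  local next_status
  for next_status in in_progress review; do
    if ! python3 -m mission_control.cli move-task "$task_id" "$next_status" --source runbook; then
      echo "move to $next_status failed; stopping" >&2
      return 1
    fi
  done
  python3 -m mission_control.cli move-task "$task_id" done \
    --verification-notes "runbook smoke validated" --source runbook
}
```

Usage: `cd /home/openclaw/.openclaw/workspace && run_smoke_transitions <id>`, then verify the history as in step 5.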

## 5. Logs

### Systemd-managed run

```bash
journalctl -u openclaw-mission-control -f
journalctl -u openclaw-mission-control --since '1 hour ago'
```

### Manual run

Logs go to the terminal where uvicorn is running. Redirect if needed:

```bash
python3 -m uvicorn mission_control.api:app --host 127.0.0.1 --port 8787 \
  >> /tmp/mission-control.out 2>&1
```

## 6. Backup

Always back up before destructive DB work.

```bash
cd /home/openclaw/.openclaw/workspace
python3 -m mission_control.cli backup
# custom directory:
python3 -m mission_control.cli backup --output-dir data/backups
```

By default, backups land under `data/backups/` with a timestamped filename.
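To confirm that a backup was actually written, or to pick the newest one later, a helper along these lines may be useful. It is a sketch: `latest_backup` is a hypothetical name, and it selects by file modification time rather than by parsing the timestamp in the filename, since the exact filename format is not documented here.

```bash
# Hypothetical helper: print the path of the most recently modified file
# in the backup directory; returns non-zero if the directory is empty.
latest_backup() {
  local dir="${1:-data/backups}"
  local newest
  newest="$(ls -1t "$dir" 2>/dev/null | head -n 1)"
  [ -n "$newest" ] || return 1
  printf '%s/%s\n' "$dir" "$newest"
}
```

Usage: `latest_backup` from the project root, or `latest_backup /some/other/dir`.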

## 7. Tests / verification

Standard verification before any commit that touches `mission_control/`:

```bash
cd /home/openclaw/.openclaw/workspace
python3 -m compileall mission_control tests
python3 -m unittest discover -s tests
python3 -m mission_control.cli init
python3 -m mission_control.cli dashboard
```

API + UI sanity:

```bash
curl -s http://127.0.0.1:8787/health
curl -sI http://127.0.0.1:8787/ui | head -3
```

## 8. Rollback

If a release breaks the dashboard:

1. Stop the service:
   ```bash
   sudo systemctl stop openclaw-mission-control
   ```
2. Restore last known-good code:
   ```bash
   cd /home/openclaw/.openclaw/workspace
   git log --oneline -10
   git checkout <known-good-sha> -- mission_control/ tests/
   ```
3. Restore DB only if data corruption is suspected:
   ```bash
   cp data/backups/<chosen-backup>.sqlite3 data/mission-control.sqlite3
   ```
4. Re-run tests, restart service.
5. If still broken, revert the working tree:
   ```bash
   git reset --hard <known-good-sha>   # destructive; only with owner approval
   ```
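For step 3, it is safer to keep a copy of the current database before overwriting it, so the restore itself can be undone. The function below is a sketch under that assumption: `restore_db` and the `.pre-restore` suffix are hypothetical, and it should only be run while the service is stopped (step 1).

```bash
# Hypothetical helper: copy a chosen backup over the live DB, keeping the
# previous file next to it as <target>.pre-restore.
restore_db() {
  local backup="$1"
  local target="${2:-data/mission-control.sqlite3}"
  if [ ! -f "$backup" ]; then
    echo "backup not found: $backup" >&2
    return 1
  fi
  if [ -f "$target" ]; then
    # -p preserves mode and timestamps of the file being set aside
    cp -p "$target" "${target}.pre-restore" || return 1
  fi
  cp "$backup" "$target"
}
```

Usage: `restore_db data/backups/<chosen-backup>.sqlite3` from the project root.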

## 9. Agent roster — actual agents vs future worker roles

### Actual registered agents

Only agents that have been deliberately created appear in Mission Control.
The current roster is Fakker only. Fakker is the sole OpenClaw orchestrator.

Codex/Claude are development tools — not registered agents — unless explicitly
added via `python3 -m mission_control.cli create-agent`.

### Registering a new real worker

When a real worker is ready to report in, register it before it sends its first heartbeat:

```bash
python3 -m mission_control.cli create-agent \
  --name "Worker Name" \
  --role "what it does" \
  --status planned
```

Workers self-register on first heartbeat using `mission_control.workers.register_worker`.

### Cleaning placeholder agents from the live database

Early Mission Control seeds created placeholder agents for planned roles
(Backend Agent, Backoffice Agent, Codex Agent, Frontend Agent, Infra Agent,
QA Agent). These are not real workers. Use the admin command to retire them:

```bash
cd /home/openclaw/.openclaw/workspace
python3 -m mission_control.cli admin retire-placeholder-agents
```

This command:
- Retires only the six known placeholder agent IDs
- Never touches Fakker (`agent_fakker`) or any custom agent
- Is safe to run multiple times (idempotent)
- Emits an audit event for each agent retired
- Prints a JSON result + a summary line

After running, restart the service so the UI reflects the change:

```bash
sudo systemctl restart openclaw-mission-control
```

## 10. Current limitations (address before any public exposure)

- **Auth is token-based only.** Set `MISSION_CONTROL_AUTH_TOKEN` before
  exposing beyond loopback/tunnel. There is no user database or roles yet.
- **No TLS.** Plain HTTP on loopback.
- **No rate limiting / abuse controls.**
- **Forms accept urlencoded body without CSRF tokens.** Keep the surface
  loopback-bound or behind trusted TLS/auth until CSRF protection exists.
- **No remote workers.** Workers must run on the same VPS (or tunnel back)
  to reach Mission Control over loopback.
- **No multi-user identity.** Every UI action is recorded with
  `source="ui"`; there is no per-operator audit.
- **Mission Control is recording-only.** It does not dispatch, schedule,
  retry, or notify. All brain logic still belongs to Fakker.

Until each item above is addressed, do not bind the service to
`0.0.0.0`, do not put it behind a public reverse proxy, and do not open
firewall ports for it.
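One way to confirm the loopback-only binding on the VPS is to inspect the listening sockets. The check below is a sketch that assumes the `ss` tool from iproute2 is available; `listens_on_loopback_only` is a hypothetical name. It succeeds only when something is listening on the port and every listener is bound to `127.0.0.1` or `[::1]`.

```bash
# Hypothetical check: exit 0 only if the port has at least one listener
# and all of its listeners are bound to a loopback address.
listens_on_loopback_only() {
  local port="${1:-8787}"
  # -l listening, -t TCP, -n numeric, -H no header; column 4 is the local address
  ss -ltnH | awk -v port="$port" '
    {
      addr = $4
      n = split(addr, parts, ":")
      if (parts[n] != port) next
      found = 1
      host = substr(addr, 1, length(addr) - length(parts[n]) - 1)
      if (host != "127.0.0.1" && host != "[::1]") bad = 1
    }
    END { exit (found && !bad) ? 0 : 1 }
  '
}
```

Usage: `listens_on_loopback_only 8787 && echo "loopback only" || echo "check the bind address"`.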