OpenWebGoggles - Web Human-in-the-Loop (HITL) UI for AI Agents

The Problem

Terminal Bottleneck

Your agent finishes a security audit with 12 findings. It dumps them into the terminal and asks you to type "approve" or "reject" twelve times. One at a time. In a monospace wall of text.

No Rich Interaction

Agents can write code, run tests, call APIs. But they can't pull up a side-by-side diff view, a severity dropdown, or a multi-step wizard. The output medium limits the workflow.

Unstructured Responses

When the agent asks "any changes?", your answer is free text. The agent has to parse intent from natural language instead of getting a clean data object with exactly the fields it needs.

How It Works

Three JSON files. That's the entire interface between the agent and the browser.

Agent (writes state.json) → Server (WebSocket bridge) → Browser (renders UI) → Human (makes decisions)

state.json

Agent → Browser

The agent writes what it wants to show: data, UI schema, and requested actions. The dynamic renderer turns this into a styled interactive panel.

actions.json

Browser → Agent

When the human clicks approve, fills a form, or makes a selection, the browser writes structured response data that the agent reads directly.

manifest.json

Shared

Session config: ports, app name, auth tokens. Generated at startup, used by both sides. Debug the whole system by reading three files.
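The file handoff described above can be pictured with a small sketch. It assumes a session directory holding state.json and actions.json. The payload keys, the "actions" wrapper, and the helper names write_state and read_actions are hypothetical, chosen for illustration, and are not the project's documented schema or API.

```python
import json
import os
import tempfile
from pathlib import Path


def write_state(session_dir: Path, state: dict) -> Path:
    """Write state.json atomically so a reader never sees a half-written file."""
    target = session_dir / "state.json"
    # Write to a temp file in the same directory, then rename it over the target.
    fd, tmp_name = tempfile.mkstemp(dir=session_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(state, tmp, indent=2)
    os.replace(tmp_name, target)  # atomic when both paths share a filesystem
    return target


def read_actions(session_dir: Path) -> list:
    """Return the recorded actions, or an empty list if none exist yet."""
    path = session_dir / "actions.json"
    if not path.exists():
        return []
    # Assumption: actions are stored under a top-level "actions" key.
    return json.loads(path.read_text(encoding="utf-8")).get("actions", [])


# Hypothetical payload: these keys are illustrative, not the real schema.
example_state = {
    "title": "Security audit",
    "data": {"findings": [{"id": "F-1", "severity": "high"}]},
    "actions_requested": [{"id": "approve", "label": "Approve"}],
}
```

Writing through a temporary file and renaming is a common way to keep a watcher from reading partial JSON. Whether OpenWebGoggles does this internally is not stated here.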

Features

MCP Native

Install with pip, add three lines to .mcp.json, restart your editor. Four tools appear. Works with Claude Code and any MCP-compatible agent.

Dynamic Renderer

The agent writes JSON, not HTML. The built-in renderer builds text, forms, items, and actions from a schema, which covers approval flows, wizards, and dashboards.

Custom Apps

When JSON schemas aren't enough, scaffold a custom app with one command. Vanilla JS SDK, zero dependencies. Full control over rendering.

9-Layer Security

Localhost binding, bearer tokens, WebSocket first-message auth, Ed25519 signatures, HMAC-signed actions, nonce replay prevention, CSP, XSS filtering, rate limiting. All on by default.

Bash Interface

Shell scripts get the same capabilities as MCP through start_webview, write_state, wait_for_action, and stop_webview. Useful for orchestration scripts and debugging.

Real-Time Updates

WebSocket push from server to browser. The agent updates state.json, the browser reflects it instantly. Live dashboards, progress bars, streaming results.

Quick Start

Three minutes to your first interactive agent UI.

Install

pip install openwebgoggles

Configure MCP

{
  "mcpServers": {
    "openwebgoggles": {
      "command": "openwebgoggles"
    }
  }
}
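If .mcp.json already lists other servers, the entry can be merged in rather than pasted by hand. The sketch below is one way to do that with the standard library. The helper name add_mcp_server is hypothetical, and the file location is whatever your editor expects.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str) -> dict:
    """Add one server entry to an MCP config file, keeping existing entries."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # Create the "mcpServers" map if the file did not have one yet.
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example call (commented out so that importing this file has no side effects):
# add_mcp_server(Path(".mcp.json"), "openwebgoggles", "openwebgoggles")
```

Restart the editor afterwards, as with a manual edit.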

Tell Your Agent

"Show me a review UI for these changes and wait for my approval."

The agent figures out the schema, calls webview_ask, and you see a panel in your browser.

Use Cases

Security Triage

Agent runs an audit, opens a tabbed wizard — one finding per screen, severity dropdowns, analyst notes, progress bar. Reads back structured decisions.

Per-finding approve/reject
Editable severity levels
Analyst notes preserved

Code Review

Side-by-side diffs with per-file toggles, approve/reject with comments. The agent gets clean structured feedback instead of parsing free text.

Side-by-side diff display
File-level approvals
Comment threading

Configuration Wizards

Multi-step forms for deployment configs, database migrations, pipeline settings. The agent guides, the human decides, and the result is a clean config object.

Multi-step flows
Validation built-in
JSON output

Live Dashboards

Non-blocking display for long-running operations. The agent streams progress updates while the human monitors — or intervenes when needed.

Real-time WebSocket updates
Progress indicators
Optional human override

The Entire API

Four MCP tools. That's it.

webview_ask(state)

Show a UI and block until the human responds. The workhorse — approvals, forms, triage flows.

webview_show(state)

Show a UI without blocking. Dashboards, progress displays, anything the human watches but doesn't act on immediately.

webview_read()

Poll for actions without blocking. Check if the human has responded yet, then continue or wait.

webview_close()

Close the session. Cleans up the server, browser tab, and ephemeral crypto keys.
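One way to picture the difference between the blocking and non-blocking tools: a blocking call behaves roughly like a non-blocking read repeated until something arrives. The sketch below shows that relationship only. The read argument is a hypothetical stand-in for a non-blocking poll such as webview_read, and the timeout and interval values are assumptions, not documented defaults. This is not how the library implements webview_ask.

```python
import time
from typing import Callable, Optional


def wait_for_response(
    read: Callable[[], Optional[dict]],
    timeout_s: float = 300.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[dict]:
    """Poll a non-blocking reader until it returns a response or time runs out."""
    deadline = clock() + timeout_s
    while True:
        response = read()
        if response is not None:
            return response  # the human has answered
        if clock() >= deadline:
            return None  # gave up waiting
        sleep(interval_s)
```

An agent that shows a dashboard and keeps working would call the reader directly between tasks instead of looping like this.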

Security

Nine defense layers, all on by default. The agent and the browser are on the same machine — nobody else should be able to read or tamper with the communication.

Localhost-only binding: Server only listens on 127.0.0.1

Bearer token auth: 32-byte session token, constant-time comparison

WebSocket first-message auth: Token verified before any data flows

Ed25519 signatures: Server signs every state update

HMAC-SHA256: Browser signs every action

Nonce replay prevention: Each action submittable only once

Content Security Policy: Per-request nonce blocks inline script injection

SecurityGate: 22 XSS patterns, zero-width char detection

Rate limiting: 30 actions per minute per session
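The last layer in the list, 30 actions per minute per session, could be enforced in several ways. The sketch below uses a sliding window over timestamps. That choice, and the class name SlidingWindowLimiter, are assumptions for illustration. The project may count differently.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_actions within any window_s-second span."""

    def __init__(
        self,
        max_actions: int = 30,
        window_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_actions
        self._window = window_s
        self._clock = clock
        self._stamps = deque()  # timestamps of accepted actions, oldest first

    def allow(self) -> bool:
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()
        if len(self._stamps) >= self._max:
            return False
        self._stamps.append(now)
        return True
```

One limiter instance per session matches the "per session" wording.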

All crypto keys are ephemeral — generated in memory at session start, zeroed on shutdown, never written to disk. Test suite covers OWASP Top 10, MITRE ATT&CK, and LLM-specific vectors.
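Three of the layers above (an in-memory key, HMAC-SHA256 on actions, and single-use nonces) fit in a short standard-library sketch. The class name ActionVerifier, the canonical JSON encoding, and the message layout are assumptions for illustration. In the real system the browser signs and the server verifies, while here one object does both so the example runs on its own.

```python
import hashlib
import hmac
import json
import secrets


class ActionVerifier:
    """Verifies HMAC-signed actions and rejects replayed nonces."""

    def __init__(self) -> None:
        # Ephemeral 32-byte key: held in memory only, never written to disk.
        self._key = secrets.token_bytes(32)
        self._seen_nonces = set()

    @staticmethod
    def _canonical(action: dict, nonce: str) -> bytes:
        # Assumed encoding: sorted keys so both sides hash identical bytes.
        payload = {"action": action, "nonce": nonce}
        return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")

    def sign(self, action: dict, nonce: str) -> str:
        message = self._canonical(action, nonce)
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def verify(self, action: dict, nonce: str, signature: str) -> bool:
        expected = self.sign(action, nonce)
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(expected, signature):
            return False
        if nonce in self._seen_nonces:
            return False  # replay: this nonce was already accepted once
        self._seen_nonces.add(nonce)
        return True
```

A fresh nonce per action, for example from secrets.token_hex(16), is what makes each signed action submittable only once.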

Get Started

OpenWebGoggles is open source under the Apache 2.0 license. Install from PyPI or clone the repo.

pip install openwebgoggles