Embeddable Browser AI Agent Widget with Voice
Overview
An embeddable AI-assistant widget for websites that runs a tool-using agent directly in the browser. It provides full voice interaction (speech input, spoken responses, and wake-word activation) and an isolated in-browser runtime built on WebContainer.
The Challenge
Adding a capable, agentic assistant to a site usually means a server round-trip for every action and no safe place to execute code. The aim was a self-contained widget that hosts an LLM agent, gives it tools, and supports hands-free voice interaction, while keeping all code execution sandboxed inside the browser tab.
What We Built
A TypeScript widget, built with Vite, packaged as a library, and deployed via Cloudflare Workers with Wrangler. It is composed of clearly separated modules:
- an agent core and a “direct” variant (agent-direct.ts, index-direct.ts)
- a tool layer (tools.ts)
- an in-browser execution sandbox (container.ts, built on @webcontainer/api)
- a conversation/context manager (context.ts, files.ts)
- a complete voice stack (stt.ts, tts.ts, audio.ts, voice.ts, wakeword.ts)
- the embeddable UI (widget.ts, styles.ts)
The agent is driven by the Anthropic Claude SDK.
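To make the role of the tool layer concrete, here is a minimal sketch of how a module such as tools.ts might register tools and dispatch the agent's tool calls. The block shapes follow the tool_use and tool_result content blocks of the Anthropic Messages API; the `Tool` interface, the `ToolRegistry` class, and the `add` tool are hypothetical names chosen for illustration and are not taken from the project.

```typescript
// Shapes follow the tool_use / tool_result content blocks of the
// Anthropic Messages API.
interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
  input: Record<string, unknown>;
}

interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

// Hypothetical tool definition; the real tools.ts may be shaped differently.
interface Tool {
  name: string;
  description: string;
  run(input: Record<string, unknown>): Promise<string>;
}

class ToolRegistry {
  private readonly tools = new Map<string, Tool>();

  register(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }

  // Execute one tool call. A result block is always returned, so a failing
  // or unknown tool is reported back to the model instead of throwing.
  async dispatch(call: ToolUseBlock): Promise<ToolResultBlock> {
    const tool = this.tools.get(call.name);
    if (!tool) {
      return {
        type: "tool_result",
        tool_use_id: call.id,
        content: `Unknown tool: ${call.name}`,
        is_error: true,
      };
    }
    try {
      const content = await tool.run(call.input);
      return { type: "tool_result", tool_use_id: call.id, content };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { type: "tool_result", tool_use_id: call.id, content: message, is_error: true };
    }
  }
}

// Illustrative tool (an assumption, not part of the project): adds two numbers.
const addTool: Tool = {
  name: "add",
  description: "Add two numbers a and b.",
  async run(input) {
    const a = Number(input["a"]);
    const b = Number(input["b"]);
    if (Number.isNaN(a) || Number.isNaN(b)) {
      throw new Error("a and b must be numbers");
    }
    return String(a + b);
  },
};
```

Returning errors as tool_result blocks, rather than throwing, lets the model see the failure and try a different call or explain the problem to the user.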
Technologies & Approach
The Anthropic Claude SDK for the agent loop and tool use; WebContainer to run code in an isolated browser sandbox; a from-scratch voice pipeline (speech-to-text, text-to-speech, and wake-word detection) for hands-free interaction; Vite for a distributable widget bundle; and Cloudflare Workers for delivery.
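The agent loop mentioned above can be sketched as follows. This is a minimal illustration, not the project's actual agent code: the model call is injected as a `callModel` function so the loop can run against a stub, and in a real build it would presumably wrap the SDK's `messages.create` call. The names `runAgent`, `CallModel`, `RunTool`, and `maxTurns` are assumptions made for the example; the `stop_reason` values and content block shapes follow the Anthropic Messages API.

```typescript
type TextBlock = { type: "text"; text: string };
type ContentBlock =
  | TextBlock
  | { type: "tool_use"; id: string; name: string; input: Record<string, unknown> }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface Message {
  role: "user" | "assistant";
  content: ContentBlock[];
}

interface ModelReply {
  stop_reason: "end_turn" | "tool_use" | "max_tokens";
  content: ContentBlock[];
}

// Injected dependencies (hypothetical signatures): a model call and a tool runner.
type CallModel = (messages: Message[]) => Promise<ModelReply>;
type RunTool = (name: string, input: Record<string, unknown>) => Promise<string>;

async function runAgent(
  userText: string,
  callModel: CallModel,
  runTool: RunTool,
  maxTurns = 8,
): Promise<string> {
  const messages: Message[] = [
    { role: "user", content: [{ type: "text", text: userText }] },
  ];

  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await callModel(messages);
    messages.push({ role: "assistant", content: reply.content });

    if (reply.stop_reason !== "tool_use") {
      // No more tool calls: the text blocks form the final answer.
      return reply.content
        .filter((block): block is TextBlock => block.type === "text")
        .map((block) => block.text)
        .join("");
    }

    // Run every requested tool and return the results in a single user turn.
    const results: ContentBlock[] = [];
    for (const block of reply.content) {
      if (block.type === "tool_use") {
        const output = await runTool(block.name, block.input);
        results.push({ type: "tool_result", tool_use_id: block.id, content: output });
      }
    }
    messages.push({ role: "user", content: results });
  }

  // Guard against a model that keeps requesting tools indefinitely.
  throw new Error(`Agent did not finish within ${maxTurns} turns`);
}
```

The turn limit is a design choice for the sketch: an embedded widget should not let a runaway loop consume the visitor's session or the site owner's API budget.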
Outcome / Impact
A drop-in, voice-capable AI agent widget that executes tools and code safely in the browser, demonstrating modern agentic UX, in-browser sandboxing, and a complete voice interaction layer.
Capabilities Demonstrated
- Building embeddable, tool-using AI agent widgets
- In-browser code execution via WebContainer sandboxing
- Full voice UX: speech-to-text, text-to-speech, and wake-word activation
- Distributable widget packaging and Cloudflare Workers delivery
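
As an illustration of the wake-word capability listed above, the sketch below gates commands on a wake phrase found in a speech-to-text transcript. This is one simple, text-based approach and not necessarily how wakeword.ts works, since wake-word detection can also operate directly on audio. The phrase "hey assistant" and the function names are hypothetical, and the normalization handles ASCII English text only.

```typescript
// Hypothetical wake phrase; the phrase used by the widget is not stated.
const WAKE_PHRASE = "hey assistant";

// Lowercase, drop punctuation, and collapse whitespace (ASCII-only simplification).
function normalize(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Returns the command that follows the wake phrase, an empty string when the
// phrase is spoken on its own, or null when the phrase is absent.
function extractCommand(
  transcript: string,
  wakePhrase: string = WAKE_PHRASE,
): string | null {
  // Pad with spaces so the phrase only matches on whole-word boundaries.
  const padded = ` ${normalize(transcript)} `;
  const needle = ` ${normalize(wakePhrase)} `;
  const index = padded.indexOf(needle);
  if (index === -1) return null;
  return padded.slice(index + needle.length).trim();
}
```

A caller would ignore transcripts for which `extractCommand` returns null and forward any non-empty command to the agent.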