Browsy - Zero-Render Browser Engine for AI Agents

Install

One command, any language

Rust, Python, JavaScript/TypeScript, CLI, or MCP server. Pick your ecosystem.

JavaScript / TypeScript

$ npm install browsy-ai

LangChain.js, OpenAI, Vercel AI SDK integrations via subpath imports.

Python

$ pip install browsy-ai

LangChain, CrewAI, OpenAI, AutoGen, Smolagents integrations via extras.

Rust / CLI

$ cargo install browsy

CLI binary, REST API server, MCP server, or use as a Rust library.

Integrations

Plug into your stack

First-class integrations for the frameworks agents actually use. One install, immediate tool access.

Featured OPENCLAW PLUGIN

OpenClaw & SimpleClaw

Drop-in plugin that auto-starts a browsy server and injects 14 browsing tools into every agent. Transparent 10x speed upgrade over Playwright — agents don't even know the difference.

Per-agent session isolation, automatic server lifecycle, and optional Playwright interception. Works with any OpenClaw-compatible framework.

openclaw.config.ts

// One line. Every agent gets browsy.
import { register } from "openclaw-browsy";

export default { register };

// Agents automatically get 14 tools:
// browsy_browse, browsy_click,
// browsy_type_text, browsy_search,
// browsy_login, browsy_find, ...

LangChain

Python & JS/TS. Individual tools or full toolkit.

OpenAI

Function calling with JSON Schema definitions.

Vercel AI

Drop into generateText() with Zod schemas.

CrewAI

Single tool with all actions for crew agents.

AutoGen

ConversableAgent-compatible browser tool.

Smolagents

HuggingFace smolagents tool interface.

MCP

Claude Code, Claude Desktop, any MCP client.

REST / A2A

Language-agnostic REST API + Google A2A protocol.

How It Works

Page intelligence, not page rendering

Browsers compute layout, then paint pixels. Browsy keeps the layout, adds page intelligence, and skips the paint entirely.

1. Parse HTML

html5ever, the HTML parser developed for Mozilla's Servo engine, handles real-world HTML with all its quirks. We get a clean DOM tree.

2. Custom CSS Engine

Our from-scratch CSS engine handles selector matching, property parsing, var() resolution, calc() evaluation, @media queries, and style inheritance. It feeds computed styles to Taffy for Flexbox and Grid layout.
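To make one of those steps concrete, here is a minimal sketch of var() resolution. It is not Browsy's engine: the resolveVars name, the CustomProps type, and the regex-based approach are assumptions for illustration, and fallbacks containing parentheses are deliberately out of scope.

```typescript
// Illustrative sketch only: not Browsy's actual CSS engine.
type CustomProps = Record<string, string | undefined>;

// Matches var(--name) and var(--name, fallback); fallbacks containing
// parentheses are out of scope for this sketch.
const VAR_PATTERN = /var\(\s*(--[\w-]+)\s*(?:,\s*([^()]*))?\)/g;

function resolveVars(value: string, props: CustomProps, depth = 0): string {
  if (depth > 8) return value; // crude guard against reference cycles
  const resolved = value.replace(
    VAR_PATTERN,
    (_match: string, name: string, fallback?: string): string => {
      const found = props[name];
      if (found !== undefined) return found.trim();
      // Simplification: a missing property with no fallback becomes "".
      return fallback !== undefined ? fallback.trim() : "";
    },
  );
  // A substituted value may itself reference another custom property.
  if (resolved !== value && resolved.includes("var(")) {
    return resolveVars(resolved, props, depth + 1);
  }
  return resolved;
}

const theme: CustomProps = { "--gap": "8px", "--accent": "var(--brand)", "--brand": "#0af" };
const padding = resolveVars("calc(var(--gap) * 2)", theme); // "calc(8px * 2)"
const color = resolveVars("var(--accent)", theme);          // "#0af"
const width = resolveVars("var(--missing, 12px)", theme);   // "12px"
```

The resolved string would then go to calc() evaluation, which this sketch leaves untouched.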

3. Spatial DOM Output

Our custom output layer handles element emission, smart deduplication (34–42% reduction), landmark markers, hidden content exposure, text fallback chains, behavior detection, and delta diffing.

Benchmarks

The numbers

Benchmarked against 50+ tools. 100% detection accuracy across 39 real-world snapshots. No Chromium, no GPU, no binary dependencies.

Tool	Approach	Load time (Hacker News)	Chars/element	Dependencies	Page Intelligence
browsy	Zero-render (Rust)	203ms	58	6MB binary	13 action types
Jina Reader	Cloud API	~1,200ms	~96	Cloud API	None
agent-browser	Playwright wrapper	~5,377ms	~157	Chromium (282MB)	None
Playwright MCP	Screenshot + a11y tree	~5s	~120	Chromium (282MB)	None
Browser Use	Playwright + vision	~5s	~150	Chromium + Python	LLM-only
Stagehand	Playwright + LLM	~5s	~140	Chromium + Node	LLM-only

Features

What no other tool does

Page intelligence, hidden content, and deterministic output. Everything an agent needs to understand the web — nothing it doesn't.

Unique

Page Intelligence

Automatic page type detection — Login, Search, Form, Article, List, Captcha, Dashboard, SearchResults, and more — with 13 action recipes and stable element IDs. Your agent gets “fill field 19, click 34” instead of a raw tree to interpret.
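A rough sketch of what rule-based detection could look like for the Login case. The SpatialElement shape, the LoginRecipe type, and the specific rules are assumptions for illustration, not Browsy's actual schema or heuristics.

```typescript
// Hypothetical element shape; these field names are assumptions, not Browsy's schema.
interface SpatialElement {
  id: number;
  tag: string;
  inputType: string | null;
  label: string;
}

interface LoginRecipe {
  pageType: "Login";
  username: number;
  password: number;
  submit: number;
}

const isTextLike = (e: SpatialElement): boolean =>
  e.tag === "input" && (e.inputType === "text" || e.inputType === "email");

// Fixed rules, no model call: the same element list always yields the same recipe.
function detectLogin(elements: SpatialElement[]): LoginRecipe | null {
  const password = elements.find(e => e.tag === "input" && e.inputType === "password");
  if (password === undefined) return null;
  const at = elements.indexOf(password);
  const username = elements.slice(0, at).reverse().find(isTextLike); // nearest text field before it
  const submit = elements.slice(at + 1).find(e => e.tag === "button"); // first button after it
  if (username === undefined || submit === undefined) return null;
  return { pageType: "Login", username: username.id, password: password.id, submit: submit.id };
}

const page: SpatialElement[] = [
  { id: 19, tag: "input", inputType: "text", label: "Username or email address" },
  { id: 21, tag: "input", inputType: "password", label: "Password" },
  { id: 34, tag: "button", inputType: null, label: "Sign in" },
];
const recipe = detectLogin(page); // { pageType: "Login", username: 19, password: 21, submit: 34 }
```

Because the rules are plain code, a wrong classification can be traced to a specific condition rather than to a prompt.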

Discovery

Hidden Content Exposure

Dropdown menus, modals, accordion panels, tab content — it's all in the HTML, just hidden by CSS. Browsy includes it with a hidden: true flag. Agents see the full page without executing JavaScript.
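One way to picture this: instead of dropping nodes the stylesheet hides, the emitter keeps them and tags them. The sketch below assumes a hypothetical StyledNode tree that already carries computed display and visibility values; only the hidden flag itself comes from the description above.

```typescript
// Hypothetical node shape carrying computed style values (an assumption for this sketch).
interface StyledNode {
  tag: string;
  text: string;
  display: string;    // computed CSS display value
  visibility: string; // computed CSS visibility value
  children: StyledNode[];
}

interface EmittedElement {
  tag: string;
  text: string;
  hidden: boolean;
}

// Emit every text-bearing node; mark it hidden rather than skipping it.
// display: none applies to the whole subtree, so it is passed down explicitly.
function emitAll(
  node: StyledNode,
  insideDisplayNone = false,
  out: EmittedElement[] = [],
): EmittedElement[] {
  const displayNone = insideDisplayNone || node.display === "none";
  const hidden = displayNone || node.visibility === "hidden";
  if (node.text !== "") out.push({ tag: node.tag, text: node.text, hidden });
  for (const child of node.children) emitAll(child, displayNone, out);
  return out;
}

const nav: StyledNode = {
  tag: "nav", text: "", display: "block", visibility: "visible",
  children: [
    { tag: "a", text: "Home", display: "inline", visibility: "visible", children: [] },
    {
      tag: "ul", text: "", display: "none", visibility: "visible",
      children: [
        { tag: "li", text: "Settings", display: "list-item", visibility: "visible", children: [] },
      ],
    },
  ],
};
const emitted = emitAll(nav);
// [{ tag: "a", text: "Home", hidden: false }, { tag: "li", text: "Settings", hidden: true }]
```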

Search

Built-in Web Search

Search DuckDuckGo or Google directly through browsy. Search and fetch the top N result pages in a single call — no separate search API needed.

Interaction

Session API

Navigate, click, type, select, go back, search by text or role. Full agent action vocabulary with cookie persistence and O(1) element lookup by ID.
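Constant-time lookup by ID usually means a hash map keyed on the element ID. A minimal sketch, assuming a hypothetical PageElement shape and ElementIndex class (neither is Browsy's API):

```typescript
// Hypothetical element record; field names are assumptions for illustration.
interface PageElement {
  id: number;
  tag: string;
  text: string;
}

// Index elements once per page load; each lookup is then a single Map access
// (constant time on average) instead of a tree walk.
class ElementIndex {
  private readonly byId = new Map<number, PageElement>();

  constructor(elements: PageElement[]) {
    for (const el of elements) this.byId.set(el.id, el);
  }

  get(id: number): PageElement | undefined {
    return this.byId.get(id);
  }
}

const index = new ElementIndex([
  { id: 19, tag: "input", text: "Username or email address" },
  { id: 34, tag: "button", text: "Sign in" },
]);
const signIn = index.get(34); // the "Sign in" button record
const missing = index.get(99); // undefined: no such element on this page
```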

Efficiency

Smart Deduplication

Real HTML is full of wrapper noise. Browsy detects and collapses redundant containers — 40% reduction on Hacker News, 42% on Wikipedia.
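As a rough illustration of wrapper collapsing, the sketch below treats a div or span with no text, no attributes, and exactly one child as redundant and replaces it with that child. The DomNode shape and this definition of "redundant" are assumptions; Browsy's actual rules are not described here.

```typescript
// Hypothetical DOM node shape (an assumption for this sketch).
interface DomNode {
  tag: string;
  text: string;
  attrs: Record<string, string>;
  children: DomNode[];
}

// Assumed rule: an attribute-free, text-free div/span with a single child adds nothing.
function isRedundantWrapper(n: DomNode): boolean {
  return (n.tag === "div" || n.tag === "span")
    && n.text === ""
    && Object.keys(n.attrs).length === 0
    && n.children.length === 1;
}

// Skip through chains of wrappers, then collapse the surviving node's children.
function collapse(node: DomNode): DomNode {
  let current = node;
  while (isRedundantWrapper(current)) {
    const [only] = current.children;
    if (only === undefined) break;
    current = only;
  }
  return { ...current, children: current.children.map(collapse) };
}

function countNodes(n: DomNode): number {
  return 1 + n.children.reduce((sum, child) => sum + countNodes(child), 0);
}

const link: DomNode = { tag: "a", text: "Read more", attrs: { href: "/post" }, children: [] };
const wrapped: DomNode = {
  tag: "div", text: "", attrs: {},
  children: [{ tag: "div", text: "", attrs: {}, children: [link] }],
};
const before = countNodes(wrapped);          // 3
const after = countNodes(collapse(wrapped)); // 1
```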

Sessions

Delta & Viewport Filtering

After first load, only changes are emitted. Filter to above-fold, below-fold, or visible-only elements. Dramatically reduces token cost for multi-step sessions.
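Delta output depends on element IDs staying stable between snapshots. A minimal sketch of the idea, assuming a hypothetical Box record with an id, text, and vertical position; the diffSnapshots and aboveFold names are illustrative, not Browsy's API.

```typescript
// Hypothetical per-element record (an assumption for this sketch).
interface Box {
  id: number;
  text: string;
  y: number; // top edge in CSS pixels
}

interface Delta {
  added: Box[];
  changed: Box[];
  removed: number[];
}

// Compare two snapshots by stable element ID and report only what differs.
function diffSnapshots(prev: Box[], next: Box[]): Delta {
  const remaining = new Map<number, Box>();
  for (const box of prev) remaining.set(box.id, box);

  const delta: Delta = { added: [], changed: [], removed: [] };
  for (const box of next) {
    const old = remaining.get(box.id);
    if (old === undefined) delta.added.push(box);
    else if (old.text !== box.text || old.y !== box.y) delta.changed.push(box);
    remaining.delete(box.id);
  }
  delta.removed = Array.from(remaining.keys()); // present before, gone now
  return delta;
}

// Viewport filter: keep elements whose top edge is above the fold.
function aboveFold(elements: Box[], viewportHeight: number): Box[] {
  return elements.filter(e => e.y < viewportHeight);
}

const firstLoad: Box[] = [
  { id: 1, text: "Inbox (2)", y: 40 },
  { id: 2, text: "Compose", y: 80 },
];
const afterClick: Box[] = [
  { id: 1, text: "Inbox (1)", y: 40 },
  { id: 3, text: "Footer", y: 1400 },
];
const delta = diffSnapshots(firstLoad, afterClick);
// added: id 3, changed: id 1, removed: [2]
const visibleNow = aboveFold(afterClick, 800); // only id 1
```

On a multi-step session the agent would receive the small delta rather than the whole page again.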

Smart

Behavior Detection

Detects interactive patterns from HTML alone — onclick handlers, Bootstrap toggles, ARIA controls. No JS execution needed. Agents see what's behind every dropdown without clicking.
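The signals named above are all plain attributes, so detection can be a lookup over an element's attribute map. This sketch is an assumption about how such a pass might look: the Behavior type and detectBehaviors function are hypothetical, while the attribute names (onclick, data-bs-toggle, data-toggle, aria-controls) are the standard HTML, Bootstrap, and ARIA ones.

```typescript
type Attrs = Record<string, string | undefined>;

// Hypothetical result shape (an assumption for this sketch).
interface Behavior {
  kind: "handler" | "toggle" | "controls";
  target: string | null;
}

function detectBehaviors(attrs: Attrs): Behavior[] {
  const found: Behavior[] = [];

  // Inline click handler: presence is enough, the script is never executed.
  if ("onclick" in attrs) found.push({ kind: "handler", target: null });

  // Bootstrap 5 uses data-bs-toggle; Bootstrap 3 and 4 use data-toggle.
  const toggle = attrs["data-bs-toggle"] ?? attrs["data-toggle"];
  if (toggle !== undefined) {
    const target = attrs["data-bs-target"] ?? attrs["data-target"] ?? attrs["href"] ?? null;
    found.push({ kind: "toggle", target });
  }

  // aria-controls names the id of the element this control shows or hides.
  const controls = attrs["aria-controls"];
  if (controls !== undefined) found.push({ kind: "controls", target: controls });

  return found;
}

const menuButton = detectBehaviors({
  "data-bs-toggle": "dropdown",
  "aria-controls": "user-menu",
});
// [{ kind: "toggle", target: null }, { kind: "controls", target: "user-menu" }]
```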

Detection

CAPTCHA & Overlay Awareness

Detects reCAPTCHA, hCaptcha, Cloudflare Turnstile, and image grid challenges from HTML structure. Cookie consent banners surfaced automatically. Your agent knows when it's blocked before wasting tokens.
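A simplified sketch of structure-based CAPTCHA detection. It scans raw HTML for the widget class names and script hosts each provider documents (g-recaptcha, h-captcha, cf-turnstile). Matching on a string is an assumption made to keep the example short; a real implementation would more likely inspect the parsed DOM, and image grid challenges are not covered here.

```typescript
type CaptchaKind = "reCAPTCHA" | "hCaptcha" | "Turnstile";

// Widget class names and script hosts used by each provider's standard embed.
const CAPTCHA_MARKERS: Array<[CaptchaKind, RegExp]> = [
  ["reCAPTCHA", /class="[^"]*\bg-recaptcha\b|google\.com\/recaptcha\//],
  ["hCaptcha", /class="[^"]*\bh-captcha\b|hcaptcha\.com\/1\/api\.js/],
  ["Turnstile", /class="[^"]*\bcf-turnstile\b|challenges\.cloudflare\.com\/turnstile\//],
];

// Return every CAPTCHA provider whose marker appears in the page source.
function detectCaptchas(html: string): CaptchaKind[] {
  return CAPTCHA_MARKERS
    .filter(([, pattern]) => pattern.test(html))
    .map(([kind]) => kind);
}

const blockedPage = '<form><div class="cf-turnstile" data-sitekey="..."></div></form>';
const found = detectCaptchas(blockedPage);     // ["Turnstile"]
const clean = detectCaptchas("<p>Welcome</p>"); // []
```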

Forms

Form Intelligence

Distinguishes registration, contact, login, and generic forms. Extracts field names, types, and labels. Download links identified with file extensions. 13 action types total — not just “it's a form.”

Deterministic

No LLM Variance

Page intelligence is computed, not inferred. The same HTML always produces the same Spatial DOM, the same page type, the same action recipe. Auditable, reproducible, debuggable.

Code

See it in action

Page intelligence from raw HTML. No browser, no LLM, no guessing.

Page intelligence in 200ms

Navigate to a login page. Browsy detects the page type, identifies the form fields, and gives your agent an action recipe — no LLM interpretation needed.

page_type tells you what you're looking at. suggested_actions tells you what to do. Element IDs are stable across sessions.

Hidden content included

Dropdown menus, modals, accordion panels — Browsy exposes everything Chrome's accessibility tree hides. Your agent sees the full page without executing JavaScript.

$ browsy fetch https://github.com/login
page_type: Login
suggested_actions:
Login { username: 19, password: 21, submit: 34 }

[19:input "Username or email address" @top-C]
[21:input "Password" @mid-C]
[34:button "Sign in" @mid-C]

// 203ms. No Chromium. No LLM needed.
// Your agent knows exactly what to do.
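If you consume this text output directly, each element line can be split into its parts. The sketch below infers the grammar from the three sample lines above only ([id:tag "label" @region]). Other element kinds may use syntax it does not handle, and the SpatialLine type and parseSpatialLine function are hypothetical helpers, not part of Browsy.

```typescript
// Hypothetical parsed form of one output line (an assumption for this sketch).
interface SpatialLine {
  id: number;
  tag: string;
  label: string;
  region: string;
}

// Grammar inferred from the sample output: [<id>:<tag> "<label>" @<region>]
const LINE_PATTERN = /^\[(\d+):([\w-]+) "([^"]*)" @([\w-]+)\]$/;

function parseSpatialLine(line: string): SpatialLine | null {
  const match = LINE_PATTERN.exec(line.trim());
  if (match === null) return null;
  const [, id, tag, label, region] = match;
  if (id === undefined || tag === undefined || label === undefined || region === undefined) {
    return null;
  }
  return { id: Number(id), tag, label, region };
}

const username = parseSpatialLine('[19:input "Username or email address" @top-C]');
// { id: 19, tag: "input", label: "Username or email address", region: "top-C" }
const notAnElement = parseSpatialLine("page_type: Login"); // null
```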

Who Needs This

Industries that run on server-rendered HTML

90% of the web pages agents interact with can be understood without a browser. These industries are the sweet spot.

Government & Public Sector

Benefits applications, permit filings, tax forms. Government portals are server-rendered HTML — browsy's sweet spot. Deterministic output means a clean audit trail.

Legal Tech

Court filing systems like PACER, state court portals, regulatory filings. Legacy server-rendered HTML with high compliance requirements. Auditable, deterministic output.

Healthcare & Insurance

Prior authorization forms, patient portals, insurance claim portals. HIPAA-friendly: browsy never renders PHI visually, never executes JS, never stores screenshots.

Financial Services

Banking portals, loan applications, KYC forms, SEC EDGAR filings. Extract mortgage rates from 50 banks hourly. No browser fingerprint reduces detection risk.

E-commerce at Scale

Product catalog extraction, price monitoring, inventory checking. At scale, Chromium instances cost $0.10/hour each. Browsy processes thousands of pages per minute on a single machine.

HR & Recruiting

Job board data extraction, application form filling across career portals. Apply to 50 matching job postings by filling out each employer's portal automatically.

FAQ

Frequently asked questions

Everything you need to know about Browsy and how it works.

What is Browsy?

Browsy is a zero-render browser engine built for AI agents. It parses HTML, computes layout with a custom CSS engine, and outputs a Spatial DOM with page intelligence — including page type detection, action recipes, and stable element IDs — all in about 200ms without launching a browser.

How fast is Browsy compared to Puppeteer?

Browsy is 26x faster than Chromium-based tools like Puppeteer and Playwright. It processes pages in approximately 203ms compared to 5+ seconds for browser-based approaches, because it skips rendering entirely and works directly with HTML.
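For reference, the 26x figure lines up with the slowest Chromium-based row in the benchmark table (agent-browser at ~5,377ms). Against the rows listed at roughly 5 seconds, the same arithmetic gives a slightly lower ratio. A quick check, using only the numbers from that table (the speedup helper is just for this calculation):

```typescript
// Load times in milliseconds, copied from the benchmark table.
const browsyMs = 203;
const agentBrowserMs = 5377;
const typicalChromiumMs = 5000; // the "~5s" rows

// Ratio of slow to fast, rounded to one decimal place.
const speedup = (slowMs: number, fastMs: number): number =>
  Math.round((slowMs / fastMs) * 10) / 10;

const vsAgentBrowser = speedup(agentBrowserMs, browsyMs); // 26.5
const vsTypical = speedup(typicalChromiumMs, browsyMs);   // 24.6
```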

Does Browsy need Chrome installed?

No. Browsy is a standalone 6MB binary with zero dependencies. It does not use Chromium, Puppeteer, Playwright, or any browser engine. HTML parsing (via html5ever) and a custom CSS engine, both written in Rust, are built in.

How do I install Browsy?

Install via your preferred package manager: npm install browsy-ai for JavaScript/TypeScript, pip install browsy-ai for Python, or cargo install browsy for Rust/CLI. Browsy also runs as an MCP server and REST API.

What can Browsy extract from web pages?

Browsy extracts a Spatial DOM with element positions, page type classification (Login, Search, Article, Dashboard, etc.), 13 action recipes with stable element IDs, hidden content from dropdowns and modals, form field detection, CAPTCHA awareness, and smart deduplication that reduces output by 34–42%.

Is Browsy free?

Yes. Browsy is open source under the MIT license and free forever. The source code is available on GitHub.

Does Browsy support stealth mode?

Browsy operates fundamentally differently from browser-based tools — it never launches a browser, so there is no browser fingerprint to detect. It fetches HTML directly and parses it locally, which inherently avoids browser fingerprinting and detection mechanisms that target automated Chromium instances.
