web-eval-agent

Official

Autonomous browser agent that evaluates and debugs your web app end-to-end from your IDE.

Unverified

stdio (local)

API key

Python

View repo 1.2k Website

Add to your client

Copy the config for your MCP client and paste it into its config file.

Install / run

curl -LSf https://operative.sh/install.sh -o install.sh && bash install.sh && rm install.sh

Paste into ~/Library/Application Support/Claude/claude_desktop_config.json (the Claude Desktop config file on macOS)

{
  "mcpServers": {
    "web-eval-agent": {
      "command": "uvx",
      "args": [
        "--refresh-package",
        "webEvalAgent",
        "--from",
        "git+https://github.com/Operative-Sh/web-eval-agent.git",
        "webEvalAgent"
      ],
      "env": {
        "OPERATIVE_API_KEY": "<YOUR_KEY>"
      }
    }
  }
}

Requires `uv` (the Python package runner). Install it from https://docs.astral.sh/uv/ if `uvx` is not found.
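If the config file already lists other MCP servers, pasting the whole block over it would drop them. The following is a minimal sketch, not part of the project, of merging the entry into an existing file instead. The helper names `merge_server` and `update_config_file` are hypothetical, and the entry is copied from the config above.

```python
import json
from pathlib import Path

# Server entry copied from the config above; <YOUR_KEY> is a placeholder.
WEB_EVAL_ENTRY = {
    "command": "uvx",
    "args": [
        "--refresh-package",
        "webEvalAgent",
        "--from",
        "git+https://github.com/Operative-Sh/web-eval-agent.git",
        "webEvalAgent",
    ],
    "env": {"OPERATIVE_API_KEY": "<YOUR_KEY>"},
}


def merge_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry stored under mcpServers[name].

    Other servers and unrelated top-level keys are preserved, and the
    input dictionary is not modified.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, api_key: str) -> None:
    """Hypothetical helper: read the file if it exists, merge, write back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    entry = {**WEB_EVAL_ENTRY, "env": {"OPERATIVE_API_KEY": api_key}}
    path.parent.mkdir(parents=True, exist_ok=True)
    merged = merge_server(existing, "web-eval-agent", entry)
    path.write_text(json.dumps(merged, indent=2))


if __name__ == "__main__":
    # Demonstrate the merge on an in-memory config with one existing server.
    demo = merge_server(
        {"mcpServers": {"other": {"command": "x"}}},
        "web-eval-agent",
        WEB_EVAL_ENTRY,
    )
    print(json.dumps(demo, indent=2))
```

To apply it for real, call `update_config_file` with the path of your client's config file and your own key, then restart the client so it picks up the new server.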


Before you start

A free `OPERATIVE_API_KEY` from operative.sh/mcp
uv (installed via `curl -LsSf https://astral.sh/uv/install.sh | sh`)
Playwright with Chromium (`npm install -g chromium playwright && uvx --with playwright playwright install --with-deps`)
A running web app to evaluate (e.g. http://localhost:3000)
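Two of these prerequisites are easy to check from a script before starting the server. The sketch below is an illustration, not part of the project: `missing_tools` and `app_is_reachable` are hypothetical helper names, and it only checks that `uvx` is on the PATH and that the app answers HTTP. It does not verify the API key or the Playwright browser install.

```python
import shutil
import urllib.error
import urllib.request


def missing_tools(tools=("uvx",), which=shutil.which) -> list[str]:
    """Return the command names from tools that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def app_is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if something answers HTTP at url, even with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a success status.
        return True
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, or a malformed URL.
        return False
```

For example, `missing_tools()` returns an empty list once uv is installed, and `app_is_reachable("http://localhost:3000")` tells you whether the example app address from the list above is serving requests.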

About web-eval-agent

An MCP server from operative.sh that autonomously evaluates web applications. It launches a BrowserUse/Playwright-powered agent that navigates your running web app, captures network traffic and console errors, and returns a UX report that a coding agent can use to debug its own changes. It installs via a one-click integration or a manual uvx setup, and is intended to be called from IDE chat in Cursor, Cline, or Windsurf.

Tools & capabilities (2)

web_eval_agent

Automated UX evaluator that drives the browser to perform a natural-language task, captures screenshots, console & network logs, and returns a rich UX report. Required args: url (address of the running app, e.g. http://localhost:3000) and task (what to test). Optional: headless_browser (default false; set true to hide the browser window).

setup_browser_state

Opens an interactive (non-headless) browser so you can sign in once; the saved cookies/local-storage are reused by subsequent web_eval_agent runs. Optional arg: url (page to open first, handy to land directly on a login screen).

What this server can do

web-eval-agent provides tools for these capabilities:

Automate a browser

When to use it

Have a coding agent automatically test that a feature it just wrote works end-to-end
Run through flows like signup or login and get a report of UX issues
Capture console errors and filtered network requests during an automated browser run
Sign in once and reuse the authenticated session across multiple evaluation runs

Security notes

Requires a free OPERATIVE_API_KEY from operative.sh/mcp, passed via the env block. The agent drives a real browser against your running app and can sign in via setup_browser_state, which persists cookies and local-storage locally for reuse. Note: this project has been discontinued by its maintainers.

web-eval-agent FAQ

Is this project still maintained?

No. The README states the project has been sunset/discontinued; the maintainers are building something new at withrefresh.com.

How do I get an API key?

Get a free API key at operative.sh/mcp. When you create it you also get an 'Add to Cursor' deeplink and a prefilled Claude Code command with the key included.

Updates aren't showing up in my editor. What do I do?

Run `uv cache clean` and refresh/restart the MCP server in your code editor to pull the latest version.

Does it run on Windows?

Yes, via the install.sh script and uvx, though the README notes Windows support is still being refined.

#browser #playwright #browseruse #qa #testing #debugging #ux #vibe-coding #mcp

Alternatives to web-eval-agent


Bright Data MCP

Browser Automation

5.0k

All-in-one web access MCP — Web Unlocker, SERP, Scraper API, and a cloud Scraping Browser.

Verified

stdio (local)

API key

JavaScript

12 tools

Updated 18 days ago

Playwright MCP (ExecuteAutomation)

Browser Automation

4.5k

Popular community Playwright + API testing MCP server with codegen, screenshots, and device emulation.

Verified

stdio (local)

No auth

TypeScript

12 tools

Updated 1 month ago

Browserbase MCP (Stagehand)

Browser Automation

3.0k

Official Browserbase cloud-browser MCP built on Stagehand — natural-language act/extract/observe.

Verified

stdio (local)

API key

TypeScript

8 tools

Updated 18 days ago
