Closing the Visual Feedback Loop in AI-Assisted UI Development

The standard UI development loop with an AI agent puts you in the middle: make a change, take a screenshot, paste it in, repeat. aieyes removes that step. The agent takes its own screenshots and sees what it produced.

The standard feedback loop in AI-assisted UI development looks something like this: you describe a change, the agent writes the code, you look at the browser, you take a screenshot, you paste it into the conversation. The agent looks at it and suggests the next change. You repeat this a dozen times to get a component right.

That manual step in the middle is not neutral. Every time you stop to take a screenshot and paste it back in, you're breaking your working context. You're also creating a gap where the agent is flying blind between changes: it produces output, waits, and has no signal about what that output actually looks like until you bring the information back to it.

I built aieyes to remove that gap. The agent takes its own screenshots and sees what it produced without you in the middle.


What it is

aieyes is a small MCP server that exposes two tools to Claude Code:

screenshot captures any URL and returns the image directly into the agent's context. It supports full-page capture, element-scoped capture via CSS selector, and custom viewport dimensions. The agent can use it exactly the way you would: navigate to a page, take a screenshot, look at it, and decide what to do next.

open_browser opens a URL in your system browser. This one sounds trivial but it fills a real gap: when the agent wants to show you something and have you confirm it, this is how it sends you there. The agent takes the screenshot for its own verification, then opens the browser for yours.

The screenshot tool is the one that changes how you work. The iteration loop that used to require you as an intermediary becomes autonomous. The agent can make a change, capture the result, compare it against what was asked for, and iterate, all within a single session without stopping to ask you to look at something.
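To make that concrete, a screenshot call from the agent's side is just a tool invocation carrying a URL and the optional parameters described above. The argument names below are illustrative; I have not reproduced the tool's exact schema:

{
  "url": "http://localhost:3000/settings",
  "selector": ".billing-card",
  "width": 1280,
  "height": 800
}

The agent gets the rendered result back in the same turn and can decide the next edit on its own.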


The foundation

The screenshot capability is entirely built on shot-scraper, Simon Willison's headless browser screenshot tool. aieyes is a thin MCP wrapper around it: when Claude calls screenshot, the server shells out to shot-scraper, which drives a headless Chromium instance via Playwright, and returns the resulting PNG base64-encoded into the conversation.
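A rough sketch of what that wrapper does, with simplified error handling and illustrative parameter names (this is not the actual index.js):

// Sketch only: run shot-scraper against a URL and hand the PNG back as base64.
const { execFileSync } = require("node:child_process");
const { readFileSync, unlinkSync } = require("node:fs");
const { join } = require("node:path");
const { tmpdir } = require("node:os");

const SHOT_SCRAPER = "/absolute/path/to/shot-scraper"; // hardcoded; see Setup below

function takeScreenshot(url, { selector, width, height } = {}) {
  const out = join(tmpdir(), `aieyes-${Date.now()}.png`);
  const args = [url, "-o", out];
  if (selector) args.push("--selector", selector);   // element-scoped capture
  if (width) args.push("--width", String(width));    // custom viewport
  if (height) args.push("--height", String(height));
  execFileSync(SHOT_SCRAPER, args);                   // shot-scraper drives headless Chromium via Playwright
  const png = readFileSync(out).toString("base64");   // this is what lands in the agent's context
  unlinkSync(out);
  return png;
}

Build a shot-scraper command line, run it, read the image back: that is the entire job of the server.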

I want to be explicit about this because it matters when something breaks. If a screenshot fails, the failure is almost certainly in shot-scraper or Playwright, not in aieyes itself. The tool is thin by design: one external dependency doing the real work, one MCP layer making it available to the agent.

The one honest friction point in the current setup: the path to shot-scraper is hardcoded in index.js. pip installs to a user-specific location that varies by machine and Python version, and there is no clean cross-platform way to resolve this automatically. The README tells you how to find the right path and where to update it. It is a one-time setup step, but it means the repo is not quite clone-and-run.
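Concretely, the one-time edit is a single constant; the path below is only an example of what which shot-scraper might print on a given machine:

// index.js — set this to the output of `which shot-scraper`
const SHOT_SCRAPER = "/Users/you/Library/Python/3.11/bin/shot-scraper"; // example; varies by machine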


How I actually use it

The primary use case is iterative UI work. When I'm building a component and want the agent to verify its own output visually, I'll note in the session that the screenshot tool is available. The agent will use it at natural checkpoints: after making a change, before declaring something done, when comparing two layout options.

The selector parameter is underused and worth calling out. You can scope a screenshot to a specific element, which means the agent can focus on the component it changed without capturing irrelevant page context. That's cleaner than a full-page screenshot when you're debugging something specific.

The viewport parameters matter for responsive work. A component that looks right at 1280px can break at 390px. The agent can take both, compare them, and address the mobile layout without you needing to manually resize your browser.
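In practice that comparison is just two calls with different viewport arguments, and either can be scoped with the selector from the previous paragraph (again, argument names are illustrative):

{ "url": "http://localhost:3000/pricing", "width": 1280, "height": 800 }
{ "url": "http://localhost:3000/pricing", "width": 390, "height": 844 }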


What it doesn't replace

aieyes is for visual output during development iteration. It is not a substitute for live DOM inspection.

If you need to check whether a class was applied at runtime, query computed styles, verify a selector against the actual rendered markup, or confirm that a third-party script didn't inject extra wrapper elements: that's work for the Chrome DevTools MCP, not aieyes. The DevTools protocol works against the rendered DOM and can answer questions that a screenshot can't, because screenshots show what something looks like, not why.

The two tools are complementary. aieyes tells the agent whether it looks right. Chrome DevTools tells the agent what's actually in the DOM when it doesn't look right. I use both in the same sessions without conflict.


Setup

Requirements: macOS, Node.js 18+, Python 3 with pip.

pip3 install shot-scraper    # the screenshot engine aieyes wraps
shot-scraper install         # installs the headless Chromium it drives
which shot-scraper           # note this path for the next step
git clone https://github.com/rocksoup/aieyes.git
cd aieyes
npm install

Update the SHOT_SCRAPER path in index.js to match the output of which shot-scraper, then register it as a global MCP server in ~/.claude.json:

{
  "mcpServers": {
    "aieyes": {
      "command": "node",
      "args": ["/absolute/path/to/aieyes/index.js"]
    }
  }
}

Restart Claude Code. The screenshot and open_browser tools will be available in every session.

