AI coding agents in 2026 can write full-stack applications, refactor legacy codebases, and ship production-ready features. But they have a glaring blind spot: they cannot see what they build. An agent can generate a perfectly valid React component, but it has no way to verify that the button is aligned, the modal renders correctly, or the page doesn’t throw a cascade of console errors.
ProofShot, an open-source CLI that hit 154 points and 96 comments on Hacker News within days of its Show HN launch, tackles this problem head-on. It gives any AI coding agent — Claude Code, Cursor, Codex, Gemini CLI, Windsurf, GitHub Copilot — the ability to open a real browser, interact with the page, record the session, capture errors, and bundle everything into a single reviewable HTML file.
No MCP server configuration. No cloud dependency. Just shell commands.
The Problem Every AI-Assisted Developer Knows
If you’ve used any AI coding agent for frontend work, you’ve experienced the loop: the agent writes code, you switch to the browser, you check if it looks right, you report back what’s broken, the agent fixes it, and you check again. It’s a tedious human-in-the-loop cycle that defeats the purpose of agent-driven development.
The root issue is that AI agents operate in a text-only world. They read code, write code, and parse terminal output. But UI development is inherently visual. A CSS rule that compiles without errors can still produce a layout that looks completely wrong. A JavaScript function that passes unit tests can still fail to render anything meaningful in the browser.
Several approaches have emerged to bridge this gap. Playwright MCP streams accessibility trees and screenshot data back into the LLM’s context window, but that consumes tokens rapidly and adds latency. Screenshot-based visual regression tools like Applitools and Percy are designed for CI/CD pipelines, not real-time agent workflows. Browser-use provides browser automation for AI agents but focuses on web scraping and task automation rather than development verification.
ProofShot takes a different approach: instead of trying to give the agent pixel-level understanding of a page, it gives the agent the tools to test and record what happens, then hands the proof to the human for final review.
How ProofShot Works: Start, Test, Stop
The workflow is deliberately simple — three commands that any shell-capable agent can execute.
Start wraps your dev server, opens a headless browser, and begins recording. The agent runs something like proofshot start --run "npm run dev" --port 3000. From this point, every browser action is captured on video, and server logs are being scanned.
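To make the "wrap the dev server" idea concrete, here is a minimal Python sketch of running a command as a child process and timestamping every line it prints. It is not ProofShot's implementation: the function name run_and_capture is hypothetical, and a short-lived Python process stands in for a real dev server such as npm run dev.

```python
import subprocess
import sys
import time


def run_and_capture(command):
    """Run a command and return (seconds_since_start, line) for each output line.

    Hypothetical helper for illustration; not part of ProofShot.
    """
    started = time.monotonic()
    captured = []
    # Merge stderr into stdout so errors appear in the same ordered stream.
    with subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    ) as process:
        for line in process.stdout:
            captured.append((time.monotonic() - started, line.rstrip("\n")))
    return captured


# Stand-in for a dev server: a short process that logs two lines and exits.
fake_server = [
    sys.executable,
    "-c",
    "print('listening on 3000'); print('Error: boom')",
]
log_lines = run_and_capture(fake_server)
```

A real dev server keeps running, so a wrapper along these lines would read the output on a background thread instead of waiting for the process to exit. The timestamps are what would let log lines be placed on the same timeline as the video.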
Test is where the agent drives the browser. ProofShot builds on Vercel Labs’ agent-browser, which the creator describes as “far better and faster than Playwright MCP”. The agent can navigate to pages, click buttons, fill forms, scroll, and interact with the UI, all through CLI commands that work inside any coding agent’s shell.
Stop wraps up the session. ProofShot trims dead time from the recording, then bundles the video, key-moment screenshots, console error logs, and server error reports into a self-contained HTML file. The developer opens this file and gets a synchronized timeline: video playback on one side, error logs and action markers on the other.
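The article does not say how ProofShot decides which parts of a recording count as dead time. One plausible approach, sketched below in Python purely as an assumption, is to keep a short window around each recorded action and drop everything else. The names Segment and keep_segments and the one-second padding are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the recording
    end: float


def keep_segments(action_times, duration, padding=1.0):
    """Return the spans worth keeping: `padding` seconds on either side of
    each action, with overlapping spans merged into one."""
    segments = []
    for t in sorted(action_times):
        start = max(0.0, t - padding)
        end = min(duration, t + padding)
        if segments and start <= segments[-1].end:
            # This window overlaps the previous one, so extend it.
            segments[-1].end = max(segments[-1].end, end)
        else:
            segments.append(Segment(start, end))
    return segments


# Four actions in a 40-second recording; the first two are close together.
actions = [2.0, 2.5, 10.0, 30.0]
kept = keep_segments(actions, duration=40.0)
trimmed_length = sum(s.end - s.start for s in kept)
```

With these inputs the 40-second recording collapses to three spans totaling 6.5 seconds, which is the kind of reduction that makes a proof video quick to review.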
The key design decision here is that the HTML proof file is entirely self-contained. No external dependencies, no viewer app to install, no cloud upload. You can email it, drop it in a PR comment, or store it in your project directory.
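A common way to make an HTML file self-contained is to inline binary assets as base64 data URIs. The Python sketch below shows that technique under stated assumptions: the article does not describe ProofShot's actual file layout, and the function names, MIME types, and markup here are illustrative only.

```python
import base64
import html


def data_uri(payload, mime_type):
    """Encode raw bytes as a data: URI that a browser can load inline."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_proof_html(video, screenshot, errors):
    """Assemble one HTML document with no external references."""
    video_src = data_uri(video, "video/webm")
    image_src = data_uri(screenshot, "image/png")
    # Escape log text so error messages cannot inject markup.
    error_items = "".join(f"<li>{html.escape(e)}</li>" for e in errors)
    return (
        "<!doctype html><html><body>"
        f'<video controls src="{video_src}"></video>'
        f'<img src="{image_src}">'
        f"<ul>{error_items}</ul>"
        "</body></html>"
    )


# Placeholder bytes stand in for a real recording and screenshot.
page = build_proof_html(
    b"fake-video", b"fake-png", ["TypeError: x is <undefined>"]
)
```

The trade-off is size: base64 adds roughly a third to each asset, which is usually acceptable for short, trimmed recordings that need to travel as a single file.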
Multi-Language Server Log Scanning
One of ProofShot’s most underrated features is its server-side error detection. The tool doesn’t just capture browser console errors. It also scans your development server’s output for error patterns across 10+ programming languages, including:
- JavaScript/Node.js — catches unhandled rejections, Express errors, and stack traces
- Python — detects Django/Flask tracebacks and runtime errors
- Ruby/Rails — picks up ActiveRecord errors and routing failures
- Go — matches panic traces and HTTP error responses
- Java/Kotlin — identifies Spring Boot exceptions and NullPointerExceptions
- Rust — catches panic messages and unwrap failures
- PHP — detects Laravel/Symfony error outputs
- C#/.NET — matches ASP.NET Core exception logs
- Elixir/Phoenix — captures GenServer crashes and Ecto errors
This matters because many frontend bugs originate on the backend. A component might render a blank screen not because of a CSS issue but because the API endpoint is throwing a 500 error. ProofShot captures both sides of the stack in a single proof artifact.
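Pattern-based log scanning of this kind can be sketched in a few lines of Python. The regular expressions below match well-known error formats from a handful of runtimes, but they are illustrative assumptions, not ProofShot's actual pattern set, and the names ERROR_PATTERNS and scan_log are hypothetical.

```python
import re

# Illustrative patterns only; a real scanner would cover far more formats.
ERROR_PATTERNS = {
    "python": re.compile(r"^Traceback \(most recent call last\):"),
    "node": re.compile(r"UnhandledPromiseRejection|^\w*Error: "),
    "go": re.compile(r"^panic: "),
    "rust": re.compile(r"^thread '.*' panicked at"),
    "java": re.compile(r"Exception in thread|java\.lang\.\w+Exception"),
}


def scan_log(lines):
    """Return (line_number, language, line) for each line matching a pattern."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for language, pattern in ERROR_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, language, line))
                break  # Report each line once, under the first match.
    return findings


sample = [
    "GET /api/users 200 12ms",
    "Traceback (most recent call last):",
    "panic: runtime error: index out of range [3] with length 3",
    "thread 'main' panicked at src/main.rs:4:5:",
    "TypeError: Cannot read properties of undefined (reading 'map')",
]
findings = scan_log(sample)
```

On this sample the scanner flags lines 2 through 5 and ignores the ordinary request log on line 1. Pattern matching like this is cheap and language-agnostic, at the cost of occasional false positives when normal output happens to resemble an error.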
ProofShot vs. the Alternatives
The visual verification space for AI agents is still young, and there’s no single dominant approach. Here’s how ProofShot stacks up against the main alternatives:
Playwright MCP (Microsoft) is probably the most direct comparison. It provides browser automation through the Model Context Protocol, letting agents control a browser via structured commands. The difference: Playwright MCP streams data back into the LLM’s context window, consuming tokens with every interaction. ProofShot saves everything to disk and produces a proof bundle. The agent doesn’t need to “understand” the visual output; it only needs to record it for the human to review. Because the recorded artifacts stay on disk rather than in the context window, ProofShot is the more token-efficient of the two.
Playwright CLI (also from Microsoft, launched in early 2026) takes a similar disk-first approach to reduce token consumption. But it’s focused on testing automation rather than proof-of-work recording. ProofShot’s video recording and HTML proof bundle are unique to its approach.
Vercel’s agent-browser is what ProofShot is built on top of. Agent-browser provides the underlying browser automation capabilities, while ProofShot adds the recording, error collection, dead-time trimming, and proof bundling layer. Think of agent-browser as the engine and ProofShot as the complete car.
Traditional visual regression tools (Applitools, Percy, BackstopJS) are designed for comparing screenshots against baselines in CI pipelines. They’re powerful for regression testing but aren’t built for the real-time agent development loop that ProofShot targets.
The bottom line: if you want your AI agent to produce verifiable evidence that the code it wrote actually works in a browser, ProofShot is currently the most focused tool for that specific job.
Why the Hacker News Community Responded
ProofShot’s Show HN thread generated 96 comments — unusually high engagement for a developer tool launch. Several themes emerged from the discussion.
Developers appreciated the “proof artifact” concept. Instead of asking the AI agent to describe what it sees (which is unreliable) or trusting it blindly (which is risky), ProofShot produces tangible evidence that a human can audit. One commenter compared it to a “build receipt” — the agent doesn’t just claim the feature works, it shows you.
The agent-agnostic design was another draw. ProofShot doesn’t require a specific IDE, a specific agent, or a specific framework. If your agent can run shell commands, it can use ProofShot. This matters in a market where developers are frequently switching between agents or using multiple agents for different tasks.
Some commenters questioned the approach, asking whether agents should eventually develop their own visual understanding rather than relying on recorded proof, and whether the human review step would become a bottleneck at scale. These are valid concerns, but for the current state of AI coding agents in 2026, the “trust but verify” approach fills a real gap.
The Neuron featured ProofShot in its March 25 AI daily briefing, and DEV Community listed it among the week’s most important AI tool releases — both signs that the project resonated beyond the Hacker News audience.
Getting Started
Installation is two commands:
- Install the CLI and agent-browser (includes headless Chromium):
  npm install -g proofshot
- Auto-detect your AI coding tools and install the ProofShot skill:
  proofshot install
The second command scans for installed agents and configures ProofShot at the user level, so it works across all your projects without per-project setup.
ProofShot is fully open-source under the MIT license. There’s no paid tier, no cloud service, no telemetry. The entire tool runs locally on your machine.
FAQ
Is ProofShot free?
Yes. ProofShot is open-source under the MIT license with no paid plans, no cloud service, and no usage limits. Everything runs locally.
Which AI coding agents does ProofShot support?
Any agent that can execute shell commands. This includes Claude Code, Cursor, Codex, Gemini CLI, Windsurf, GitHub Copilot, Cline, and others. The tool is completely agent-agnostic.
How does ProofShot compare to Playwright MCP?
Playwright MCP streams browser data back into the LLM context window, which consumes tokens quickly. ProofShot saves everything to disk and produces a self-contained HTML proof file, making it more token-efficient. ProofShot is built on Vercel Labs’ agent-browser, which its creator describes as faster than Playwright MCP for agent use cases.
Does ProofShot work with backend-only projects?
ProofShot is designed for projects with a visual UI component — web apps, SPAs, server-rendered pages. For pure API or backend projects, traditional testing tools are a better fit. However, ProofShot’s server log scanning can still catch backend errors during frontend testing sessions.
What languages does ProofShot support for server log scanning?
ProofShot scans server output for error patterns in JavaScript/Node.js, Python, Ruby/Rails, Go, Java/Kotlin, Rust, PHP, C#/.NET, Elixir/Phoenix, and more. It uses pattern matching to detect common error formats across these ecosystems.