Top AI Product

We track trending AI tools across Product Hunt, Hacker News, GitHub, and more, then write honest, opinionated takes on the ones that actually matter. No press releases, no sponsored content. Just real picks, published daily. Subscribe to stay ahead without drowning in hype.


OculOS Turns Every Desktop App Into a JSON API, and It’s Exactly What AI Agents Needed

There’s a certain kind of project that makes you stop scrolling and think “wait, why didn’t this exist already?” [OculOS](https://github.com/huseyinstif/oculos) is one of those. It’s a tiny Rust daemon — around 3 MB — that reads your operating system’s accessibility tree and turns every button, text field, checkbox, and menu item on your screen into a REST API endpoint. No screenshots, no pixel coordinates, no browser extensions. Just structured JSON.

I spotted it on [Show HN](https://news.ycombinator.com/item?id=47235427) earlier this month, where it was posted with the tagline “Any desktop app as a JSON API via OS accessibility tree,” and it immediately clicked. The idea is dead simple: OculOS assigns each UI element a session-scoped UUID, and you interact with it through standard HTTP calls. Want to click a button? `POST /interact/{id}/click`. Need to type into a search box? `POST /interact/{id}/set-text`. The API even tells you what actions are available for each element, so you’re never guessing.
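To make that concrete, here’s a minimal sketch of what driving those endpoints from Python might look like. The endpoint paths and the default port are from the project’s description; the request body shape for `set-text` is my assumption, so check the repo’s docs before relying on it:

```python
import json
import urllib.request

BASE = "http://127.0.0.1:7878"  # OculOS's default local address

def interact_url(element_id: str, action: str) -> str:
    """Build the endpoint URL for an element action, e.g. click or set-text."""
    return f"{BASE}/interact/{element_id}/{action}"

def click(element_id: str) -> urllib.request.Request:
    """POST /interact/{id}/click -- a click needs no request body."""
    return urllib.request.Request(interact_url(element_id, "click"), method="POST")

def set_text(element_id: str, text: str) -> urllib.request.Request:
    """POST /interact/{id}/set-text -- the JSON payload shape is assumed."""
    body = json.dumps({"text": text}).encode()
    return urllib.request.Request(
        interact_url(element_id, "set-text"),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Against a running daemon, you'd fire a request like this:
# with urllib.request.urlopen(click("some-element-uuid")) as resp:
#     print(resp.status)
```

The element UUIDs themselves would come from first querying the daemon’s element tree, which is also what the dashboard’s inspector surfaces.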

What really got people talking, though, is the built-in MCP server mode. Launch it with `--mcp` and suddenly Claude, Cursor, or Windsurf can control any desktop app out of the box. The repo shows a demo of Claude Code autonomously opening Spotify, searching for a song, and hitting play, all without a single screenshot or vision model involved. It’s fast, deterministic, and doesn’t need a GPU.
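For Claude Desktop, wiring that up would presumably look like the usual MCP server entry in `claude_desktop_config.json`. The config format below is the standard MCP one; the binary name `oculos` is my assumption, so check the repo for the exact command and flags:

```json
{
  "mcpServers": {
    "oculos": {
      "command": "oculos",
      "args": ["--mcp"]
    }
  }
}
```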

The dashboard is a nice touch too. Hit `http://127.0.0.1:7878` and you get a web UI with a live element tree, an inspector, and an interaction recorder that can export your actions as Python, JavaScript, or curl commands. Super handy for prototyping automation scripts before handing them off to an agent.

It runs on Windows, Linux, and macOS, each using native accessibility backends — UI Automation on Windows, AT-SPI2 on Linux, and the AXUIElement API on macOS. The whole thing is MIT licensed and has zero external dependencies beyond the Rust standard library and OS-level APIs. If you’ve been looking for a way to give AI agents actual hands on the desktop without the fragility of vision-based approaches, [OculOS](https://github.com/huseyinstif/oculos) is worth a serious look.




