Top AI Product



Parlor puts real-time voice and vision AI on your laptop — 2.6 GB, no cloud, no API keys

Remember when OpenAI demo’d GPT-4o voice mode and everyone lost their minds? Camera on, voice flowing, AI responding in real time. Cool — except it runs on OpenAI’s servers, costs money, and sends your data to the cloud.

Parlor does the same thing on your MacBook. Entirely local.

What Parlor Actually Does

You open a browser tab, grant mic and camera access, and start talking. Parlor sees what your camera sees, hears what you say, and talks back — all processed on your machine. No push-to-talk button. No waiting for a full sentence before processing. You can even interrupt the AI mid-sentence, just like a real conversation.
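Interrupting the AI mid-sentence implies the server treats speech playback as something it can cancel the moment new voice activity arrives. Here is a minimal asyncio sketch of that barge-in pattern; the `Speaker` class and its method names are my own illustration, not Parlor's actual API:

```python
import asyncio


class Speaker:
    """Toy model of interruptible speech: speaking runs as a cancellable task."""

    def __init__(self):
        self._task = None
        self.spoken = []  # words actually emitted before any interruption

    async def _speak(self, words):
        for w in words:
            self.spoken.append(w)
            await asyncio.sleep(0.01)  # stands in for streaming TTS audio out

    def say(self, words):
        self._task = asyncio.create_task(self._speak(words))

    def interrupt(self):
        # User started talking: cancel playback mid-sentence.
        if self._task and not self._task.done():
            self._task.cancel()


async def demo():
    s = Speaker()
    s.say(["the", "quick", "brown", "fox", "jumps"])
    await asyncio.sleep(0.025)  # user barges in partway through
    s.interrupt()
    await asyncio.sleep(0.05)   # give the cancelled task time to unwind
    return s.spoken
```

Running `asyncio.run(demo())` returns only the words emitted before the interruption, mirroring how a real pipeline would stop audio output the instant the user speaks.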

The stack is surprisingly lean. Google’s Gemma 4 E2B handles both speech recognition and visual understanding — a single 2-billion-parameter model doing work that used to require three separate models. Kokoro handles text-to-speech via MLX on Mac. A FastAPI server ties everything together over WebSocket, feeding PCM audio and JPEG frames from your browser to the models. Total download: about 2.6 GB. Runs in real time on an M3 Pro.

Why This Matters Right Now

Gemma 4 E2B launched under Apache 2.0 in early 2026, and Parlor is one of the first real applications proving what this model can do on consumer hardware. Google designed E2B specifically for edge — native audio input, native vision, under 1.5 GB memory footprint with quantization. But specs on a model card don’t move people. A working demo where you talk to your laptop and it talks back? That moves people.
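The sub-1.5 GB figure is plausible on a back-of-envelope basis. This is my own arithmetic, not Google's published breakdown:

```python
# ~2 billion parameters at 4-bit quantization (0.5 bytes per parameter)
params = 2_000_000_000
bytes_per_param = 0.5
weights_gb = params * bytes_per_param / 1024**3
print(round(weights_gb, 2))  # ~0.93 GB for weights alone
```

That leaves roughly half a gigabyte of the budget for activations and KV cache, which is tight but workable for short conversational contexts.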

265 points on Show HN. 31 comments. Top 6 on bestofshowhn.com’s April rankings. The local-AI and privacy-first communities ate this up.

How It Compares

OpenAI’s voice mode and Google’s Gemini Live are the obvious benchmarks — but both are cloud-only, proprietary, and metered. Pipecat is an open-source framework for multimodal conversational AI, but it still depends on cloud LLMs for the heavy lifting. Home Assistant has voice, but no vision.

Parlor is the only project right now shipping real-time voice + vision + voice output in a single local package. It’s labeled a “research preview” and the rough edges are real — but as a proof of concept for what edge multimodal AI looks like in 2026, it’s hard to beat.






