Remember that GPT-4o voice demo? Camera on, talking naturally, AI responding in real time. Impressive — except it runs on OpenAI’s servers, costs money per minute, and every frame of your face goes to the cloud.
Parlor does the same thing on an M3 Pro. Entirely local. 266 points on Hacker News in a day.
One Model Does Three Jobs
The trick is Gemma 4 E2B. Google’s 2-billion parameter edge model handles speech recognition, visual understanding, and language generation in a single pass — work that used to require three separate models stitched together. Kokoro-82M handles text-to-speech via MLX on Mac. A FastAPI server ties everything together over WebSocket, streaming PCM audio and JPEG frames between your browser and the models.
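Running both media types over one WebSocket means the server has to tell audio chunks apart from camera frames. Parlor's exact wire format isn't published, so this is a minimal sketch of one plausible approach: JPEG data always begins with the FF D8 start-of-image marker, so anything else can be treated as raw 16-bit PCM. The function names here are illustrative, not Parlor's.

```python
# Hypothetical demultiplexer for a single WebSocket stream carrying both
# 16-bit mono PCM audio chunks and JPEG camera frames. The routing rule
# (JPEG magic bytes vs. everything else) is an assumption for illustration.

def classify_frame(payload: bytes) -> str:
    """Route a binary WebSocket message to the right pipeline."""
    if payload[:2] == b"\xff\xd8":   # JPEG start-of-image marker
        return "vision"              # -> image understanding
    return "audio"                   # -> speech-recognition buffer

def pcm_duration_ms(payload: bytes, sample_rate: int = 16_000) -> float:
    """Duration of a mono 16-bit PCM chunk, e.g. for VAD windowing."""
    samples = len(payload) // 2      # 2 bytes per sample
    return samples / sample_rate * 1000
```

A real handler would call `classify_frame` on each incoming binary message inside the FastAPI WebSocket loop and append audio to a rolling buffer while keeping only the most recent camera frame.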
Total download: about 2.6 GB. End-to-end latency on an M3 Pro: 2.5–3 seconds. Decoding runs at 83 tokens per second on GPU. Not instant, but fast enough that conversations feel natural.
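The decode rate explains where much of that latency goes. At the quoted 83 tokens per second, generation alone takes over a second for any non-trivial reply (the reply length below is an assumption for illustration; the rest of the budget is transcription, prefill, and TTS):

```python
# Back-of-the-envelope decode time at the quoted throughput.
TOKENS_PER_SECOND = 83
reply_tokens = 120                            # illustrative reply length
decode_s = reply_tokens / TOKENS_PER_SECOND   # ~1.45 s of the 2.5-3 s budget
```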
No push-to-talk. You just talk, and Parlor listens through voice activity detection. You can interrupt mid-sentence — it stops and responds to whatever you just said. Hands-free, like an actual conversation.
Why It Blew Up This Week
Timing. Gemma 4 launched April 2 under Apache 2.0. Parlor showed up on Show HN three days later as one of the first real demos proving E2B actually works on consumer hardware. The r/LocalLLaMA crowd and privacy-first developers have been waiting for exactly this: a multimodal model small enough to run on a laptop that actually does something useful.
The backstory is interesting too. The creator built Parlor to eliminate server costs for Bule AI, a free language-learning platform. Self-hosting an RTX 5090 at home wasn’t scaling. A 2.6 GB local model that runs on every Mac? That scales.
Cloud Voice Assistants Have a Problem
OpenAI’s voice mode, Gemini Live, Apple Intelligence — all cloud-dependent, all metered, all sending your data somewhere else. Pipecat is open-source but still relies on cloud LLMs for inference. Home Assistant has voice but no vision.
Parlor is the only project shipping real-time voice + vision + voice output in a single local package. It’s a research preview with rough edges — but 777 GitHub stars in 48 hours says something about demand for AI that doesn’t phone home.