Top AI Product

We track trending AI tools across Product Hunt, Hacker News, GitHub, and more — then write honest, opinionated takes on the ones that actually matter. No press releases, no sponsored content. Just real picks, published daily. Subscribe to stay ahead without drowning in hype.


685 Hacker News Upvotes in One Day: Why CanIRun.ai Struck a Nerve with Local AI Enthusiasts

Running AI models locally has gone from niche hobby to mainstream ambition. But there’s a persistent, annoying gap between downloading a model and finding out your hardware can’t actually run it. CanIRun.ai — a free, browser-based tool that detects your GPU, CPU, and RAM to tell you exactly which models your machine can handle — landed on Hacker News on March 13, 2026 and racked up 685 points and nearly 200 comments within hours. The timing wasn’t accidental. With local AI adoption accelerating and new models dropping weekly, the question “can my machine run this?” has never been more common — or more frustrating to answer.

How CanIRun.ai Actually Works

The tool runs entirely in your browser. No downloads, no account creation, no data uploaded to any server. Under the hood, it creates a hidden WebGL canvas and queries the WEBGL_debug_renderer_info extension to identify your exact GPU model and vendor. It then cross-references your hardware against a built-in database of roughly 40 GPUs (NVIDIA, AMD, Intel) and 12 Apple Silicon chips, each with stored VRAM capacity and memory bandwidth figures.

Once your hardware is identified, CanIRun.ai evaluates a catalog of popular open-source models — from tiny ones like TinyLlama 1.1B up to frontier-class heavyweights like DeepSeek-V3 (671B parameters) and GPT-OSS 120B. Every model is assessed using Q4_K_M quantization as the baseline, which the docs describe as the “best balance of size and quality” at roughly 88% quality retention. The result is a tier rating for each model on your specific hardware:

  • S tier — Runs great
  • A tier — Runs well
  • B tier — Decent
  • C tier — Tight fit
  • D tier — Barely runs
  • F tier — Too heavy

The scoring factors in both VRAM capacity (can the quantized model fit in memory?) and memory bandwidth (how fast will it generate tokens?). For context, the docs note that an RTX 4090 delivers 1,008 GB/s bandwidth while an M4 Pro sits at 273 GB/s — a difference that translates directly into tokens-per-second output.
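That two-factor scoring can be sketched as a back-of-envelope calculation. The byte-per-weight constant and the bandwidth-bound speed formula below are common rules of thumb, not CanIRun.ai's actual scoring code:

```python
# Rough sketch of the two checks described above. The constant is an
# approximation (Q4_K_M averages roughly 4.5 bits per weight); real tools
# also budget for KV cache, activations, and runtime buffers.
Q4_BYTES_PER_PARAM = 0.56

def model_size_gb(params_billion: float) -> float:
    """Approximate in-memory size of a Q4_K_M-quantized model."""
    return params_billion * Q4_BYTES_PER_PARAM

def fits_in_vram(params_billion: float, vram_gb: float) -> bool:
    """Check 1: does the quantized model fit in memory?"""
    return model_size_gb(params_billion) <= vram_gb

def est_tokens_per_sec(params_billion: float, bandwidth_gbps: float) -> float:
    """Check 2: generation is memory-bandwidth-bound, since each new token
    streams the full weight set through memory once."""
    return bandwidth_gbps / model_size_gb(params_billion)

# A 7B model: RTX 4090 (1,008 GB/s) vs. M4 Pro (273 GB/s)
print(fits_in_vram(7, 24))                   # fits on a 24 GB card
print(round(est_tokens_per_sec(7, 1008)))    # ~257 theoretical tok/s
print(round(est_tokens_per_sec(7, 273)))     # ~70 theoretical tok/s
```

Real-world throughput lands below this theoretical ceiling, but the ratio between the two machines holds — which is why the docs' bandwidth figures translate so directly into tokens-per-second differences.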

There’s also a Python CLI companion (pip install canirun) built for more programmatic use. It fetches model configs from Hugging Face Hub, calculates parameter memory requirements, estimates KV cache sizes, and checks compatibility across multiple quantization levels (2-bit, 4-bit, 8-bit, 16-bit). It’s currently optimized for standard Transformer architectures like Llama, Mistral, and Gemma, with experimental support for MoE models.
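A minimal sketch of the kind of math the CLI performs might look like the following. The parameter names mirror common Hugging Face config.json fields, but the formulas are generic transformer memory estimates, not the package's actual implementation:

```python
def param_memory_gb(params_billion: float, bits: int) -> float:
    """Weight memory: parameter count times bits per weight."""
    return params_billion * bits / 8

def kv_cache_gb(num_layers: int, num_kv_heads: int, head_dim: int,
                context_len: int, bytes_per_elem: int = 2) -> float:
    """KV cache: a K and a V tensor for every layer, fp16 by default.
    GQA models (num_kv_heads < num_attention_heads) shrink this a lot."""
    total_bytes = (2 * num_layers * num_kv_heads * head_dim
                   * context_len * bytes_per_elem)
    return total_bytes / 1e9

# A Llama-3-8B-style config: 32 layers, 8 KV heads (GQA), head_dim 128
for bits in (2, 4, 8, 16):
    weights = param_memory_gb(8, bits)
    kv = kv_cache_gb(32, 8, 128, 8192)
    print(f"{bits}-bit: {weights:.1f} GB weights + {kv:.2f} GB KV @ 8k ctx")
```

Scanning the quantization levels this way is what lets the CLI report, for a given machine, which precision (if any) brings a model within budget.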

What the Hacker News Community Actually Said

The comment section told a more nuanced story than the upvote count alone. Several themes emerged from the nearly 200 comments:

Conservative estimates. Multiple users reported that CanIRun.ai underestimates what their hardware can actually do. One commenter noted they could run GPT-OSS 120B at around 40 tokens per second on their setup, yet the site rated it as incompatible. The tool’s calculations don’t always account for partial offloading strategies or optimized inference runtimes that squeeze more out of available memory.

The MoE blind spot. A recurring criticism was that the tool doesn’t properly handle Mixture-of-Experts models. MoE architectures like Mixtral 8x7B only activate a subset of parameters per token (roughly 12.9B of 46.7B total), meaning they need the full VRAM to store the model but use less compute per inference step. CanIRun.ai’s current approach treats them more like dense models, which leads to overly pessimistic ratings.
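The distinction matters numerically. Using the Mixtral figures above (and an illustrative ~0.5 bytes per 4-bit weight, ignoring overhead), an MoE-aware estimate would split the two questions like this:

```python
BYTES_PER_PARAM_Q4 = 0.5  # illustrative 4-bit figure, ignoring overhead

def moe_vram_gb(total_params_b: float) -> float:
    """VRAM need follows TOTAL parameters: all experts stay resident."""
    return total_params_b * BYTES_PER_PARAM_Q4

def moe_est_tokens_per_sec(active_params_b: float,
                           bandwidth_gbps: float) -> float:
    """Speed follows ACTIVE parameters: only routed experts stream per token."""
    return bandwidth_gbps / (active_params_b * BYTES_PER_PARAM_Q4)

# Mixtral 8x7B: 46.7B total, ~12.9B active per token, on a 1,008 GB/s GPU
print(round(moe_vram_gb(46.7)))                   # ~23 GB must fit...
print(round(moe_est_tokens_per_sec(12.9, 1008)))  # ...but it streams like a ~13B model
```

Scoring speed off the 46.7B total instead of the 12.9B active set is exactly the kind of dense-model assumption that produces the pessimistic ratings commenters described.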

Hardware buying guide. Despite the accuracy quibbles, the consensus was that CanIRun.ai is most useful as a pre-purchase research tool. If you’re deciding between an RTX 4060 and an RTX 4090, or debating whether to spring for the 128GB MacBook Pro, the tier list gives you a quick visual of what each option unlocks.

Model recommendations. The thread became an impromptu model recommendation engine, with Qwen 3.5 9B emerging as a community favorite for local use — praised for tool use, information extraction, and solid performance on modest hardware.

CanIRun.ai vs. the Competition

CanIRun.ai isn’t the only tool trying to solve this problem, but it takes a distinct approach. Here’s how the landscape looks:

Can I Use LLM (caniusellm.com) offers a similar browser-based check. You select your hardware manually from dropdown menus, which gives it an edge in accuracy but sacrifices the instant auto-detection that makes CanIRun.ai feel frictionless.

WhatModelsCanIRun.com takes a VRAM-centric approach — you plug in your available VRAM and it shows compatible models. Simpler, but it ignores bandwidth entirely, which matters a lot for actual inference speed.

VRAM calculators on Hugging Face and other sites let you calculate memory requirements for specific models with adjustable quantization and batch size parameters. These are more granular but require you to already know which model you want to check — they don’t give you the full landscape view.

llmfit and llm-checker are CLI tools that scan your hardware and score models across multiple dimensions (quality, speed, fit, context). llm-checker even integrates with Ollama for direct model management. These are more powerful for technical users but have a higher setup barrier.

Where CanIRun.ai wins is the zero-friction entry point: open a browser tab, get instant results, no installation required. Where it loses is in the precision that more specialized tools offer, particularly around MoE models, partial offloading, and real-world inference optimization. The privacy angle — zero data leaving your machine — is a genuine differentiator too, given that some competing tools require you to share hardware specs with a server.

The Bigger Picture: Why This Tool Resonated

The 685-point Hacker News response wasn’t just about one tool. It reflected a growing frustration in the local AI community. Model releases have outpaced tooling. Every week brings a new 7B, 32B, or 70B model, each with different architecture quirks, quantization options, and hardware requirements. The gap between “this model exists” and “this model works on my machine” is filled with trial, error, Reddit threads, and guesswork.

CanIRun.ai simplified that to a single browser visit. It’s the kind of utility that should have existed earlier — similar in spirit to the classic “Can You RUN It?” (from System Requirements Lab) that PC gamers have used for years to check game compatibility. The AI equivalent was overdue.

The tool was built for the local AI community and promoted by Miguel Ángel Durán (midudev), a well-known Spanish developer and content creator with a large following across YouTube, Threads, and X. His Threads post sharing the tool also contributed to its viral spread beyond the English-speaking developer community.

At a practical level, local AI is reaching an inflection point. Apple Silicon machines with 64-128GB unified memory can now handle models that were server-only a year ago. NVIDIA’s RTX 5090 pushes consumer-grade bandwidth to 1,792 GB/s. AMD’s Ryzen AI Max+ lineup adds another option. The hardware is there — what was missing was a dead-simple way to match it to the right model. That’s the gap CanIRun.ai fills, even if imperfectly.

FAQ

Is CanIRun.ai free?
Yes, completely free with no account required. The tool runs in your browser using WebGL APIs to detect hardware. No data is sent to any server, so there are no privacy concerns or usage limits.

How accurate are the CanIRun.ai ratings?
The ratings are useful as a general guide but tend to be conservative. Community feedback on Hacker News indicates the tool sometimes underestimates what hardware can handle, particularly for Mixture-of-Experts models and setups using optimized inference engines like llama.cpp with partial GPU offloading. Think of it as a floor estimate rather than a ceiling.

What GPUs and chips does CanIRun.ai support?
The built-in database covers approximately 40 GPUs from NVIDIA, AMD, and Intel, plus 12 Apple Silicon chips. This includes popular consumer cards like the RTX 4060, 4090, and 5090, as well as M-series chips from M1 through M4. If your specific GPU isn’t in the database, the tool may not return accurate results.

How does CanIRun.ai compare to using Ollama or LM Studio directly?
Ollama and LM Studio are inference runtimes — you need to download models to find out if they run well. CanIRun.ai answers the “should I even bother?” question before you commit bandwidth and storage. They’re complementary: use CanIRun.ai to identify candidates, then download and run them in your preferred local runtime.

Does CanIRun.ai work for models not in its catalog?
The web tool only evaluates models in its built-in catalog. For Hugging Face models not listed on the site, the Python CLI tool (pip install canirun) can estimate requirements for any model hosted on the Hub, as long as it uses a supported architecture (Llama, Mistral, Gemma, BERT, and experimental MoE support).

