So here’s something wild that just blew up on GitHub — a developer named Manjeet Singh went ahead and reverse-engineered Apple’s Neural Engine to make it do something Apple never intended: train neural networks. The project is called [ANE](https://github.com/maderix/ANE), and it’s sitting at around 5k stars after hitting [GitHub Trending #11 on trendshift.io](https://trendshift.io). For good reason.
If you’re not familiar, the Apple Neural Engine is that dedicated AI accelerator baked into every Apple Silicon chip. It’s capable of up to 15.8 TFLOPS on the M4, but Apple has always kept it locked down for inference only — you run models on it, you don’t train them there. CoreML handles the high-level stuff and there’s no public API that lets you do backpropagation on ANE hardware. Well, there wasn’t.
What Manjeet did was bypass CoreML entirely and talk directly to the ANE through private, undocumented APIs — `_ANEClient`, `_ANECompiler`, `_ANEInMemoryModelDescriptor` — discovered via runtime introspection. The whole thing is written in Objective-C, and he mapped the entire software stack from CoreML all the way down to the IOKit kernel driver. He documented the journey in a fantastic two-part series on [his Substack](https://maderix.substack.com/p/inside-the-m4-apple-neural-engine), which also sparked a lively [Hacker News thread](https://news.ycombinator.com/item?id=47208573).
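To give a feel for what "runtime introspection" means here: on macOS, the Objective-C runtime lets you load a private framework, look up one of these classes by name, and enumerate its methods without any headers. This is only an illustrative sketch of that general technique, not code from the ANE project — the framework path is an assumption, and the private class names (taken from the article) can change with any macOS release.

```objectivec
// Sketch: look up a private ANE class at runtime and list its methods.
// Assumes macOS and that AppleNeuralEngine.framework exists at this
// (hypothetical) path; nothing here is a stable, supported API.
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

int main(void) {
    @autoreleasepool {
        // The private classes only resolve after their framework is loaded.
        [[NSBundle bundleWithPath:
            @"/System/Library/PrivateFrameworks/AppleNeuralEngine.framework"] load];

        Class cls = NSClassFromString(@"_ANEClient");
        if (!cls) {
            NSLog(@"_ANEClient not found on this system");
            return 1;
        }

        // Ask the runtime for every instance method it knows about.
        unsigned int count = 0;
        Method *methods = class_copyMethodList(cls, &count);
        for (unsigned int i = 0; i < count; i++) {
            NSLog(@"%s", sel_getName(method_getName(methods[i])));
        }
        free(methods);
    }
    return 0;
}
```

From a method list like this you can start guessing selectors and argument types — which is roughly how projects like this map an undocumented stack, one layer at a time.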
The results so far are refreshingly honest. He managed to train a 109M-parameter Llama2-architecture transformer directly on the ANE, getting around 91 ms/step on an M3 Ultra and 106 ms/step on an M4. But he’s upfront that utilization is still low — roughly 5-9% of peak — because many element-wise operations fall back to the CPU. It works, but there’s a long road to making it practical.
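If you're wondering where a number like "5-9% of peak" comes from, it's the usual back-of-envelope transformer estimate: training takes roughly 6 FLOPs per parameter per token (forward plus backward), and you divide achieved FLOPS by the hardware peak. A minimal sketch, with one big caveat: the article doesn't state the batch or sequence size, so the tokens-per-step value below is a placeholder chosen only to show the shape of the calculation.

```python
def ane_utilization(params: float, tokens_per_step: float,
                    step_seconds: float, peak_tflops: float) -> float:
    """Fraction of peak compute, using the ~6*N*T FLOPs-per-step
    estimate for transformer training (forward + backward)."""
    flops_per_step = 6.0 * params * tokens_per_step
    achieved_tflops = flops_per_step / step_seconds / 1e12
    return achieved_tflops / peak_tflops

# Numbers from the article: 109M params, 106 ms/step on M4, 15.8 TFLOPS peak.
# Tokens per step is NOT stated in the article; 128 is purely a placeholder.
util = ane_utilization(109e6, 128, 0.106, 15.8)
print(f"{util:.1%}")  # -> 5.0% under these assumed tokens/step
```

The takeaway is less the exact percentage than the structure: when element-wise ops fall back to the CPU, `step_seconds` balloons while the FLOP count stays fixed, and utilization craters.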
What makes this project interesting isn’t just the technical feat. It’s that it proves on-device ML training on Apple hardware is physically possible without Apple’s blessing. That opens up a whole conversation about what Apple could (or should) expose to developers. The [HN discussion](https://news.ycombinator.com/item?id=47221528) got into exactly that territory, with people debating everything from API stability to whether Apple might eventually offer official training support.
Worth noting: this is independent research done for educational purposes, not affiliated with Apple. The private APIs used could break with any macOS update. But as a proof of concept and a deep dive into how Apple Silicon actually works under the hood, it’s one of the most impressive hardware hacking projects to show up on GitHub in a while.