DeepTutor hit 11K GitHub stars — HKU built what Google NotebookLM should have been

Upload a textbook. Get a personal AI tutor that actually understands the material, generates quizzes, builds knowledge graphs, and reasons through problems step by step — with citations pointing back to the exact page.

That’s DeepTutor. Built by HKUDS, the Data Intelligence Lab at the University of Hong Kong, led by Chao Huang. Open-source. Self-hosted. No subscription, no data leaving your machine.

Five days after launch, it crossed 1,400 GitHub stars. Now it’s sitting at 11.3K, ranked #15 on Trendshift with a score of 3,230. In a space dominated by closed platforms like Google NotebookLM and Khan Academy’s Khanmigo, an open-source project pulling this kind of traction says something about what people actually want.

Four Agents, Two Loops, One Answer

Most AI tutoring tools are glorified chatbots with a RAG pipeline bolted on. Ask a hard question, get a mediocre answer stitched together from the top-3 retrieved chunks. DeepTutor takes a different approach — it splits reasoning into two distinct loops.

The Analysis Loop runs an InvestigateAgent that digs through your uploaded documents, paired with a NoteAgent that tracks what’s been found and what’s still missing. Once the investigation is done, the Solve Loop kicks in — a PlanAgent designs the solution strategy, and a ManagerAgent coordinates execution.

This isn’t architectural elegance for its own sake. The dual-loop design lets DeepTutor handle multi-hop questions, the kind where the answer requires connecting facts from chapter 3 and chapter 17, which single-pass RAG systems routinely fumble. Every step comes with precise citations back to your source material, so you can verify each claim rather than take it on trust.
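To make the two-loop idea concrete, here is a minimal toy sketch in Python. It is not DeepTutor’s code: the function names, the `Notes` structure, the keyword lookup standing in for retrieval, and the sample pages are all assumptions made for illustration. What it shows is the shape of the control flow: investigate until nothing is missing (or a round budget runs out), then plan and execute over the collected notes, carrying page citations along.

```python
from dataclasses import dataclass, field

# Hypothetical toy corpus: page number -> text. Not DeepTutor's data model.
PAGES = {
    41: "Enzymes lower activation energy.",
    388: "Activation energy determines reaction rate.",
}

@dataclass
class Notes:
    found: dict = field(default_factory=dict)   # term -> (page, sentence)
    missing: set = field(default_factory=set)   # terms still unresolved

def investigate(term, pages):
    """Return (page, text) for the first page mentioning the term, else None."""
    for page, text in sorted(pages.items()):
        if term.lower() in text.lower():
            return page, text
    return None

def analysis_loop(terms, pages, max_rounds=5):
    """Investigate until nothing is missing or the round budget runs out."""
    notes = Notes(missing=set(terms))
    for _ in range(max_rounds):
        if not notes.missing:
            break
        for term in sorted(notes.missing):
            hit = investigate(term, pages)
            if hit:
                notes.found[term] = hit
        notes.missing -= set(notes.found)
    return notes

def solve_loop(question, notes):
    """Order the findings into a plan, then turn each into a cited step."""
    plan = sorted(notes.found)
    steps = [f"{notes.found[t][1]} [p. {notes.found[t][0]}]" for t in plan]
    return {"question": question, "steps": steps,
            "unresolved": sorted(notes.missing)}

notes = analysis_loop(["enzymes", "reaction rate", "catalyst"], PAGES)
answer = solve_loop("Why do enzymes speed up reactions?", notes)
for step in answer["steps"]:
    print(step)
print("unresolved:", answer["unresolved"])
```

The point of keeping the loops separate is that the solver only starts once the notes say what was found and what is still missing, so a gap such as "catalyst" here is reported rather than papered over.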

The retrieval layer combines standard RAG with LightRAG-powered knowledge graphs. Upload a 500-page textbook, and DeepTutor doesn’t just chunk and embed it — it extracts entities and relationships, maps semantic connections, and builds a navigable graph. That graph becomes a visual study aid on its own, showing you how concepts relate in ways a table of contents never could.
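As a rough illustration of why a graph helps with multi-hop questions, the sketch below builds a tiny graph from hand-written triples and walks it with breadth-first search. It does not use LightRAG or any DeepTutor API; the triples, chapter numbers, and function names are hypothetical, and a real system would extract the entities and relations with a model rather than hard-code them.

```python
from collections import defaultdict, deque

# Hypothetical (subject, relation, object, chapter) triples, as an
# entity-extraction step might produce them. Illustrative data only.
TRIPLES = [
    ("eigenvalue", "defined_by", "characteristic polynomial", 3),
    ("characteristic polynomial", "uses", "determinant", 3),
    ("determinant", "measures", "volume scaling", 17),
]

def build_graph(triples):
    """Adjacency list: subject -> [(relation, object, chapter), ...]."""
    graph = defaultdict(list)
    for subj, rel, obj, chapter in triples:
        graph[subj].append((rel, obj, chapter))
    return graph

def find_path(graph, start, goal):
    """Breadth-first search; returns the list of hops, or None if unreachable."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt, chapter in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt, chapter)]))
    return None

graph = build_graph(TRIPLES)
path = find_path(graph, "eigenvalue", "volume scaling")
chapters = sorted({hop[3] for hop in path})
for subj, rel, obj, chapter in path:
    print(f"{subj} --{rel}--> {obj} (ch. {chapter})")
```

Here the path from "eigenvalue" to "volume scaling" crosses chapters 3 and 17, which is the kind of connection a top-k chunk lookup can miss when no single chunk mentions both ends.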

TutorBots: Agents That Remember You

Here’s where DeepTutor gets genuinely interesting. TutorBots aren’t chat sessions — they’re persistent, multi-instance agents. Each one runs its own agent loop with an independent workspace, memory, and personality. You can spin up a bot for organic chemistry, another for linear algebra, and a third for your thesis literature review. They don’t share context unless you want them to.

Each TutorBot learns your knowledge gaps over time. Ask about the same concept three different ways, and it adjusts its explanations. It generates custom quizzes calibrated to what you’ve struggled with, not generic practice sets. It can also mimic real exam styles: if you’re prepping for a specific course, it builds questions that match the format and difficulty you’d actually face.

Alongside each bot’s own workspace, a memory layer shared across all your TutorBots persists between sessions. Come back two weeks later, and your organic chemistry bot remembers where you left off.
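The per-bot side of this, persistence plus quizzes aimed at weak spots, can be sketched in a few lines. Everything here is an assumption for illustration: the `BotMemory` class, the JSON file layout, and the accuracy-based ranking are not taken from DeepTutor, which may store and score progress quite differently.

```python
import json
import tempfile
from pathlib import Path

class BotMemory:
    """Hypothetical per-bot store: topic -> [correct, attempts], saved as JSON."""

    def __init__(self, workspace, name):
        self.path = Path(workspace) / f"{name}.json"
        # Reload earlier progress if this bot has been used before.
        self.stats = json.loads(self.path.read_text()) if self.path.exists() else {}

    def record(self, topic, correct):
        right, total = self.stats.get(topic, [0, 0])
        self.stats[topic] = [right + int(correct), total + 1]
        self.path.write_text(json.dumps(self.stats))

    def weakest(self, k=2):
        """Topics with the lowest accuracy first: candidates for the next quiz."""
        return sorted(self.stats,
                      key=lambda t: self.stats[t][0] / self.stats[t][1])[:k]

workspace = tempfile.mkdtemp()
chem = BotMemory(workspace, "organic-chemistry")
chem.record("SN2 reactions", False)
chem.record("SN2 reactions", False)
chem.record("chirality", True)
chem.record("chirality", False)
chem.record("nomenclature", True)

# A later session reloads the same file; a different bot starts empty.
chem_later = BotMemory(workspace, "organic-chemistry")
algebra = BotMemory(workspace, "linear-algebra")
print(chem_later.weakest())
print(algebra.stats)
```

Writing to disk on every `record` call is the simplest way to survive a restart; a shared profile across bots would be a second store that every bot reads, which this sketch leaves out.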

How It Stacks Up Against NotebookLM and Khanmigo

Google NotebookLM is the obvious comparison. Both let you upload documents and ask questions about them. But NotebookLM is a closed platform — your data goes to Google, you get Google’s interface, and you’re limited to what Google decides to build. No self-hosting, no customization, no knowledge graph visualization.

Khanmigo, Khan Academy’s AI tutor at $9/month, leans hard on the Socratic method — it asks questions back instead of giving direct answers. Great pedagogical philosophy, but it’s locked to Khan Academy’s content ecosystem. You can’t upload your own materials and build a custom learning path from them.

ChatGPT’s Study Mode, currently the most popular AI study tool by user count, is a general-purpose model wearing a tutor costume. It doesn’t build persistent knowledge representations of your documents, doesn’t generate knowledge graphs, and doesn’t maintain long-term memory of your learning progress.

DeepTutor’s advantage is specificity. It’s not trying to be a general assistant that also tutors. It’s a multi-agent system purpose-built for learning, with document-level understanding that the general-purpose tools can’t match. The trade-off: you need to self-host it, which means Docker, an API key for your LLM of choice, and some comfort with the command line.

Who This Is Actually For

DeepTutor is CLI-native. Every feature works through commands, with structured JSON output for programmatic access and rich terminal rendering for humans. This is not a polished consumer app with onboarding flows and tooltips.

The sweet spot: graduate students drowning in papers, researchers building knowledge bases across dozens of documents, self-learners who want more than a chatbot but don’t want to pay for a platform. If you’re already running Ollama or have an OpenAI API key, setup is straightforward — Docker Compose and you’re live.

The project supports PDF, TXT, and Markdown uploads. It includes web search for supplementing your knowledge base with external information, code execution for technical subjects, and academic paper search for pulling in related research. All of it feeds back into the same dual-loop reasoning system.

11.3K stars on a self-hosted CLI tool for education is unusual. Most trending GitHub projects are developer tools or AI frameworks — things engineers build with. DeepTutor is a tool people actually use to learn. That distinction matters. It suggests the demand for open-source, privacy-respecting AI education tools is far larger than the current market reflects.

Top AI Product
