Onyx Hits 19.7K GitHub Stars — The Open-Source Answer to Glean’s $7.2B Enterprise AI Play

Most enterprise AI platforms want you to pay six figures a year and hand over your data to their cloud. Onyx, the YC-backed open-source project formerly known as Danswer, is betting the opposite approach works better: self-host everything, use any LLM you want, and keep your data behind your own firewall.

The project just crossed 19,700 GitHub stars and landed on GitHub Trending this week, fueled by a rapid-fire release cycle — v3.0.5 dropped on March 25 — and a feature set that now includes MCP support, Deep Research, and 40+ enterprise connectors.

What Onyx Actually Does

Onyx is a self-hostable AI platform that connects to your company’s documents, apps, and people. Think of it as a private ChatGPT that actually knows your organization — but instead of relying on one provider’s LLM, you pick whatever model fits your budget and privacy requirements.

The platform supports OpenAI, Anthropic Claude, Google Gemini, and self-hosted options like Ollama and vLLM. That last part matters a lot for companies operating in regulated industries or air-gapped environments where data cannot leave the premises.
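As a rough illustration of what “self-hosted” means at the wire level, the sketch below builds a request to a local Ollama server’s /api/generate endpoint. Port 11434 is Ollama’s default; the model name is an assumption, and this is a direct call to Ollama rather than a picture of how Onyx itself is configured.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_generate_request(prompt, model="llama3"):
    """Build (but do not send) a non-streaming generate request for a local Ollama server."""
    # The model name is an assumption; use whichever model you have pulled locally.
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3"):
    """Send the request; requires Ollama running locally with the model pulled."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


request = build_generate_request("Summarize our VPN policy in one sentence.")
print(request.full_url)
```

The request never leaves localhost, which is the property that matters in an air-gapped setup.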

At its core, Onyx does three things:

Enterprise search across all your tools. It ingests data from 40+ connectors — Slack, Confluence, Google Drive, Notion, Jira, GitHub, Salesforce, and more — and lets anyone in the company ask questions in natural language. The system respects existing document permissions, so people only see what they’re already authorized to access.

RAG with hybrid search and knowledge graphs. Rather than relying on vector similarity alone, Onyx combines hybrid search (keyword + semantic), contextual retrieval, and LLM-generated knowledge graphs. According to Onyx’s own benchmarks, this approach achieves a 64% win rate against ChatGPT and a 76% win rate against Notion AI across 220,000 workplace documents.

Custom AI agents that take actions. You can build agents with specific instructions, knowledge bases, and tool access. Through MCP (Model Context Protocol) support and native integrations, these agents can actually do things — not just answer questions but interact with your business applications.
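To make the first two points concrete, here is a minimal Python sketch of permission-aware hybrid retrieval. It is not Onyx’s implementation: the Doc type, the group-based access check, and the use of reciprocal rank fusion (one common way to merge keyword and semantic rankings) are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    allowed_groups: frozenset  # hypothetical ACL mirrored from the source tool


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc ids into one fused ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hybrid_search(keyword_ranking, semantic_ranking, docs, user_groups):
    """Fuse both rankings, then drop documents the user cannot access."""
    fused = reciprocal_rank_fusion([keyword_ranking, semantic_ranking])
    return [d for d in fused if docs[d].allowed_groups & user_groups]


docs = {
    "wiki-1": Doc("wiki-1", frozenset({"eng"})),
    "hr-7": Doc("hr-7", frozenset({"hr"})),
    "slack-3": Doc("slack-3", frozenset({"eng", "sales"})),
}
keyword_ranking = ["hr-7", "wiki-1", "slack-3"]   # e.g. from a keyword index
semantic_ranking = ["wiki-1", "slack-3", "hr-7"]  # e.g. from embedding similarity

results = hybrid_search(keyword_ranking, semantic_ranking, docs, frozenset({"eng"}))
print(results)  # ['wiki-1', 'slack-3']
```

A production system would more likely push the access filter into the index query instead of filtering after ranking; the post-filter here just keeps the sketch short.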

The Feature Sprint That Put Onyx on the Map

The release history shows a team shipping at startup speed. In the last few releases alone:

v3.0.5 added memory features with user preferences and structured context, plus a unified search-and-chat interface
v2.12.10 introduced Onyx Craft for document generation and license-based access control
v2.11.4 brought Deep Research replay capabilities, Discord bot integration, and a desktop app
v2.10.7 added MCP custom headers support, image generation (DALL-E), and agent sharing
v2.9.9 launched a Chrome extension, multilingual Deep Research, and improved web crawling with Exa integration

The Deep Research feature is particularly noteworthy. It runs multi-step agentic searches — the kind of thing you’d normally pay Perplexity Enterprise for — but over your own internal documents. The MCP support, added in recent releases, means Onyx agents can now interact with external tools and services using the same protocol that Claude and other AI assistants use.
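For readers who have not looked at MCP: it is built on JSON-RPC 2.0, and a tool invocation is a tools/call request. The sketch below only builds such a message. The tool name and arguments are hypothetical, and it does not show how Onyx itself wires up MCP servers.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "create_ticket" and its arguments are made-up examples, not a real Onyx tool.
message = build_tool_call(1, "create_ticket", {"title": "Renew SSL cert", "priority": "high"})
print(json.dumps(message, indent=2))
```

Because every MCP server accepts the same message shape, an agent platform can call tools from different vendors without a bespoke integration for each one.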

The codebase is 63% Python and 31% TypeScript (Next.js frontend), with 7,149 total commits and 198 open pull requests — signs of an active, fast-moving project.

Onyx vs. the Competition: Where It Fits

The enterprise AI knowledge platform space has gotten crowded fast. Here’s how Onyx stacks up.

Glean ($7.2B valuation, $208M ARR) is the closest proprietary competitor. Glean offers 100+ connectors and has proven itself at Fortune 500 scale, but contracts reportedly run from about $100K to $500K per year, and there is no self-hosted or open-source option. If budget and data control matter to you, Onyx covers much of the same ground at zero software cost.

Microsoft 365 Copilot ($30/user/month) is deeply integrated into the Microsoft ecosystem, which is great if you live in that world. But it’s locked to Microsoft’s LLM choices, doesn’t work outside the M365 stack, and at 500 users, you’re looking at $180K/year just for Copilot licenses on top of existing M365 costs.
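The license math is simple enough to check, using only the list price and headcount quoted above:

```python
def annual_license_cost(users, price_per_user_month):
    """Annual seat cost in dollars: users x monthly price x 12 months."""
    return users * price_per_user_month * 12


copilot_cost = annual_license_cost(users=500, price_per_user_month=30)
print(f"${copilot_cost:,}/year")  # $180,000/year
```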

Perplexity Enterprise has strong web search capabilities but limited internal document support (500-file cap) and far fewer enterprise connectors than either Glean or Onyx.

Among open-source alternatives, Onyx occupies a unique position:

Feature	Onyx	AnythingLLM	LibreChat	PrivateGPT
Enterprise connectors	40+	Few	None	None
SSO / RBAC	Yes	No	Basic	No
Document-level permissions	Yes	No	No	No
Custom agents	Yes	Yes	Yes	No
Air-gap deployment	Yes	Yes	Yes	Yes
License	MIT (Community)	MIT	MIT	Apache 2.0

AnythingLLM is simpler to set up (desktop app, one-click install) but lacks enterprise connectors and access control — better for individuals and small teams. LibreChat is an excellent multi-provider chat UI but isn’t a knowledge platform. PrivateGPT focuses on offline RAG but has no multi-user features or enterprise integrations.

Of the four projects compared here, Onyx is the only one that combines broad connector coverage, enterprise-grade security (SOC 2 Type II, GDPR), and the agent capabilities to actually compete with Glean-class products.

Who Should Care About Onyx

The sweet spot for Onyx is mid-size to large organizations that want enterprise AI but don’t want to write a $200K check to Glean, can’t or won’t send data to third-party clouds, or need LLM flexibility instead of vendor lock-in.

Specific scenarios where Onyx makes the most sense:

Regulated industries. Healthcare, finance, and government organizations that need AI search across internal documents but have strict data residency requirements. Onyx’s air-gap deployment and self-hosted architecture check those boxes.

Multi-tool organizations. If your company uses a mix of Google Workspace, Slack, Jira, Confluence, Salesforce, and GitHub — rather than being all-in on Microsoft — Onyx’s 40+ connectors are a better fit than M365 Copilot.

Cost-conscious enterprises. The community edition is MIT-licensed and free. The enterprise edition adds SSO, advanced RBAC, and dedicated support, but the core platform doesn’t cost anything to run beyond your own infrastructure.

Ramp, the corporate card company, is one notable user — their Director of Product Ops reported the platform answers “thousands of questions a week” with “30x ROI” compared to alternatives they tested.

Deployment: Easier Than You’d Expect

For an enterprise-grade platform, getting Onyx running is surprisingly straightforward. A single curl command handles the basic setup:

curl -fsSL https://onyx.app/install_onyx.sh | bash

For production deployments, Onyx supports Docker Compose, Kubernetes (with Helm charts), Terraform, and cloud-specific guides for AWS EKS and Azure. The documentation covers everything from single-server setups to multi-node clusters handling tens of millions of documents.

If you don’t want to self-host at all, Onyx Cloud offers a hosted version with a free trial — no credit card required.

FAQ

Is Onyx free to use?
The Community Edition is MIT-licensed and completely free. The Enterprise Edition adds features like SSO (OIDC/SAML/OAuth2), advanced role-based access control, and priority support at an undisclosed price. For most teams, the community edition covers the core functionality.

How does Onyx compare to Glean?
Glean has more connectors (100+ vs 40+) and deeper enterprise sales support, but costs $100K+ per year and requires sending your data to Glean’s cloud. Onyx offers self-hosting, LLM flexibility, and an open-source codebase at a fraction of the cost — or free for the community edition.

What LLMs does Onyx support?
Onyx works with OpenAI (GPT-4, GPT-4o), Anthropic (Claude), Google (Gemini), and self-hosted models through Ollama and vLLM. You can switch models at any time without migrating data.

Can Onyx run completely offline?
Yes. Onyx can be deployed in fully air-gapped environments with no internet access, using self-hosted LLMs like Ollama. This makes it suitable for classified or highly regulated environments.

What’s the difference between Onyx and Danswer?
They’re the same project. Danswer rebranded to Onyx, but the codebase, team, and mission remain the same. The GitHub repository is at onyx-dot-app/onyx.
