AI Developer Tools & SDKs
-
Web Speed kills the token tax for Claude and Gemini agents, claims 90% cost cut
Production web agents have a dirty secret. Most of their token budget goes to parsing bloated HTML (scripts, tracking pixels, ad divs) before the model sees the actual content. Web Speed, launched on Product Hunt on May 9 by Dominic Pi-Sunyer, attacks this head-on: a logic layer between the agent and the web…
-
Entire (by Thomas Dohmke) + Checkpoints: $60M seed bets Git wasn’t built for AI agents
Thomas Dohmke left GitHub in August 2025. Six months later he’s back with Entire: a $60M seed at a $300M valuation, the largest dev-tool seed on record. Felicis led, with Madrona, M12, Basis Set, Jerry Yang, and Datadog’s Olivier Pomel piling in. What Entire actually is: a new dev platform built around one thesis: Git…
-
Prime Intellect Lab hits GA: per-token RL training across 14 models
Prime Intellect flipped Lab from beta to GA on May 7. It’s a full-stack platform for training self-improving agents: define a task, write a harness, evaluate, run RL on the reward signal, inspect rollouts, deploy a LoRA adapter, serve inference. All inside one product. Nobody else has commercialized this loop end-to-end; until now you…
-
ByteDance UI-TARS-desktop scores 61.6% on ScreenSpot Pro, leaving GPT-4o and Claude behind
ByteDance’s UI-TARS-desktop pulled 656 stars yesterday and now sits at 31.9k on GitHub, and everyone’s calling it the open-source answer to OpenAI’s Operator. What it actually is: two pieces. Agent TARS, a multimodal agent stack you run in a terminal or browser, or embed in your product. And UI-TARS Desktop, a native app that hands your computer…
-
Airbyte Agents (Context Store) launches with 50 connectors and 75-90% token savings vs vendor MCPs
Airbyte spent eight years moving data into warehouses. Now they’re moving it into agents. Airbyte Agents is a managed Context Store that pre-replicates and pre-indexes data from Salesforce, Zendesk, Slack, Linear, Jira, Gong and 50+ other SaaS sources. Instead of an agent hitting a vendor MCP and burning thousands of tokens parsing raw API responses,…
-
Google ships Gemini API File Search Multimodal RAG — page-level citations and Embedding 2 baked in
Google extended its Gemini API File Search tool with the three things every production RAG team has been begging for: native multimodal indexing, custom metadata filtering at query time, and page-level citations. PDFs, scientific imagery, and plain text now live in one searchable index, powered by Gemini Embedding 2. The post hit the Hacker News…
-
Anthropic Claude Dreaming lets agents rewrite their own memory — Harvey saw 6x task completion
Anthropic shipped Dreaming on May 6, a research preview inside Claude Managed Agents. The pitch is blunt: stop letting your agents repeat the same mistake forever. What Dreaming actually does: between live sessions, the agent goes async and chews through its own past (transcripts plus the memory store). It pulls out patterns that survive…
-
Gemini 3.1 Flash-Lite hits GA: $0.25/M input tokens, 2.5x faster TTFT
Google pushed Gemini 3.1 Flash-Lite to General Availability on May 7. It’s the cheapest, fastest model in the Gemini 3 family, and the most interesting one for anyone running real production traffic. What it is: a lightweight LLM API, not a consumer product. Pricing is $0.25 per million input tokens and $1.50 per million…
-
agentmemory (rohitg00) hits 95.2% on LongMemEval — 27 points above Mem0
Coding agents forget. Close Cursor, reopen tomorrow, re-explain the project. agentmemory fixes that. It’s a TypeScript memory layer climbing GitHub trending fast: +518 stars yesterday, sitting at 3,196. What it actually is: a persistent memory server that drops into Claude Code, Cursor, Gemini CLI, OpenCode, or any MCP client. All agents share one memory. It…
-
Mojo 1.0 Beta lands after 3 years: one codebase from CPU to GPU, no CUDA
Modular shipped Mojo 1.0.0b1 on May 7, the first 1.0-track release of the AI-native language Chris Lattner has been building since 2023. Hacker News put it on the front page within hours: 239 points, 164 comments. Lattner is the same person behind LLVM, Clang, and Swift, so the AI infra crowd actually listens when he…
