AI Coding Tools
-
OpenAI Codex in Chrome moves the coding agent into your real browser session
OpenAI just put Codex inside Chrome. The extension runs on macOS and Windows, driving the same tabs, cookies, and signed-in accounts you already have open — no sandbox, no screen takeover. 4 million weekly active users, up 8x since January 2026. Codex went from neat IDE thing to OpenAI’s fastest-growing product in five months. What…
-
AlphaEvolve impact update: 32.5% FlashAttention speedup and a 56-year matrix record in one year
A year after DeepMind shipped AlphaEvolve, they put up the scorecard. 32.5% speedup on FlashAttention kernels. 4×4 complex matrix multiplication in 48 scalar multiplications — first time anyone beat Strassen since 1969. 0.7% of Google’s worldwide compute recovered daily. 10x lower error on Willow quantum circuits. Not a demo. What it actually is An evolutionary…
-
AlphaEvolve year one: 32.5% faster FlashAttention, a fallen Strassen record, now on Google Cloud
Google DeepMind quietly let AlphaEvolve cook inside Google for a year. The receipts just dropped. This isn’t a copilot for autocompleting your for-loops — it’s an evolutionary coding agent that mutates whole codebases, runs them, scores them, and keeps the winners. What it actually did in 12 months A 32.5% speedup on the FlashAttention kernel.…
-
Tilde.run hits 162 on Show HN with a ‘Git for agent runs’ sandbox from the lakeFS team
Coding agents have one nasty habit: they nuke your working directory. Half-applied edits, deleted files, an rm -rf nobody asked for. Tilde.run hit 162 points on Show HN today by treating that as a database problem — every agent run is a transaction. Clean exit commits, crash rolls back, nothing silently overwritten. What it actually…
-
Claude Opus 4.7 goes Wall Street first — 64.4% on Vals Finance, 1M context in Claude Code
Anthropic didn’t drop Claude Opus 4.7 with a blog post. The new flagship model went to a closed-door bank briefing in New York on May 5, alongside a Moody’s data pipe and full Microsoft 365 integration. The message: this model is built for whoever pays the most. The numbers back it up. 64.4% on Vals…
-
DeepSeek-TUI tops GitHub Trending: a Claude Code clone wired to DeepSeek’s API
DeepSeek-TUI hit #1 on GitHub Trending today. 2,389 stars in 24 hours. It’s a Rust-based terminal coding agent built specifically for DeepSeek models — sitting in your shell like Claude Code or Codex CLI, but routing every call to DeepSeek V4 / DeepSeek-Coder through the official API. What you actually get A TUI that reads…
-
OpenAI Codex Pets turn your AI coding agent into a desktop Tamagotchi
OpenAI’s 38th product isn’t a model. It’s eight cartoon pets that live on your screen and tell you what Codex is doing. What it does Codex Pets is a floating animated overlay for the Codex coding app on Windows and macOS. The pet idle-animates while the agent runs, shifts when it’s waiting on input, and…
-
OpenAI Codex CLAUDE.md auto-import makes switching from Claude Code a two-click move
OpenAI just made the most surgical attack yet in the coding-agent war. The Codex desktop app now auto-detects and imports your CLAUDE.md file along with other competing-agent configs at setup. Two clicks and your project is moved. What it actually does Codex is OpenAI’s coding agent — the direct rival to Anthropic’s Claude Code. CLAUDE.md…
-
DeepClaude lets Claude Code run on DeepSeek V4 Pro — $0.87 vs $15 per million tokens
DeepClaude is sitting at 467 points and 179 comments on Hacker News today, and the GitHub repo has crossed 540 stars in a few days. The pitch is one line: keep Claude Code’s agent loop, swap Anthropic’s models for DeepSeek V4 Pro. The bill drops about 17x. What it actually is A self-hosted proxy. You…
-
GPT-5.5 takes back the coding crown from Claude Opus 4.7
OpenAI shipped GPT-5.5 on April 23, 2026, with a Pro variant a day later. Biggest model bump in six weeks, and the benchmarks aren’t subtle: 84.9% on GDPval, 78.7% on OSWorld-Verified, 98.0% on Tau2-bench Telecom (no prompt tuning), 82.7% on Terminal-Bench 2.0, 51.7% on FrontierMath 1-3. Translation: best-in-class coding, computer use, and agent workflows —…
