AI Cybersecurity
-
Microsoft MDASH scores 88.45% on CyberGym, beating Anthropic Mythos and OpenAI GPT-5.5
Microsoft just put a number on the “agent swarm vs single super-model” debate, and the swarm won. MDASH, short for multi-model agentic scanning harness, hit 88.45% on the public CyberGym benchmark, 5.35 points ahead of Anthropic’s Mythos (83.1%) and 6.65 points ahead of OpenAI’s GPT-5.5 (81.8%).
What MDASH actually is
Not a model. A cybersecurity agent…
-
OpenAI Daybreak ships with five top security vendors — a direct shot at Claude Mythos
OpenAI dropped Daybreak on May 12 with Cisco, Cloudflare, CrowdStrike, Palo Alto Networks, and Zscaler signed on as launch partners. That lineup is the actual news. The product is OpenAI’s first dedicated cybersecurity platform: a Codex-Security-based agent that reads your repo, builds an editable threat model, and runs the exploits in a sandbox to…
-
Anthropic Claude Security goes public beta after 500+ CVEs found in closed preview
Anthropic just moved Claude Security from closed preview to public beta. Claude Enterprise gets it first, Team and Max next. Opus 4.7 reads your entire codebase, traces data flows across files and modules, and tells you exactly where you’re bleeding.
What the agent actually does
It lives inside Claude Code on the web and runs…
-
GPT-5.5-Cyber: OpenAI forks a security model with looser guardrails for vetted red teams
OpenAI shipped GPT-5.5-Cyber on May 7, 2026: a fork of GPT-5.5 with the cybersecurity guardrails dialed back. Vetted defenders can have it write proof-of-concept exploits, run attack simulations, and validate vulnerabilities, work that gets a polite refusal in standard ChatGPT.
How the split works
Two tracks, one model family. Standard GPT-5.5 stays the…
-
OpenAI Trusted Access for Cyber opens GPT-5.5 to offensive security work — for verified defenders only
OpenAI is splitting its safety stack. Trusted Access for Cyber is a verified-user tier that unlocks GPT-5.5’s offensive security capabilities (vulnerability research, exploit chain reasoning, red-team payload work) for vetted defenders. Codex is the first surface to ship it. It is the first time a frontier lab has formalized a cyber-permissive track. Vetted users get a…
