Microsoft open-sourced two tools aimed at the unglamorous side of building agents: knowing whether they’re safe, and whether you should build them at all. RAMPART handles the first; Clarity handles the second.
## Red-team findings that don’t evaporate
RAMPART is an agent test framework that lets you encode adversarial and benign scenarios as repeatable tests you can run in CI. The point is durability: a red-team finding or a real AI incident usually gets fixed once and forgotten. RAMPART turns those into lasting regression coverage, so the same failure can’t quietly come back three releases later. It treats agent safety the way mature software treats bugs — write a failing test, fix it, keep the test forever.
## Deciding what to build
Clarity is the upstream tool: a structured sounding board that helps a team work out whether they’re building the right thing before writing a line of code. For agentic projects, where it’s easy to ship an impressive demo that solves the wrong problem, that pre-build gut-check has real value.
## Why it matters
Agent reliability is moving from vibes to engineering discipline. CI-based regression testing for adversarial behaviour is exactly the kind of boring infrastructure that decides whether agents are safe to ship — and open-sourcing it pushes the practice toward becoming a default, not a luxury.

Leave a comment