Agent Security — June 6, 2026

5 Jun, 2026

runtime-containment
policy-enforcement
windows
agent-identity
prompt-injection
ci-cd

The read

Cheaper agents mean more attack surface. Security moves from policy decks to runtime guardrails — humans still define what must never happen; the harness enforces it.

What moved

Microsoft MXC SDK enforces policy-driven agent containment on Windows and WSL — Windows Developer Blog At Build 2026 Microsoft previewed the Microsoft Execution Containers (MXC) SDK, a cross-platform policy layer that maps developer-defined constraints to isolation primitives at runtime. Early preview ships process isolation (GitHub Copilot CLI adopted it for model-generated code) and session isolation with distinct Entra-backed agent identities; Agent 365 plus Intune/Entra apply per-agent policy. Roadmap adds micro-VM and WSL Linux containers, and Windows Defender scans for prompt injection on the endpoint. Builder angle: Agent harnesses can delegate filesystem and network bounds to MXC instead of inheriting the full user session, with IT pushing the same policy model through Entra and Intune.
Microsoft documents Claude Code GitHub Action secret exfiltration via Read tool bypass — Microsoft Security Blog Microsoft Threat Intelligence found prompt injection in GitHub issue/PR content could steer Claude Code Action to read /proc/self/environ via the in-process Read tool, bypassing Bubblewrap env scrubbing used for Bash and leaking ANTHROPIC_API_KEY. Anthropic patched in Claude Code 2.1.128 by blocking sensitive /proc paths. Microsoft recommends the Agents Rule of Two: never combine untrusted input, secret/tool access, and external write channels in one workflow. Builder angle: CI agents that ingest repo issues must route all file reads through the same scrubbed subprocess boundary as shell tools, and split triage workflows from token-bearing tag-mode runs.
Microsoft AI Red Team publishes agentic failure-mode taxonomy v2.0 with seven new categories — Microsoft Security Blog After 12 months of red-team engagements Microsoft updated its agentic AI failure-mode taxonomy with seven new categories: agentic supply chain compromise, goal hijacking, inter-agent trust escalation, computer-use visual attacks, session context contamination, MCP/plugin abuse, and capability disclosure. Operational data shows HitL bypass and XPIA-plus-memory-poisoning chains at high frequency. New mitigations prescribe agent SBOMs including MCP tool descriptions, cryptographic inter-agent identity, consent-architecture hardening, and adversarial session context tracking. Builder angle: Use the v2.0 matrix as a red-team checklist—especially MCP tool-description poisoning, session contamination, and capability disclosure before shipping production agents.

Also tracking

Cisco AI Defense adds adaptive red teaming and Policy Studio natural-language guardrails — source — Per-agent adaptive red-team objectives and NL Policy Studio guardrails; CI/CD CLI discovers agent dependency graphs including MCP servers and skills.
SafeMCP open-source plugin filters hazardous MCP tools via look-ahead world model — source — BAAI/PKU server-side MCP defense proactively prunes tool sets and fail-safe blocks unsafe calls; code at github.com/wlc2424762917/SafeMCP.