Builder's Daily / Agent Security
Agent Security — June 6, 2026
What must I implement to run agents safely?
- runtime-containment
- policy-enforcement
- windows
- agent-identity
- prompt-injection
- ci-cd
The read
Cheaper agents mean more attack surface. Security moves from policy decks to runtime guardrails — humans still define what must never happen; the harness enforces it.
What moved
- Microsoft MXC SDK enforces policy-driven agent containment on Windows and WSL — Windows Developer Blog
  At Build 2026, Microsoft previewed the Microsoft Execution Containers (MXC) SDK, a cross-platform policy layer that maps developer-defined constraints to isolation primitives at runtime. The early preview ships process isolation (which GitHub Copilot CLI has adopted for model-generated code) and session isolation with distinct Entra-backed agent identities; Agent 365, together with Intune and Entra, applies per-agent policy. The roadmap adds micro-VM and WSL Linux containers, and Windows Defender scans for prompt injection on the endpoint.
  Builder angle: Agent harnesses can delegate filesystem and network bounds to MXC instead of inheriting the full user session, while IT pushes the same policy model through Entra and Intune.
- Microsoft documents Claude Code GitHub Action secret exfiltration via Read tool bypass — Microsoft Security Blog
  Microsoft Threat Intelligence found that prompt injection in GitHub issue or PR content could steer Claude Code Action into reading /proc/self/environ through the in-process Read tool. That path bypassed the Bubblewrap environment scrubbing applied to Bash and leaked ANTHROPIC_API_KEY. Anthropic patched the issue in Claude Code 2.1.128 by blocking sensitive /proc paths. Microsoft recommends the Agents Rule of Two: never combine untrusted input, secret/tool access, and external write channels in one workflow.
  Builder angle: CI agents that ingest repo issues must route all file reads through the same scrubbed subprocess boundary as shell tools, and split triage workflows from token-bearing tag-mode runs.
- Microsoft AI Red Team publishes agentic failure-mode taxonomy v2.0 with seven new categories — Microsoft Security Blog
  After 12 months of red-team engagements, Microsoft updated its agentic AI failure-mode taxonomy with seven new categories: agentic supply chain compromise, goal hijacking, inter-agent trust escalation, computer-use visual attacks, session context contamination, MCP/plugin abuse, and capability disclosure. Operational data shows that human-in-the-loop (HitL) bypass and chains combining cross-prompt injection attacks (XPIA) with memory poisoning occur at high frequency. New mitigations prescribe agent SBOMs that include MCP tool descriptions, cryptographic inter-agent identity, consent-architecture hardening, and adversarial session context tracking.
  Builder angle: Use the v2.0 matrix as a red-team checklist before shipping production agents, especially for MCP tool-description poisoning, session contamination, and capability disclosure.
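The MXC item above describes mapping developer-defined constraints to filesystem and network bounds. As a rough illustration of that idea, and not of the MXC SDK itself (its API is not shown in this digest), the sketch below uses a hypothetical `ContainmentPolicy` class with deny-by-default checks. Every name in it is an assumption made for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from urllib.parse import urlsplit


@dataclass(frozen=True)
class ContainmentPolicy:
    """Hypothetical policy shape for illustration; not the MXC SDK schema."""

    writable_roots: tuple[Path, ...]
    allowed_hosts: frozenset[str]

    def allows_write(self, target: str) -> bool:
        # Resolve symlinks and ".." first so a path cannot escape an
        # allowed root through indirection.
        resolved = Path(target).resolve()
        return any(
            resolved.is_relative_to(root.resolve())
            for root in self.writable_roots
        )

    def allows_request(self, url: str) -> bool:
        # Deny by default: only HTTPS to an exactly matching host passes.
        parts = urlsplit(url)
        return parts.scheme == "https" and parts.hostname in self.allowed_hosts
```

A check like this is only a policy decision point inside the harness. Enforcement still has to happen at an isolation boundary the agent cannot step around, which is the role the item assigns to the container layer.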
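For the Read tool bypass above, one way to picture the two recommendations is a path guard that confines reads to the workspace after resolving symlinks, plus a Rule of Two check on a workflow's declared properties. This is a minimal sketch under stated assumptions: `is_read_allowed`, `WorkflowProfile`, and `violates_rule_of_two` are hypothetical names, and the guard is an allowlist variant rather than a reproduction of Anthropic's patch, which the item describes as blocking sensitive /proc paths.

```python
import os
from dataclasses import dataclass


def is_read_allowed(requested: str, workspace: str) -> bool:
    """Allow a read only if the fully resolved path stays inside the workspace."""
    # realpath follows symlinks, so a link inside the workspace that points
    # at /proc/self/environ resolves to its real target before the check.
    real = os.path.realpath(requested)
    root = os.path.realpath(workspace)
    try:
        # commonpath compares whole components, so "/work2" is not
        # mistaken for a child of "/work".
        return os.path.commonpath([real, root]) == root
    except ValueError:
        # Raised for paths that cannot be compared (e.g. different drives).
        return False


@dataclass(frozen=True)
class WorkflowProfile:
    """Hypothetical self-declared properties of one CI workflow."""

    reads_untrusted_input: bool
    holds_secrets_or_tools: bool
    writes_externally: bool


def violates_rule_of_two(profile: WorkflowProfile) -> bool:
    # The rule is broken only when all three properties are combined.
    return (
        profile.reads_untrusted_input
        and profile.holds_secrets_or_tools
        and profile.writes_externally
    )
```

An allowlist is used here because a denylist of known-sensitive paths has to anticipate every leak source, while confining reads to the workspace fails closed for paths nobody thought of.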
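The taxonomy item above mentions agent SBOMs that include MCP tool descriptions. One possible reading, sketched here as an assumption rather than as Microsoft's prescribed format, is to pin a hash of each tool's model-visible fields and flag drift before a run. The function names are hypothetical; the `name`, `description`, and `inputSchema` keys follow the tool definition fields in the MCP specification.

```python
import hashlib
import json


def fingerprint_tool(tool: dict) -> str:
    """SHA-256 over the fields a model sees for one tool."""
    visible = {key: tool.get(key) for key in ("name", "description", "inputSchema")}
    # Sorted keys and fixed separators keep the hash stable across runs.
    canonical = json.dumps(visible, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_manifest(tools: list[dict]) -> dict[str, str]:
    """Map each tool name to its fingerprint."""
    return {tool["name"]: fingerprint_tool(tool) for tool in tools}


def diff_manifest(pinned: dict[str, str], live: dict[str, str]) -> dict[str, list[str]]:
    """Report tools added, removed, or changed relative to the pinned manifest."""
    shared = pinned.keys() & live.keys()
    return {
        "added": sorted(live.keys() - pinned.keys()),
        "removed": sorted(pinned.keys() - live.keys()),
        "changed": sorted(name for name in shared if pinned[name] != live[name]),
    }
```

A harness could build the pinned manifest at review time, rebuild it from the live tool list at startup, and refuse to run when any of the three lists is non-empty. This detects a changed description; it does not judge whether the original description was safe.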
Also tracking
- Cisco AI Defense adds adaptive red teaming and Policy Studio natural-language guardrails — source — Adds per-agent adaptive red-team objectives and natural-language Policy Studio guardrails; a CI/CD CLI discovers agent dependency graphs, including MCP servers and skills.
- SafeMCP open-source plugin filters hazardous MCP tools via look-ahead world model — source — A server-side MCP defense from BAAI and PKU that proactively prunes tool sets and blocks unsafe calls as a fail-safe; code at github.com/wlc2424762917/SafeMCP.