Skip to content
Go back

Builder's Daily / Agent Security

Agent Security — June 6, 2026

What must I implement to run agents safely?

The read

Cheaper agents mean more attack surface. Security moves from policy decks to runtime guardrails — humans still define what must never happen; the harness enforces it.

What moved

  • Microsoft MXC SDK enforces policy-driven agent containment on Windows and WSLWindows Developer Blog At Build 2026 Microsoft previewed the Microsoft Execution Containers (MXC) SDK, a cross-platform policy layer that maps developer-defined constraints to isolation primitives at runtime. Early preview ships process isolation (GitHub Copilot CLI adopted it for model-generated code) and session isolation with distinct Entra-backed agent identities; Agent 365 plus Intune/Entra apply per-agent policy. Roadmap adds micro-VM and WSL Linux containers, and Windows Defender scans for prompt injection on the endpoint. Builder angle: Agent harnesses can delegate filesystem and network bounds to MXC instead of inheriting the full user session, with IT pushing the same policy model through Entra and Intune.

  • Microsoft documents Claude Code GitHub Action secret exfiltration via Read tool bypassMicrosoft Security Blog Microsoft Threat Intelligence found prompt injection in GitHub issue/PR content could steer Claude Code Action to read /proc/self/environ via the in-process Read tool, bypassing Bubblewrap env scrubbing used for Bash and leaking ANTHROPIC_API_KEY. Anthropic patched in Claude Code 2.1.128 by blocking sensitive /proc paths. Microsoft recommends the Agents Rule of Two: never combine untrusted input, secret/tool access, and external write channels in one workflow. Builder angle: CI agents that ingest repo issues must route all file reads through the same scrubbed subprocess boundary as shell tools, and split triage workflows from token-bearing tag-mode runs.

  • Microsoft AI Red Team publishes agentic failure-mode taxonomy v2.0 with seven new categoriesMicrosoft Security Blog After 12 months of red-team engagements Microsoft updated its agentic AI failure-mode taxonomy with seven new categories: agentic supply chain compromise, goal hijacking, inter-agent trust escalation, computer-use visual attacks, session context contamination, MCP/plugin abuse, and capability disclosure. Operational data shows HitL bypass and XPIA-plus-memory-poisoning chains at high frequency. New mitigations prescribe agent SBOMs including MCP tool descriptions, cryptographic inter-agent identity, consent-architecture hardening, and adversarial session context tracking. Builder angle: Use the v2.0 matrix as a red-team checklist—especially MCP tool-description poisoning, session contamination, and capability disclosure before shipping production agents.

Also tracking

  • Cisco AI Defense adds adaptive red teaming and Policy Studio natural-language guardrailssource — Per-agent adaptive red-team objectives and NL Policy Studio guardrails; CI/CD CLI discovers agent dependency graphs including MCP servers and skills.
  • SafeMCP open-source plugin filters hazardous MCP tools via look-ahead world modelsource — BAAI/PKU server-side MCP defense proactively prunes tool sets and fail-safe blocks unsafe calls; code at github.com/wlc2424762917/SafeMCP.
Share this post on: