---
title: "Agent Reliability — June 10, 2026"
description: "Perplexity launches 'Search as Code': agents write Python to compose retrieval, rerank, and dedup primitives directly; LlamaParse adds word/line/cell-le…"
canonical_url: "https://artificialcuriositylabs.ai/daily/agent-reliability/2026-06-10/"
md_url: "https://artificialcuriositylabs.ai/daily/agent-reliability/2026-06-10.md"
published_at: "2026-06-10T00:00:00.000Z"
beat: "agent-reliability"
topics:
  - "retrieval-architecture"
  - "agentic-search"
  - "reranking"
  - "cost"
  - "ingestion"
  - "parsing"
---

## The read

You cannot run what you cannot see. Grounding is institutional context encoded in retrieval; observability is electricity-metering for agent loops; security moves from policy decks to runtime guardrails. Reliability is the stack between "it works in demo" and "it runs in prod."

## What moved

- **Perplexity launches 'Search as Code': agents write Python to compose retrieval, rerank, and dedup primitives directly** — [Perplexity Research](https://research.perplexity.ai/articles/rethinking-search-as-code-generation)
  Perplexity replaced its sequential function-calling search loop with 'Search as Code' (SaC): models generate task-specific Python that runs in sandboxes and calls an Agentic Search SDK exposing atomic primitives (retrieval, ranking, filtering, deduplication). On a CVE-advisory task this cut token usage by 85% (288.7K to 42.9K tokens). SaC also posted gains of +29% on DSQA and +45% on a new WANDR benchmark, with medium-reasoning SaC beating all non-SaC systems at under $1/task. It is rolling out now in Perplexity Computer and the Agent API. **Builder angle:** Composable, code-level retrieval/rerank/dedup primitives are now available in place of fixed search endpoints, enabling per-task retrieval strategies at a fraction of the token cost of loop-based agentic search.

- **LlamaParse adds word/line/cell-level bounding boxes for audit-grade citation grounding** — [LlamaIndex Blog](https://www.llamaindex.ai/blog/announcing-granular-bounding-boxes-in-llamaparse)
  LlamaParse now supports an opt-in `output_options.granular_bboxes` parameter to return word-, line-, or cell-level coordinates instead of coarse layout-level boxes. The system applies coordinates only to text explicitly present on the page (not inferred values or AI summaries), enabling exact-location citations for dense documents like financial filings and tables. Available across paid tiers, with Agentic Plus adding extra verification passes. **Builder angle:** RAG pipelines can now ground citations to a specific word or table cell rather than highlighting a whole page or paragraph, closing a gap for compliance and financial-document agents that need audit-grade provenance.

- **Arize: Microsoft's open trust stack makes OpenInference the shared trace contract linking ASSERT evals, ACS runtime controls, and Phoenix/Arize AX** — [Arize Blog](https://arize.com/blog/microsoft-open-trust-stack-openinference/)
  At Build 2026 Microsoft introduced ASSERT, an MIT-licensed, spec-driven agent evaluation and regression-testing framework that turns behavior specs into test cases and graded traces, and the Agent Control Specification (ACS), a portable runtime-guardrail standard with checkpoints at input, LLM call, state, tool execution, and output. Both standardize on OpenInference, the OpenTelemetry-for-AI standard Arize created (33+ framework integrations, two-line instrumentation): ASSERT reads OpenInference spans as judge evidence, ACS emits its control decisions as spans, and the same trace stream feeds Phoenix or Arize AX for production monitoring. **Builder angle:** One OpenInference instrumentation pass now feeds CI eval gates (ASSERT), runtime guardrails (ACS), and production observability (Phoenix/Arize AX) without separate re-instrumentation per tool.
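
To make the 'Search as Code' item above concrete, here is a minimal sketch of what a model-generated, task-specific search program might look like. Everything in it is an assumption for illustration: `retrieve`, `rerank`, `dedup`, the `Doc` type, and the toy corpus are hypothetical stand-ins, not the Agentic Search SDK's actual API, and the term-overlap scoring is a placeholder for a real ranker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    url: str
    text: str


# Toy in-memory corpus standing in for a real search backend (invented data).
CORPUS = [
    Doc("https://example.com/a", "CVE advisory for libfoo buffer overflow"),
    Doc("https://example.com/b", "CVE advisory for libfoo buffer overflow"),  # same text as /a
    Doc("https://example.com/c", "libfoo release notes and changelog"),
    Doc("https://example.com/d", "gardening tips for spring"),
]


def _terms(s):
    return set(s.lower().split())


def retrieve(query, corpus=CORPUS):
    """Hypothetical retrieval primitive: keep docs sharing any term with the query."""
    q = _terms(query)
    return [d for d in corpus if q & _terms(d.text)]


def rerank(query, docs):
    """Hypothetical rerank primitive: sort by term overlap with the query, highest first."""
    q = _terms(query)
    return sorted(docs, key=lambda d: len(q & _terms(d.text)), reverse=True)


def dedup(docs):
    """Hypothetical dedup primitive: drop docs whose text has already been seen."""
    seen, out = set(), []
    for d in docs:
        if d.text not in seen:
            seen.add(d.text)
            out.append(d)
    return out


def cve_advisory_search(product):
    """Task-specific composition: fan out queries, merge, dedup, then rerank once."""
    queries = [f"{product} CVE advisory", f"{product} release notes"]
    merged = [d for q in queries for d in retrieve(q)]
    return rerank(f"{product} CVE advisory", dedup(merged))[:2]


results = cve_advisory_search("libfoo")
print([d.url for d in results])
```

The point of the sketch is the shape of `cve_advisory_search`: the whole fan-out, merge, dedup, and rerank strategy runs as one program, so intermediate results never have to pass back through the model as they would in a step-by-step function-calling loop.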

## Also tracking

- **Sedai launches autonomous AI Agent Optimization platform with real-time per-team/per-model cost attribution and AI-judge-based routing** — [source](https://cioinfluence.com/machine-learning/sedai-launches-the-first-autonomous-platform-for-ai-agent-optimization/) — Drop-in layer for per-team/per-model token-cost attribution and automated cost-aware model routing across providers without re-instrumenting agent code.
- **Zscaler launches AI Broker, AI Access Graph, and Endpoint AI Security to govern agent identity and MCP/A2A traffic** — [source](https://www.zscaler.com/press/zscaler-unveils-new-product-innovations-secure-agentic-ai) — Gives a concrete pattern for scoping which MCP/A2A tools an agent can reach per identity and tracking data lineage in real time — a deployable access-control and audit layer for agent fleets.
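
The per-team/per-model cost attribution mentioned in the Sedai item above reduces, at its simplest, to grouping token usage by a (team, model) key and pricing it. The sketch below assumes invented usage records, model names, and per-million-token prices; it says nothing about how Sedai's platform actually implements attribution or routing.

```python
from collections import defaultdict

# Hypothetical prices in USD per million tokens; real prices vary by provider and model.
PRICE_PER_MILLION_USD = {"model-small": 1.0, "model-large": 10.0}

# Invented usage records, e.g. as they might be extracted from agent traces.
USAGE = [
    {"team": "search", "model": "model-large", "tokens": 500_000},
    {"team": "search", "model": "model-small", "tokens": 2_000_000},
    {"team": "support", "model": "model-large", "tokens": 250_000},
    {"team": "search", "model": "model-large", "tokens": 500_000},
]


def attribute_costs(records, prices):
    """Sum USD cost per (team, model) pair."""
    costs = defaultdict(float)
    for r in records:
        costs[(r["team"], r["model"])] += r["tokens"] / 1_000_000 * prices[r["model"]]
    return dict(costs)


def rollup_by_team(costs):
    """Collapse per-(team, model) costs into per-team totals."""
    totals = defaultdict(float)
    for (team, _model), usd in costs.items():
        totals[team] += usd
    return dict(totals)


costs = attribute_costs(USAGE, PRICE_PER_MILLION_USD)
team_totals = rollup_by_team(costs)
print(costs)
print(team_totals)
```

Keeping the (team, model) key at the finest grain and rolling up afterwards means the same records can answer both "which team is spending" and "which model is driving the spend".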