AI Platform

What does inference cost, and what platform do I build on?

AI Platform — June 10, 2026

9 Jun, 2026

DeepSeek V4 pricing triggers China-wide AI API price war — Tencent Cloud cuts DeepSeek-V4 hosting 97.5%, Xiaomi cuts MiMo-V2.5 99%; Google's GKE Inferen…

pricing · china · deepseek · tencent-cloud · xiaomi · routing

AI Platform — June 9, 2026

8 Jun, 2026

Cerebras positions Kimi K2.6 at 981 tok/s output — 5.4× faster than Gemini 3.5 Flash with half the TTFT; Google Gemini 2.0 Flash permanently shut down J…

routing · latency · throughput · cerebras · kimi · benchmarks

AI Platform — June 8, 2026

7 Jun, 2026

DigitalOcean ships prefix-aware routing and incoming cached-token pricing, claims up to 4x lower effective compute cost; Anthropic moves Claude Agent SD…

prefix-caching · kv-cache · routing · pricing · billing · agent-sdk

AI Platform — June 7, 2026

6 Jun, 2026

vLLM Semantic Router v0.3 Themis ships SAAR stateful routing with RouterArena #1 ranking at $0.11/1K queries; DigitalOcean Inference Gateway ships prefi…

routing · vllm · agentic · saar · open-source · latency

AI Platform — June 6, 2026

5 Jun, 2026

DigitalOcean Inference Gateway ships prefix-aware routing with 75%+ cache hit rates; GitHub Copilot switches all plans to usage-based AI Credits billing…

prefix-caching · routing · vllm · cost-optimization · pricing · github-copilot

Builder Tooling — June 6, 2026

5 Jun, 2026

Vercel Sandbox Drives add persistent attachable storage for agent workspaces; skills.sh API launches with Vercel OIDC auth for querying 600k+ open-sourc…

vercel-sandbox · persistent-storage · agent-workspace · private-beta · vercel · skills-api

Inference Economics — June 6, 2026

5 Jun, 2026

DigitalOcean Inference Gateway ships prefix-aware routing with 75%+ cache hit rates; GitHub Copilot switches all plans to usage-based AI Credits billing…

prefix-caching · routing · vllm · cost-optimization · pricing · github-copilot
