Volume 1, No. 64 · Wednesday, May 6, 2026 · AI News Daily

The AI Dispatch

“All the AI News That’s Fit to Compile”


Breaking · 11:42 AM PT

Anthropic is renting all of SpaceX’s Colossus 1 supercluster for $15 billion a year, the company announced at its Code with Claude developer conference in San Francisco. A side clause commits both companies to exploring multi-gigawatt orbital AI compute.

Compute

Anthropic Takes All of SpaceX’s Colossus for $15B/Year, Plans Orbital Compute

Full access to 300+ MW of Colossus 1 in Memphis — more than 220,000 NVIDIA H100/H200/GB200 GPUs — plus an unusual side clause to explore multiple gigawatts of space-based AI infrastructure.

Anthropic and SpaceX announced Wednesday morning a compute partnership of a scale and structure that has no clean precedent in the AI industry. Under the agreement — revealed on the keynote stage at Anthropic’s Code with Claude developer conference in San Francisco — Anthropic takes full access to all 300-plus megawatts of the Colossus 1 supercluster in Memphis, comprising more than 220,000 NVIDIA H100, H200, and GB200 GPUs, with a contractual path to expand into Colossus 2 as that facility comes online. The headline number on the deal is $15 billion annually.
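
For a rough sense of scale, the sketch below divides the reported $15 billion annual price and 300 MW power envelope across 220,000 GPUs. It treats the reported floor values ("300-plus" megawatts, "more than 220,000" GPUs) as exact and assumes every GPU is billed around the clock. Both are assumptions for illustration, not contract terms.

```python
# Back-of-envelope math on the reported deal figures.
# Assumptions (not from the announcement): the floor values are treated as
# exact, and every GPU is billed for all 8,760 hours in a year.

ANNUAL_PRICE_USD = 15e9      # reported headline price per year
GPU_COUNT = 220_000          # "more than 220,000" GPUs, taken as a floor
FACILITY_POWER_W = 300e6     # "300-plus" megawatts, taken as a floor
HOURS_PER_YEAR = 365 * 24


def per_gpu_figures(price_usd, gpu_count, power_w):
    """Return (USD per GPU-year, USD per GPU-hour, watts per GPU)."""
    usd_per_gpu_year = price_usd / gpu_count
    usd_per_gpu_hour = usd_per_gpu_year / HOURS_PER_YEAR
    watts_per_gpu = power_w / gpu_count
    return usd_per_gpu_year, usd_per_gpu_hour, watts_per_gpu


if __name__ == "__main__":
    year, hour, watts = per_gpu_figures(ANNUAL_PRICE_USD, GPU_COUNT, FACILITY_POWER_W)
    print(f"~${year:,.0f} per GPU-year")
    print(f"~${hour:.2f} per GPU-hour")
    print(f"~{watts:,.0f} W per GPU")
```

On those assumptions the lease works out to roughly $68,000 per GPU per year, a little under $8 per GPU-hour, and about 1.4 kW of the power envelope per GPU. Because the real GPU count and megawatt figures are both higher than the floors used here, these are ballpark numbers only.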

The deal is structured as an exclusive multi-year compute lease rather than an equity transaction, but its consequences for the industry’s compute league table are immediate. With Colossus 1 in hand, Anthropic’s effective training and inference capacity in the first half of 2026 jumps past Microsoft’s Azure-Anthropic allocation and approaches the combined frontier-model capacity available to xAI itself — a striking inversion given that the Memphis facility was originally built to train xAI’s Grok models. SpaceX, which assumed operational control of the cluster during Elon Musk’s 2025 restructuring of his AI compute holdings, becomes a pure infrastructure landlord for Anthropic for the duration of the contract.

The capacity windfall is being passed to subscribers immediately. Anthropic confirmed that usage caps on Claude Pro and Claude Max plans are lifted effective today, with enterprise tiers getting priority queue access and the largest share of the new compute pool. Pro users who had been throttled during peak hours over the past several weeks, a recurring complaint following the Claude Mythos launch, should see that throttling disappear by end of day.

The most unusual element of the agreement is buried in the partnership’s appendix: both companies have committed to a joint research and engineering effort exploring multi-gigawatt orbital AI compute. The language stops short of announcing a specific launch program, but the framing — “multiple gigawatts of space-based training and inference capacity within the decade” — treats orbital data centers as a near-term industrial roadmap rather than science fiction. SpaceX’s Starlink launch cadence and Anthropic’s capital position make the pairing more credible than the same announcement from any other two companies, though the engineering questions — thermal management, radiation hardening, latency to ground, and the political question of who regulates a compute cluster in orbit — remain entirely unanswered.

For the H1 2026 compute league table, the deal materially shifts the center of gravity. Microsoft’s Azure remains the largest single AI cloud, but the Anthropic-SpaceX-xAI-adjacent triangle now controls a comparable share of frontier training compute — and unlike the Microsoft-OpenAI axis, the new triangle is not bound by a single equity partnership. Anthropic CEO Dario Amodei, asked from the stage whether the deal signals a strategic distancing from the existing Amazon and Google Cloud partnerships, replied that “our compute strategy has always been additive, not exclusive.” The 10-K math will tell the rest of the story later this year.

Policy · White House

White House Floats FDA-Style Pre-Release Vetting for AI Models

National Economic Council Director Kevin Hassett told Fox Business on Wednesday morning that the White House is studying an executive order that would require AI models to pass a safety review before public release — “just like an FDA drug,” in his phrasing. The framing was specific: Hassett cited concerns that Anthropic’s Mythos model could find network vulnerabilities, and pointed to that capability as the kind of pre-release safety question a federal review would address.

Industry Pushback — and an Attempt at Reassurance

The proposal triggered an immediate backlash across the tech industry. The dominant objection: an FDA-style approval process, by design, takes years rather than weeks, and would functionally institutionalize the kind of compliance delay that frontier-model releases cannot absorb. Several venture firms and industry trade groups issued statements within hours warning that such an order would “cede the technological lead to jurisdictions that move faster.”

White House Chief of Staff Susie Wiles attempted to calm the industry response in a brief afternoon statement, saying the administration “does not pick winners and losers” and that any pre-release vetting framework under study is “voluntary in spirit and bounded in scope.” Whether the official order, if issued, will reflect that bounded framing — or Hassett’s broader FDA analogy — remains the open question.

The Voluntary Track Quietly Expands

Separately on Wednesday, the Department of Commerce expanded a pre-existing voluntary pre-release AI testing program to include Google, Microsoft, and xAI. The expansion was announced without ceremony and predates Hassett’s Fox Business appearance, suggesting the administration is pursuing both a soft track (voluntary participation by named labs) and a harder regulatory option in parallel.

Anthropic was not named in the Commerce announcement — a notable omission given that the FDA-style proposal was triggered by concerns about a specific Anthropic model. The juxtaposition reads as an unresolved policy: the administration is simultaneously courting industry cooperation and reserving the option of mandatory review. The May calendar window — before the EU AI Act’s August 2 deadline — will likely force the question.

Dev Conference

Code with Claude: The Conference Inside the Megadeal

Outcomes agentic mode, Dreaming self-correction, and the SMB push: Anthropic uses its annual developer event to unveil the layer above the compute.

Dev Conference

Code with Claude Unveils Outcomes Agentic Mode and Dreaming Self-Correction

At Code with Claude in San Francisco — with subsequent legs scheduled for London and Tokyo — Anthropic debuted Outcomes, a new agentic execution mode that allows developers to define a quality target and run Claude in a loop until the target is met. Outcomes pairs with a companion feature called Dreaming, in which the model produces self-correction passes between iterations rather than only at task completion. The two features formalize the loop-until-good pattern that agent builders have been hand-rolling in prompt scaffolds for months. Virtual attendance was opened globally, a notable break from prior Anthropic events that limited keynote streams to invited press.
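
The loop-until-good pattern described above can be sketched in a few lines. The announcement as reported here includes no API details, so this is a generic illustration of the hand-rolled pattern rather than the Outcomes or Dreaming interface: `run_until_target`, `generate`, `score`, and `revise` are all hypothetical names, and the model calls are replaced by toy stand-ins.

```python
from typing import Callable, Tuple


def run_until_target(
    generate: Callable[[str], str],
    score: Callable[[str], float],
    revise: Callable[[str, str, float], str],
    task: str,
    target: float,
    max_iters: int = 5,
) -> Tuple[str, float, int]:
    """Loop until score(draft) >= target or the iteration budget runs out.

    Returns (best_draft, best_score, iterations_used). All three callables
    are hypothetical stand-ins for model or evaluator calls.
    """
    draft = generate(task)
    best_draft, best_score = draft, score(draft)
    iterations = 1
    while best_score < target and iterations < max_iters:
        # Self-correction pass between iterations, not only at the end.
        draft = revise(task, best_draft, best_score)
        current = score(draft)
        if current > best_score:
            best_draft, best_score = draft, current
        iterations += 1
    return best_draft, best_score, iterations


if __name__ == "__main__":
    # Toy stand-ins: each revision adds one word, and the score is the
    # fraction of a 4-word target reached.
    gen = lambda task: "draft"
    rev = lambda task, draft, s: draft + " better"
    sc = lambda text: min(1.0, len(text.split()) / 4)
    print(run_until_target(gen, sc, rev, "write a summary", target=1.0))
```

The iteration cap matters as much as the target: without `max_iters`, a scorer that never reaches the threshold would loop indefinitely and burn compute.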

SMB

Claude for Small Business Lands With QuickBooks, HubSpot, Canva, DocuSign

Anthropic released Claude for Small Business, a packaged set of connectors and ready-to-run workflows that embed Claude inside QuickBooks, PayPal, HubSpot, Canva, DocuSign, Google Workspace, and Microsoft 365. The product targets mid-market software spend — the same budget bucket that has historically funded incumbent SaaS suites. The Register’s coverage frames the launch in explicit competitive terms, noting Anthropic’s stated intent to replace incumbent software layers with AI-native workflows rather than merely integrate alongside them. The connector list confirms a deliberate breadth: both Google and Microsoft productivity stacks ship on day one.

AWS / MCP

AWS MCP Server Reaches General Availability With IAM Guardrails

AWS announced general availability of the AWS MCP Server, giving AI coding agents — Claude, Codex, Copilot CLI and others — secure, auditable access to every AWS API through a single MCP tool endpoint. The GA build adds three significant capabilities over the preview: sandboxed Python script execution against AWS services (with no local filesystem or shell exposure), IAM-based guardrails enforced on every individual call, and full CloudWatch and CloudTrail logging of all agent actions. The MCP server itself is available at no additional charge; customers pay only for the underlying AWS resources consumed. The pricing posture matches Google’s and Microsoft’s MCP rollouts and effectively concedes the MCP control-plane layer as a non-differentiated commodity.
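
To make the idea of a guardrail enforced on every individual call concrete, here is a minimal toy sketch. It is not the AWS MCP Server's implementation and does not call any AWS API: the `Guardrail` class, its wildcard matching, and the in-memory audit log are hypothetical stand-ins for IAM policy evaluation and CloudTrail logging. It follows the general IAM convention that an explicit deny overrides an allow and that anything not allowed is denied.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List


@dataclass
class Guardrail:
    """Hypothetical per-call policy: explicit deny wins, default is deny."""
    allow: List[str]
    deny: List[str] = field(default_factory=list)
    audit_log: List[dict] = field(default_factory=list)

    def check(self, agent: str, action: str) -> bool:
        denied = any(fnmatchcase(action, pattern) for pattern in self.deny)
        allowed = any(fnmatchcase(action, pattern) for pattern in self.allow)
        decision = allowed and not denied
        # Record every call, permitted or not, so the trail is complete.
        self.audit_log.append({"agent": agent, "action": action, "allowed": decision})
        return decision


if __name__ == "__main__":
    rail = Guardrail(allow=["s3:Get*", "s3:List*"], deny=["s3:GetObjectAcl"])
    for action in ["s3:GetObject", "s3:GetObjectAcl", "ec2:TerminateInstances"]:
        print(action, rail.check("coding-agent", action))
    print(len(rail.audit_log), "calls logged")
```

The design point is that the check sits in the call path itself, so an agent cannot reach an API without producing both a policy decision and a log entry.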

Open Weights

Meta FAIR Releases Code World Model (CWM) — 32B Open Weights, Trained on Execution Traces

Meta’s FAIR lab released CWM, a 32-billion-parameter dense decoder language model trained not only on static source code but on Python interpreter execution traces and agentic Docker repository interactions — in effect a code model with a model of code execution. CWM scores 65.8% on SWE-bench Verified with test-time scaling, 68.6% on LiveCodeBench, and 96.6% on MATH-500. FAIR is releasing weights at three checkpoints — mid-training, supervised fine-tuning, and RL-tuned — explicitly to support open research on world-model-augmented code generation. The release is one of the more methodologically distinctive open-weights drops of the year and signals that the next axis of code-model competition will be execution-grounded training data rather than parameter count.
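
For readers unfamiliar with the term, an execution trace is a record of what a program did line by line, not just what its source says. The sketch below captures one with Python's standard `sys.settrace` hook. It is an illustration of the general idea only, not FAIR's data pipeline or trace format; `trace_locals` and `running_total` are hypothetical names.

```python
import sys
from typing import Any, Callable, List, Tuple

Trace = List[Tuple[int, dict]]


def trace_locals(func: Callable[..., Any], *args: Any) -> Tuple[Any, Trace]:
    """Run func(*args); record (line offset, locals) before each executed line."""
    code = func.__code__
    steps: Trace = []

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore frames that belong to other functions
        if event == "line":
            offset = frame.f_lineno - code.co_firstlineno
            steps.append((offset, dict(frame.f_locals)))  # shallow snapshot
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return result, steps


def running_total(n):
    total = 0
    for i in range(n):
        total += i
    return total


if __name__ == "__main__":
    value, steps = trace_locals(running_total, 3)
    for line_offset, local_vars in steps:
        print(line_offset, local_vars)
    print("result:", value)
```

Each recorded step pairs a source line with the variable state just before that line runs, which is the kind of signal static source code alone cannot provide.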

Benchmarks & Tooling

Verifying the Computer-Use Era

A calibrated desktop-task benchmark from Kentauros AI, a Microsoft + Browserbase framework for scoring open-ended tasks, and a fresh Copilot CLI release with enterprise-managed plugins.

OSUniverse Benchmark Calibrated So SOTA Agents Top Out at 50%

Kentauros AI published OSUniverse: 160 desktop-environment tasks across 5 complexity levels and 9 categories, calibrated so that state-of-the-art computer-use agents score no higher than 50%, while human office workers complete all tasks perfectly. An automated validator scores runs with an error rate under 2%, eliminating the human-judge bottleneck that has slowed previous computer-use benchmarks. The deliberate score ceiling is the design point: the benchmark is intended to have headroom for several years of model progress before saturating.

Microsoft + Browserbase Publish a Universal Verifier Framework

Microsoft Research and Browserbase jointly published “The Art of Building Verifiers for Computer Use Agents,” a four-principle framework for scoring open-ended desktop tasks: non-overlapping rubric criteria, explicitly separated process and outcome rewards, controllable vs. uncontrollable failure modes, and a divide-and-conquer approach to screenshot context for long-horizon trajectories. The framework complements OSUniverse and is positioned as a reference design that benchmark authors and RL training teams can adopt without re-deriving the principles from scratch.
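
Three of the four principles lend themselves to a small data-structure sketch. The code below is one possible reading, not the paper's implementation: `Criterion`, `score_trajectory`, and the `BLOCKED` verdict are hypothetical names. It assumes that failures outside the agent's control are excluded from scoring, and it uses unique criterion names as a weak proxy for non-overlap. The screenshot-context principle is not covered.

```python
from dataclasses import dataclass
from typing import Dict, List

PASS, FAIL, BLOCKED = "pass", "fail", "blocked"  # BLOCKED = uncontrollable failure


@dataclass(frozen=True)
class Criterion:
    name: str
    kind: str  # "process" or "outcome"


def score_trajectory(criteria: List[Criterion], verdicts: Dict[str, str]) -> Dict[str, float]:
    """Return separate process and outcome scores, each in [0, 1]."""
    names = [c.name for c in criteria]
    if len(names) != len(set(names)):
        raise ValueError("criterion names must be unique")
    scores: Dict[str, float] = {}
    for kind in ("process", "outcome"):
        # Assumption: criteria blocked by uncontrollable failures are not counted.
        counted = [c for c in criteria if c.kind == kind and verdicts[c.name] != BLOCKED]
        passed = sum(1 for c in counted if verdicts[c.name] == PASS)
        # With nothing left to count, report 0.0 rather than divide by zero.
        scores[kind] = passed / len(counted) if counted else 0.0
    return scores


if __name__ == "__main__":
    rubric = [
        Criterion("opened_correct_app", "process"),
        Criterion("avoided_destructive_clicks", "process"),
        Criterion("file_saved_with_expected_name", "outcome"),
        Criterion("email_sent", "outcome"),
    ]
    verdicts = {
        "opened_correct_app": PASS,
        "avoided_destructive_clicks": FAIL,
        "file_saved_with_expected_name": PASS,
        "email_sent": BLOCKED,  # for example, the mail server was down
    }
    print(score_trajectory(rubric, verdicts))
```

Keeping the two scores in separate fields, instead of blending them into one number, lets a training team reward a clean process even when the outcome failed, and the reverse.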

Copilot CLI v1.0.43 Ships Enterprise Managed Plugins (Preview)

GitHub Copilot CLI v1.0.43 ships today with public-preview support for enterprise-managed plugins: admins can now define a .github-private plugin marketplace from a single settings.json and auto-distribute skills, hooks, and MCP server configurations to every licensed user. The release also adds server-side auto-routing for model selection, patches a remote-code-execution vulnerability in nested bare-repository handling, and ships a small but welcome /statusline username toggle. The enterprise plugin marketplace is the headline: it is the first first-party answer to the home-grown skill-distribution patterns that have proliferated across Copilot, Claude Code, and Codex CLI users.

GitHub Trending

GitHub Trending — Wednesday Snapshot
| Repo | Language | Stars | What it does |
| --- | --- | --- | --- |
| NousResearch/hermes-agent | TS / Python | ~130K | Self-hosted autonomous AI agent framework; v0.13.0 ships tomorrow with native MCP transport. |
| mattpocock/skills | TypeScript | +44.5K (May) | Curated registry of Claude Code skills; sharpest May trajectory of any agent-skills repo. |
| antoinezambelli/forge | Python | ~91K | Reliability framework for self-hosted LLM tool-calling: retries, validation, and structured outputs. |
| tashfeenahmed/freellmapi | TypeScript | Rising | OpenAI-compatible free-tier proxy aggregating no-cost LLM endpoints behind a single API. |
| zed-industries/zed | Rust | ~83K | High-performance multiplayer code editor with first-class agent integration on the May roadmap. |