Volume 1, No. 51 Thursday, April 23, 2026 Daily Edition

The AI Dispatch

“All the AI News That’s Fit to Compile”


Frontier Model Release

OpenAI Ships GPT-5.5, Its First Fully Retrained Frontier Since GPT-4.5

Six weeks after GPT-5.4, OpenAI rolls out a new base model tuned for minimal-instruction agentic work — claiming 82.7% on Terminal-Bench 2.0 and 84.9% on GDPval.

OpenAI released GPT-5.5 on Thursday, its first fully retrained base model since GPT-4.5 and the most aggressive push yet into what the company describes as “minimal-instruction agentic work.” The model posts 82.7% on Terminal-Bench 2.0 and 84.9% on GDPval, besting Gemini 3.1 Pro and Claude Opus 4.6 on OpenAI’s internal benchmarks for long-horizon, tool-using workloads.

The rollout reaches Plus, Pro, Business, and Enterprise ChatGPT users immediately, with API access coming “very soon,” per the company’s launch note. Arriving only six weeks after GPT-5.4, the release cadence underscores how the frontier race has compressed from quarterly drops to multi-week iterations, with each retraining run burning a fresh slice of compute and capital.

Analysts at Fortune noted that OpenAI pitched the release as the enabling model for a forthcoming ChatGPT “superapp” tier that bundles workspace agents, image generation, and memory into a single consumer surface. MarkTechPost highlighted the Terminal-Bench 2.0 number — a benchmark measuring agent completion of multi-hour terminal tasks — as the clearest signal that OpenAI is prioritizing autonomous coding capability over raw chat eloquence. The timing puts direct pressure on Google, whose Gemini 3 Flash benchmarks from Wednesday’s Cloud Next keynote are now less than 24 hours old.

Security Brief

Unauthorized users still have live access to Anthropic’s Mythos cybersecurity model as of this writing. The company says core systems remain untouched, but the breach exposes cracks in Project Glasswing, its 40-company security consortium.

Project Glasswing

Anthropic Probes Mythos Leak as Discord Group Keeps Access to Cyber Model

A third-party contractor portal let outsiders guess the restricted model’s endpoint — and they’ve had continuous access to a tool designed to discover zero-days across every major OS and browser.

Anthropic confirmed Thursday that it is investigating a breach of Claude Mythos Preview, the restricted-access cybersecurity model designed to discover zero-day vulnerabilities across Windows, macOS, Linux, iOS, Android, Chrome, and Safari. According to Fortune’s reporting, a Discord group located the model’s endpoint by guessing its URL structure based on leaks about Anthropic’s past infrastructure, then entered through a third-party contractor portal that had weaker auth than the core consortium gateway.

The group has reportedly had continuous access since the model launched, and — as of the SC Media update published Thursday morning — still does. Anthropic says no core systems were affected, but TechRadar notes that the breach raises acute questions about Project Glasswing, the exclusive consortium of 40 companies — including Microsoft, Apple, Google, and NVIDIA — that gate Mythos access through a tiered vendor-integration layer.

The episode marks the first publicly reported compromise of a frontier cyber-offense model and lands while governments from Brussels to Washington are debating export controls on dual-use AI. It also reopens the long-running debate inside the AI safety community over whether a closed cybersecurity model — even one built to defend — inevitably becomes an asymmetric weapon if its perimeter is not hermetic. TechCrunch first reported the story on April 21; Thursday’s Fortune follow-up included Anthropic’s official response and CEO Dario Amodei’s internal memo to staff.

Labor & The AI Pivot

The Labor Reckoning

In a single day, Meta cut 8,000 jobs and Microsoft launched the first voluntary retirement program in its 51-year history — while Sam Altman himself questioned how much of this is really about AI.

The Same-Day Story

17,000 Tech Workers Absorb the AI-Capex Tradeoff in One News Cycle

Meta notified roughly 8,000 employees — about 10% of its workforce — of layoffs effective May 20, citing the need to “offset investments” as it scales AI spending toward $135 billion in 2026. Hours later, Microsoft announced its first-ever voluntary retirement buyout, targeting up to 7% of its US workforce (~8,750 employees) who meet an age-plus-tenure threshold of 70 or more. The twin announcements crystallize a pattern analysts have been tracing all quarter: Big Tech is explicitly trading headcount for AI capital, with Meta still leaving 6,000 open roles unfilled and Microsoft offering separation packages rich enough that internal memos describe them as “career graduation” rather than cuts.

“Some AI washing where people are blaming AI for layoffs that they would otherwise do anyway.” — Sam Altman, OpenAI CEO, on the Q1 layoff narrative

The AI-Washing Debate

Q1 Tech Layoffs Top 78,000 — But Is AI the Real Cause?

Q1 2026 tech layoffs crossed 78,000 workers, with nearly 48% officially attributed to AI automation. But a growing chorus — including Sam Altman — argues companies are using AI as cover for ordinary cost cuts. HR leaders writing in Fortune warn that the framing could backfire by damaging trust and inviting regulatory scrutiny. Snap’s April 15 elimination of 1,000 jobs (16% of its workforce), with AI prominently cited in CEO messaging, has become the leading case study.

Public Sentiment

Stanford AI Index: Gen Z Flips From Excited to Angry in a Single Year

Stanford HAI’s annual AI Index, which published further 2026 analysis this week, records the widest expert-public gap in the report’s history: 73% of AI experts say AI will positively affect employment, against just 23% of the American public. Gen Z sentiment moved sharply — those calling themselves “excited” dropped from 36% to 22%, while “angry” climbed from 22% to 31%. US trust in government AI regulation sits at just 31%, the lowest of any nation surveyed.

The Agent Economy

Agents, Chips & Capital

Cognition’s Devin targets a $25B valuation; SpaceX discloses in-house GPU plans; Freshfields commits its 5,700 lawyers to Claude; Google pushes Deep Research Max into MCP territory.

Venture

Cognition Eyes $25B Valuation as Devin Round Jumps 2.5x in Seven Months

Bloomberg reported Thursday that Cognition AI — maker of the Devin autonomous coding agent — is in talks to raise hundreds of millions at a $25 billion valuation, up from $10.2 billion in September 2025. The 2.5x markup in under seven months reflects surging enterprise demand for autonomous SWE agents, with Cognition competing directly against Cursor, OpenAI Codex, and GitHub Copilot Workspace. The round signals where venture capital is now pricing the agentic-coding infrastructure layer heading into mid-2026.

Chip Supply

SpaceX Tells IPO Investors It Plans Its Own GPUs Via TeraFab JV

SpaceX’s draft IPO filing, reviewed by Reuters and surfaced Thursday, says the company intends to manufacture its own GPUs and warns investors of “substantial capital expenditures.” The move ties to TeraFab, a March-announced semiconductor JV between SpaceX, xAI, Tesla, and Intel targeting more than 1 terawatt of annual AI compute capacity at full scale. Elon Musk told investors that buying every available chipset on the market would still cover only 2% of projected demand.

Enterprise Deal

Freshfields Puts 5,700 Lawyers on Claude; Usage Jumped 500% in Six Weeks

Global law firm Freshfields announced Thursday a multi-year co-development pact with Anthropic, deploying the full Claude suite across all 33 offices and every practice group — 5,700 lawyers in total. Within six weeks of initial rollout, Claude usage inside the firm surged 500%. The deal includes a joint program to build agentic legal workflows and adds Anthropic alongside Freshfields’ existing Google relationship, reinforcing a deliberate multi-vendor approach now typical at the top of the Magic Circle.

Research Agents

Google Debuts Deep Research and Deep Research Max With MCP Support

Google announced next-generation autonomous research agents built on Gemini 3.1 Pro, with native Model Context Protocol support, inline data visualizations, and connectors for custom organizational data sources. Both “Deep Research” and “Deep Research Max” tiers are live in the Gemini API in public preview. The release extends Wednesday’s Cloud Next announcements from raw platform plumbing into a shipped research-agent product.
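For context on what “native Model Context Protocol support” buys: an MCP client talks to any MCP server by exchanging JSON-RPC messages. A minimal sketch follows; the tool name and arguments are hypothetical, while the message envelopes follow the MCP specification’s `tools/list` and `tools/call` methods.

```python
import json

# The two core MCP requests a client sends over JSON-RPC: first list
# the server's tools, then invoke one. The tool name and arguments
# below are made up; the envelope shape follows the MCP spec.
list_tools = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

call_tool = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "deep_research",  # hypothetical tool name
        "arguments": {"query": "GDPval methodology", "max_sources": 10},
    },
}

print(json.dumps(call_tool, indent=2))
```

Any data source exposed behind this envelope becomes a connector the research agent can call, which is why MCP support matters more than any single built-in integration.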

Open Weights

The Open-Source Frontier

GLM-5.1 claims the top of SWE-Bench Pro under MIT license; DeepSeek V4 inches closer to release; NVIDIA ships Nemotron 3 Nano with a 4x throughput bump.

Coding Benchmarks

Z.ai’s GLM-5.1 Overtakes GPT-5.4 Atop SWE-Bench Pro — Under MIT License

Z.ai (formerly Zhipu AI) released GLM-5.1 on April 7 with weights now broadly available under the MIT license, and the model has solidified its position atop the SWE-Bench Pro leaderboard this week at 58.4 — ahead of GPT-5.4 (57.7) and Claude Opus 4.6 (57.3). Built on a 754B Mixture-of-Experts architecture with 200K context and 128K max output, the model is marketed for long-horizon autonomous coding: it can run a full plan-execute-test-fix loop for up to eight hours unattended. Weights are on Hugging Face and compatible with Claude Code, Cline, and other OpenAI-compatible tooling, making it the most capable MIT-licensed coding model publicly available.
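The “plan-execute-test-fix loop” the release describes can be sketched as a simple driver. This is a toy illustration only: `plan`, `execute`, and `run_tests` below are stand-ins for model calls and a test sandbox, not real GLM-5.1 APIs.

```python
# Toy sketch of a plan-execute-test-fix agent loop. The callables
# are stand-ins: in a real agent, `plan` and `execute` would be
# model calls and `run_tests` a sandboxed test runner.
def agent_loop(task, plan, execute, run_tests, max_iters=8):
    """Iterate until the generated patch passes its tests."""
    prompt = task
    for _ in range(max_iters):
        steps = plan(prompt)
        patch = execute(steps)
        ok, failures = run_tests(patch)
        if ok:
            return patch
        # Feed the failures back in as the next turn's context.
        prompt = task + f"\nFix these failures: {failures}"
    return None  # gave up after max_iters attempts

# Stub components: this fake "model" succeeds on its second attempt.
attempts = {"n": 0}
plan = lambda prompt: [f"attempt {attempts['n'] + 1}"]
def execute(steps):
    attempts["n"] += 1
    return f"patch-{attempts['n']}"
run_tests = lambda patch: (patch == "patch-2", ["test_parse fails"])

assert agent_loop("fix the date parser", plan, execute, run_tests) == "patch-2"
```

The eight-hour unattended runs the marketing cites are, in essence, this loop with large `max_iters` and wall-clock budgets instead of iteration counts.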

DeepSeek Watch

DeepSeek V4 Window Narrows to Late April as V4-Lite Nodes Appear

Open-source community consensus has converged on “late April” as the likely launch window for DeepSeek V4, with V4-Lite inference nodes reportedly spotted in the wild this week. Expectations center on a roughly 1T-parameter MoE (37B active per token), a 1M-token context window, native multimodal generation, and Apache 2.0 weights — matching V3’s permissive terms. The chief bottleneck remains availability of Huawei Ascend 950PR chips, now central to the firm’s pre-training stack.

NVIDIA Open Models

Nemotron 3 Nano Lands on Hugging Face — 4x Throughput Gain

NVIDIA’s Nemotron 3 Nano (30B/3B active, hybrid MoE) is live on Hugging Face at nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16, delivering 4x higher tokens-per-second than Nemotron 2 Nano for multi-agent workloads. Nemotron 3 Super (120B/12B active, 1M context) shipped in mid-March; Ultra is due H1 2026. Distributed via Perplexity, OpenRouter, Hugging Face, and build.nvidia.com.

Ecosystem

Hugging Face Transformers Ships Gemma 4 + Qwen3.6 Integration

This week’s transformers releases added support for Google’s Gemma 4 family (E2B, E4B, 26B MoE, 31B Dense — Apache 2.0) and Alibaba’s Qwen3.6-35B-A3B. Gemma 4 ships with native vision/audio and 256K context; Qwen3.6-A3B claims 73.4% on SWE-Bench Verified while running on consumer GPUs. Both are downloadable today via transformers and Ollama.

From the Papers

ICLR 2026 & The Research Wire

Two Outstanding Paper awards dropped at ICLR 2026, a neuro-symbolic robotics paper slashes energy use 100x, and MLPerf Inference v6.0 rewrites the industry’s benchmark playbook.

ICLR Outstanding Paper

“LLMs Get Lost in Multi-Turn Conversation” Takes ICLR 2026 Top Honor

Announced April 23 at ICLR 2026 in Singapore, the Salesforce AI Research paper exposes a gap between how LLMs are evaluated (single-turn) and how they are deployed (multi-turn). The authors introduce “sharded simulation,” splitting single-turn benchmarks into sequences of smaller underspecified turns. Every tested model — Llama 3.1-8B through Gemini 2.5 Pro — drops in reliability as early as turn two. Program chairs cited the experimental design and real-world relevance as the deciding factors.
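The sharding idea can be rendered as a toy harness: one fully specified instruction is split into smaller, underspecified turns revealed one at a time. The clause-per-turn split below is our simplification for illustration, not the authors’ exact procedure.

```python
# Toy rendering of "sharded simulation": split a single-turn
# instruction into underspecified turns and reveal them one at a
# time, accumulating chat history. The clause split is a
# simplification, not the paper's exact sharding procedure.
def shard_instruction(instruction: str) -> list[str]:
    """Split a compound instruction into per-turn shards."""
    return [c.strip() for c in instruction.split(";") if c.strip()]

def simulate_multi_turn(shards, respond):
    """Feed shards to a model one turn at a time."""
    history = []
    for shard in shards:
        history.append({"role": "user", "content": shard})
        history.append({"role": "assistant", "content": respond(history)})
    return history

shards = shard_instruction(
    "Write a function to parse dates; "
    "it must accept ISO 8601; return None on unparseable input"
)
echo = lambda history: f"(reply to turn {len(history) // 2 + 1})"
history = simulate_multi_turn(shards, echo)
assert len(shards) == 3 and len(history) == 6
```

The paper’s finding is that reliability on the sharded version drops as early as turn two, even when the concatenated shards are identical to the single-turn prompt the model handles easily.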

ICLR Outstanding Paper

“Transformers Are Inherently Succinct” Gives Architecture a Formal Proof

The second Outstanding Paper announced April 23 (ETH Zürich / University of Kaiserslautern-Landau) proves via circuit complexity and formal language theory that Transformers are more “succinct” than recurrent models — encoding certain concepts with exponentially fewer parameters. After eight years of empirical scaling arguments, the community finally has a rigorous theoretical footing for an assumption it has largely operated on by intuition.

Robotics + Energy

Tufts Neuro-Symbolic Robots Use 1% of the Energy of VLA Models

Circulating ahead of its ICRA 2026 (Vienna, May) presentation, “The Price Is Not Right” benchmarks neuro-symbolic methods against Vision-Language-Action models on structured long-horizon manipulation. Training used just 1% of the energy of a comparable VLA; inference used only 5% — and the neuro-symbolic system was more accurate. The paper lands as AI power draw becomes a defining concern for regulators and hyperscalers alike.

MLPerf v6.0

MLPerf Inference v6.0 Adds GPT-OSS 120B, Video, and DLRMv3

Still reverberating through hardware reviews this week, MLPerf Inference v6.0 overhauls five of eleven datacenter tests: a GPT-OSS 120B open-weight benchmark, expanded DeepSeek-R1 reasoning with interactive speculative decoding, DLRMv3 (the suite’s first sequential recommender), text-to-video workloads, and YOLOv11 for edge. NVIDIA Blackwell Ultra topped throughput across the suite; HPE submitted 862 results and claimed 18 category leads — the largest MLPerf submission on record.

Policy & Courts

Policy, Preemption & The Courts

Disney’s discovery fight with Midjourney heats up; the White House’s preemption push meets 134 active state AI bills; Ed Department ties grant dollars to AI literacy.

Copyright

Disney v. Midjourney Heads to Discovery Showdown April 27

The Disney/Universal v. Midjourney case — Hollywood’s first major copyright action against an AI image generator — entered a contentious discovery phase. A motion to compel additional document production is set for April 27, with Midjourney pushing an “unclean hands” defense noting the studios themselves deploy generative AI. The case is widely watched as a potential bellwether for whether training constitutes infringement.

Federalism

White House Preemption Push Meets 134 State AI Bills in 31 States

The White House’s March 2026 National AI Policy Framework — which recommends Congress preempt state AI laws that impose “undue burdens” — is hitting escalating friction as legislatures in 31 states push ahead. Colorado’s AI Act, New York’s RAISE Act (in force since March 19), and Ohio’s July 2026 school-AI policy deadline all represent direct state action the framework would override. Congressional showdown expected.

Education

Ed Department Finalizes Grant Rule Weighting AI Literacy for K-12

The US Department of Education finalized on April 13 a rule that formally prioritizes grant applications integrating AI literacy and ethical AI use into K-12 teaching — the first federal rule directly shaping how AI appears in classroom curricula. It arrives as 31 states move their own AI-in-education bills; Ohio requires districts to adopt AI policies by July 1, and Idaho mandates statewide K-12 AI literacy standards.

Quick Hits

Briefs

IP Fabric Launches Enterprise Network MCP Server

First MCP server purpose-built for regulated network ops — natural-language queries against a network digital twin, prompt libraries for CIS, NIST, PCI-DSS, HIPAA, and NIS2, and governance controls enabled by default.

Intel Shares Surge 16% After Q1 AI-CPU Beat

Intel reported Q1 2026 earnings Thursday that beat analyst expectations, sending the stock up more than 16% after-hours. Management cited AI-capable CPU demand as the primary growth driver, pitching Intel as the alternative compute supply chain to NVIDIA amid ongoing chip scarcity.

AI Drug-Discovery Cluster: OpenAI, AWS, Novo Nordisk All Move

OpenAI’s GPT-Rosalind biotech reasoning model went live for Enterprise research preview; AWS launched Amazon Bio Discovery for no-code computational biology workflows; and Novo Nordisk formalized an OpenAI partnership to compress research-to-patent timelines. Q1 digital health funding hit $7.4B.

AI Infrastructure Funding Cluster: Omni, Aaru, Orkes Close $260M

Omni raised $120M Series C at a $1.5B valuation (ICONIQ-led) for AI analytics; Aaru closed $80M Series A for synthetic consumer/political research; and Orkes banked $60M Series B for agentic workflow orchestration. The deals reflect a capital shift toward infrastructure and reliability tooling.

Post-Training Unification: One Paper to Rule RLHF, DPO, GRPO

An arXiv synthesis (2604.07941) posted this week attempts to unify the fragmenting landscape of LLM post-training — RLHF, DPO, GRPO, PPO — under a single off-policy/on-policy framework, offering a taxonomy and predictions for when each approach should win. Timely as post-training becomes the primary frontier differentiator.
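For readers new to the alphabet soup: DPO is the simplest member of the family the synthesis unifies, reducing preference learning to a single supervised loss over (chosen, rejected) pairs. A minimal sketch follows; the log-probabilities and the beta value are made-up illustrative numbers.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    logp_w / logp_l are the policy's summed log-probs of the chosen
    and rejected responses; ref_logp_* are the frozen reference
    model's. Loss = -log sigmoid(beta * (policy margin - ref margin)).
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When the policy prefers the chosen response more than the
# reference does, the margin is positive and the loss is small;
# the reverse ordering is penalized. Values here are illustrative.
low = dpo_loss(-10.0, -20.0, -12.0, -18.0)   # margin = +0.4
high = dpo_loss(-20.0, -10.0, -18.0, -12.0)  # margin = -0.4
assert low < math.log(2) < high
```

RLHF, PPO, and GRPO differ in whether this preference signal is distilled into an explicit reward model and how the policy gradient is estimated; the synthesis’ claim is that all of them optimize variants of the same off-policy/on-policy objective.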

Interpretability Tool Arrives for AI Weather Models

A preprint posted to arXiv April 22 (2604.20467) applies cosine similarity, PCA, and direction-finding to AI numerical weather models — with an open-source toolkit. An early case of LLM-era interpretability methods crossing over into scientific foundation models.
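Cosine similarity is the simplest tool in that kit: it scores how closely a hidden-state vector aligns with a candidate latent direction. A toy sketch with made-up three-dimensional vectors; real weather-model activations have thousands of dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two activation vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Compare a hidden-state vector against a candidate direction
# (e.g. a temperature-gradient direction in the model's latent
# space). These tiny vectors are made up for illustration.
hidden = [0.9, 0.1, 0.4]
direction = [1.0, 0.0, 0.5]
sim = cosine_similarity(hidden, direction)
assert 0.9 < sim <= 1.0  # strongly aligned with the direction
```

A similarity near 1 suggests the hidden state is dominated by that direction; near 0, the two are unrelated, which is what makes the measure useful for attributing model behavior to named physical quantities.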

Sparse Autoencoders Probe Code-Correctness Circuits

A new arXiv preprint applies Anthropic-style SAEs to decompose LLM internal representations of code, locating the circuits responsible for a model’s sense of whether its own code is correct — claimed as the first SAE application to superposition in code representations.
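The mechanics can be shown in miniature. Below is a toy sparse autoencoder forward pass in the spirit of the SAEs the preprint applies; the two-dimensional “activation” and the weight matrices are made-up values, whereas real SAEs decompose model hidden states with thousands of dimensions into many more learned features.

```python
# Minimal sparse autoencoder forward pass:
#   features = ReLU(W_enc @ x + b_enc);  x_hat = W_dec @ features.
# During training, an L1 penalty on `features` pushes most of them
# to zero, yielding sparse, hopefully interpretable units. All
# numbers below are toy values for illustration.
def relu(xs):
    return [max(0.0, x) for x in xs]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sae_forward(x, W_enc, b_enc, W_dec):
    """Return (sparse feature activations, reconstruction of x)."""
    features = relu([h + b for h, b in zip(matvec(W_enc, x), b_enc)])
    return features, matvec(W_dec, features)

W_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # 3 features from 2 dims
b_enc = [-0.5, -0.5, -0.5]                      # negative bias aids sparsity
W_dec = [[1.0, 0.0, -1.0], [0.0, 1.0, -1.0]]    # 2 dims from 3 features

features, x_hat = sae_forward([1.0, 0.2], W_enc, b_enc, W_dec)
assert features == [0.5, 0.0, 0.0]  # only one feature fires
```

The preprint’s contribution is applying exactly this decomposition to code representations and then asking which of the sparse features fire when the model’s own code is, or is not, correct.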

GitHub Trending

Today’s Most-Starred Repositories
  • Alishahryar1/free-claude-code (Python, +1,962): Terminal and VSCode proxy giving free-tier access to Claude Code.
  • elder-plinius/CL4R1T4S (+1,434): Crowdsourced archive of leaked system prompts from ChatGPT, Claude, and others.
  • Z4nzu/hackingtool (Python, +1,383): All-in-one ethical-hacking toolkit bundling hundreds of pentest utilities.
  • zilliztech/claude-context (TypeScript, +1,011): Code-search MCP that turns an entire codebase into the live context for any agent.
  • jamiepine/voicebox (TypeScript, +~800): Open-source voice synthesis studio built on Qwen3-TTS with cloning and transcription.
  • open-metadata/OpenMetadata (TypeScript, +776): Unified metadata platform for data discovery, lineage, and governance.
  • HKUDS/RAG-Anything (Python, +590): All-in-one RAG framework with multimodal document support (PDF, audio, video).
  • ruvnet/RuView (Rust, +429): WiFi-based DensePose system: real-time pose and vital-sign detection without cameras.
Toolbox

Today in AI Coding Tools: Double Drops from Claude Code and Codex CLI

Claude Code v2.1.118 & v2.1.119

Two stable releases landed in the last 48 hours:

  • /config persists to ~/.claude/settings.json with project/local/policy precedence
  • --from-pr now accepts GitLab, Bitbucket, and GitHub Enterprise URLs
  • PostToolUse hooks include duration_ms timing
  • Vim visual mode (v/V) with selection operators
  • /cost + /stats merged into unified /usage; custom themes via /theme

Codex CLI v0.123.0 & v0.124.0

Two stable releases the same day (April 23):

  • Built-in Amazon Bedrock provider with AWS SigV4 signing
  • /mcp verbose for full MCP server diagnostics
  • TUI reasoning shortcuts: Alt+, and Alt+.
  • Stable hooks configurable via config.toml
  • Remote plugin marketplace live; gpt-5.4 set as default model

GitHub Copilot CLI v1.0.35

A quality-of-life drop:

  • Tab-completion for slash-command arguments and subcommands
  • ! shell escapes use $SHELL instead of hardcoded /bin/sh
  • Named sessions via --name and --resume=
  • Session selector displays branch names with improved search
  • Permission prompts in remote sessions; multiple MCP OAuth and LSP fixes

Cursor, Windsurf, and Gemini CLI were quiet in the strict 48-hour window; last Cursor entry was April 15 (interactive canvases).