Volume 1, No. 34 Friday, April 3, 2026 Daily Edition

The AI Dispatch

“All the AI News That’s Fit to Compile”


Security

Mercor Confirms 4TB Data Breach via LiteLLM Supply Chain Attack; Lapsus$ Claims Credit

Two rogue LiteLLM releases lurked on PyPI for just 40 minutes, long enough to compromise an estimated 36% of cloud environments and freeze Meta's AI data-labeling operations.

Mercor, the AI-era talent platform that recruits expert data labelers for Anthropic, OpenAI, and Meta and carries a $10 billion valuation, confirmed a 4-terabyte data breach late Thursday. The notorious extortion group Lapsus$ claimed responsibility, asserting they planted malicious code inside LiteLLM versions 1.82.7 and 1.82.8, which were live on the Python Package Index for approximately 40 minutes before being pulled.

Security researchers estimate that the brief PyPI window was nonetheless sufficient to affect roughly 36% of cloud environments running automated dependency installation — a figure that reflects how deeply LiteLLM is embedded in the modern AI development stack as a unified gateway for calling multiple LLM APIs. The compromised releases carried a data-exfiltration payload targeting API keys, model output logs, and contractor metadata.

The blast radius extends beyond Mercor itself. Meta reportedly froze its AI data labeling work pending a full audit of affected pipelines, while downstream customers were advised to rotate all API keys provisioned through LiteLLM in the affected version window. Mercor stated it has notified affected individuals and is cooperating with federal law enforcement.

The incident reignites long-standing concerns about AI supply chain hygiene. LiteLLM’s position as a near-ubiquitous abstraction layer means that a single compromised release version becomes an extraordinarily high-leverage attack surface — one that threat actors like Lapsus$, who specialize in social engineering and insider-access attacks, are clearly aware of.

Policy

White House AI Legislative Framework Triggers State-Level Activity

Following Washington’s light-touch federal framework, Georgia became one of the first states to act, sending chatbot disclosure rules, healthcare guardrails, and an AI study mandate to the governor’s desk.

The March 26 National Policy Framework for Artificial Intelligence — a set of 27 recommendations favoring voluntary industry standards and federal pre-emption of state rules — has begun generating exactly the kind of state-level legislative urgency its critics predicted. Georgia sent three AI bills to Governor Brian Kemp’s desk in rapid succession this week, becoming one of the most active state legislatures on AI issues in the country.

SB 540 mandates disclosure when users interact with AI chatbots and adds enhanced safeguards for minors, targeting apps like character.ai and social platforms with AI-driven features. SR 789 establishes a formal joint study committee charged with delivering comprehensive AI policy recommendations by December 2026. SB 444 — perhaps the most consequential of the three — prohibits insurance companies from making coverage or claims decisions based solely on AI outputs, requiring a human decision-maker in any denial or adverse action.

The Georgia bills reflect a broader pattern: even as the White House argues that patchwork state laws could fragment the national AI market, states are proceeding with targeted interventions in the domains where federal inaction feels most acute — consumer protection, healthcare, and child safety. Legal observers at Nixon Peabody and Ropes & Gray note that SB 444’s healthcare provision may face preemption challenges under federal insurance statutes, but its enactment would nonetheless signal a template other states are likely to follow.


Developer Tools

Cursor 3 Launches Agent-First Interface

Cursor’s third major release redesigns the entire interface around the idea that AI agents, not human developers, are the primary unit of code authorship. Instead of accepting or rejecting autocomplete suggestions, developers now assign tasks to agents and review their outputs — a fundamental inversion of the copilot model.

The shift is more than cosmetic. Cursor 3 introduces a task-queue system, persistent agent context across sessions, and structured handoff protocols between multiple agents working in parallel. Early adopters report dramatically higher leverage on boilerplate and integration work, though the new paradigm demands a different kind of attention from developers: less keystroke-level, more review-and-redirect.

Open Models

Gemma 4 Community Reception: 31B Ranks #3 on Arena Leaderboard

Two days after Google’s launch, the Gemma 4 31B Dense model cemented third place among open models on the Arena AI leaderboard — a community benchmark driven by direct human preference comparisons. The Apache 2.0 license is driving adoption on Hugging Face and Ollama at a pace that rivals the reception of Llama 3 at launch.

Community benchmarks confirm that Gemma 4’s multimodal capabilities are competitive with closed alternatives at the same parameter tier, with particular strength in document understanding and multilingual tasks. The combination of open weights, commercial-friendly licensing, and leaderboard performance is setting a new baseline expectation for what an open Google model can deliver.

AI Safety & Leaks

Claude Code Leak Fallout: 44 Unreleased Feature Flags Discovered

Researchers analyzing the 512,000-line Claude Code source dump — accidentally exposed via the npm registry on March 31 — have catalogued 1,900 files including 44 feature flags that point to unreleased capabilities. The flags hint at enhanced reasoning pathways, structured memory systems, and changes to safety architecture that have not appeared in any public release notes.

Anthropic confirmed the leak stemmed from human error rather than a security breach, and removed the affected package within hours. The disclosure has nonetheless sparked debate about responsible development practices at frontier AI labs, given that the source contained enough architectural detail for competitor analysis. VentureBeat reports that several AI startups have already begun competitive reviews of the exposed code.

Research

MIT Releases AI Fairness Testing Framework

MIT researchers published an open-source framework for systematically identifying cases where AI decision-support systems produce disparate outcomes across demographic groups. The toolkit targets high-stakes deployment domains — hiring, lending, and healthcare — where audit requirements are increasing but standardized methodologies have remained elusive.

Unlike prior fairness tools that focus on a single metric, the MIT framework runs a battery of intersectional tests across protected attribute combinations, surfaces the most impactful disparities, and generates structured reports suitable for regulatory submissions. Early adopters from two major health systems and a large financial institution participated in the validation study.
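The intersectional battery described above can be illustrated in miniature. The sketch below is a generic analogue, not the MIT toolkit's actual API; `disparity_report`, the data layout, and the gap metric are all invented for illustration.

```python
# Hypothetical sketch of an intersectional disparity audit: compute the
# positive-outcome rate for every value combination of protected attributes,
# then report the largest rate gap per combination. Names are illustrative.
from itertools import combinations
from collections import defaultdict

def selection_rates(rows, attrs):
    """Positive-outcome rate for each value combination of `attrs`."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for row in rows:
        key = tuple(row[a] for a in attrs)
        counts[key][0] += row["outcome"]
        counts[key][1] += 1
    return {k: pos / tot for k, (pos, tot) in counts.items()}

def disparity_report(rows, protected, max_order=2):
    """Largest selection-rate gap for each attribute combination up to max_order."""
    report = {}
    for order in range(1, max_order + 1):
        for attrs in combinations(protected, order):
            rates = selection_rates(rows, attrs)
            report[attrs] = max(rates.values()) - min(rates.values())
    return report
```

A real audit framework would add significance testing and minimum group sizes; single-member intersections, as here, produce noisy extremes.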


Toolbox

Claude Code v2.1.89 NO_FLICKER Engine & Copilot CLI /fleet

Claude Code v2.1.89 (April 2)

Anthropic shipped Claude Code v2.1.89 with a headline feature: the NO_FLICKER rendering mode, activated via the environment variable CLAUDE_CODE_NO_FLICKER=1. The mode resolves longstanding scroll instability in terminals with high-frequency output — a pain point for users running large agentic tasks in tmux and iTerm2 — and simultaneously enables proper mouse event forwarding, allowing click-to-navigate interactions inside the TUI. Secondary fixes include improved multi-pane session handling and a more reliable hook execution order when sub-agents write back to the orchestrator context.
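The environment-variable toggle can also be applied from a wrapper script. A minimal Python sketch: only the variable name CLAUDE_CODE_NO_FLICKER=1 comes from the release notes, and the launch line is commented out because it assumes a `claude` binary on PATH.

```python
# Minimal sketch of enabling NO_FLICKER mode from a Python wrapper.
# Only the variable name comes from the release notes; the wrapper itself
# is illustrative.
import os

# Copy the current environment and add the flag, leaving os.environ untouched.
env = dict(os.environ, CLAUDE_CODE_NO_FLICKER="1")
# subprocess.run(["claude"], env=env)  # launch with the flag; needs `claude` on PATH
```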

GitHub Copilot CLI — /fleet (April 3)

GitHub’s Copilot CLI shipped /fleet, a parallel multi-agent orchestration command that decomposes a high-level task description into concurrent sub-agent assignments. The orchestrator maintains a shared task ledger and merges outputs once all sub-agents report completion. GitHub positions /fleet as a direct counterpart to Claude Code’s multi-agent capabilities, marking the first time the two leading agentic coding tools have shipped parallel orchestration within 48 hours of each other — a sign of how rapidly this pattern is becoming table stakes.
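The ledger-and-merge pattern GitHub describes can be sketched generically. This is an illustrative Python analogue, not Copilot CLI's implementation; the "sub-agents" here are plain callables standing in for concurrent agent runs.

```python
# Illustrative fleet-style orchestration: run named sub-tasks in parallel
# and merge their outputs into a shared ledger once all report completion.
# This mimics the pattern described in the article, not GitHub's code.
from concurrent.futures import ThreadPoolExecutor

def run_fleet(task_fns):
    """Run sub-tasks concurrently; return a ledger of name -> result."""
    ledger = {}
    with ThreadPoolExecutor() as pool:
        futures = {pool.submit(fn): name for name, fn in task_fns.items()}
        for fut, name in futures.items():
            ledger[name] = fut.result()  # blocks until that sub-agent finishes
    return ledger
```

A production orchestrator would also handle partial failure and inter-agent handoffs; this shows only the decompose-run-merge skeleton.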


In Brief

GPT-5.4 “Thinking” Hits 83% on GDPVal Benchmark

OpenAI’s GPT-5.4 “Thinking” variant posted an 83% score on the GDPVal benchmark — at or above human expert level on the economic policy reasoning suite. The result adds to a growing body of evidence that frontier reasoning models are exceeding domain-expert baselines on structured analytical tasks.

Anthropic Mythos Leak Analysis Continues

Coverage of the internal Anthropic “Mythos” document leak deepened this week, with multiple researchers publishing breakdowns of a model internally described as “the most capable we’ve built to date.” The document’s framing of long-horizon reasoning and multi-step tool use suggests capabilities not yet present in Claude 3.5 Sonnet’s public API.

Prompt Injection Vulnerabilities Surface in OpenClaw-Type Frameworks

MarketingProfs’ AI weekly digest flagged new research demonstrating prompt injection vulnerabilities in open-source agentic frameworks that follow the OpenClaw architecture pattern, where tool-call results are fed back into the context window without sanitization. The attack vector allows adversarial tool outputs to redirect agent behavior.
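The unsanitized feedback loop described above suggests an obvious, if partial, mitigation: filter instruction-like content out of tool results before they re-enter the context window. A hedged sketch; the patterns and function names below are illustrative, not drawn from the cited research.

```python
# Sketch of a tool-output sanitizer: treat tool results as untrusted data
# and neutralize instruction-like phrases before re-insertion into the
# context. The pattern list is illustrative, not an exhaustive defense.
import re

SUSPICIOUS = re.compile(
    r"(ignore (all )?previous instructions|you are now|system prompt)",
    re.IGNORECASE,
)

def sanitize_tool_output(text: str) -> str:
    """Replace instruction-like phrases in a tool result with a marker."""
    return SUSPICIOUS.sub("[REDACTED]", text)
```

Pattern matching alone is easy to evade; stronger designs isolate tool outputs in a separate, clearly-delimited channel the model is trained to treat as data.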


GitHub Trending

Repo                          Language    Stars    Description
ultraworkers/claw-code        Rust        148.2K   Fastest repo to 100K stars; terminal AI coding agent
siddharthvaddem/openscreen    TypeScript  17.2K    Open-source no-watermark Screen Studio alternative
block/goose                   Rust                 Extensible open-source AI agent
Kuberwastaken/claurst         Rust                 Claude Code reimplemented in Rust from leaked source
n8n-io/n8n                    TypeScript  182K     Fair-code workflow automation platform
Significant-Gravitas/AutoGPT  Python      183K     Autonomous AI agent platform

Source: GitHub Trending • Star counts as of April 3, 2026