Wednesday, April 8, 2026

Anthropic’s Secret Model Stuns Everyone

Today’s Overview

Good morning. Claude Mythos is the kind of model story that makes the rest of the week feel smaller: Anthropic is keeping a powerful system under wraps and limiting it to vulnerability research. There’s also fresh movement in OpenAI’s leadership, plus a batch of new research and tools worth a look. Let’s dive in.

Top Stories

Anthropic keeps Claude Mythos private

Anthropic says Claude Mythos is powerful enough that the company chose not to release it publicly. Instead, it launched Project Glasswing so a small set of organizations can use the model for vulnerability research.

  • Anthropic says Mythos reached 93.9% on SWE-bench Verified and 77.8% on SWE-bench Pro.
  • The model is said to have found an OpenBSD vulnerability hidden for 27 years along with a 16-year-old FFmpeg flaw.
  • Project Glasswing gives 12 organizations access to hunt flaws in their own systems using $100 million in compute credits.

OpenAI reshuffles leadership amid health leaves

OpenAI is shifting responsibilities as Fidji Simo and Kate Rouch step away for medical recovery. Greg Brockman, Brad Lightcap, Denise Dresser, Gary Briggs, Jason Kwon, and Sarah Friar are taking on expanded duties across product, commercial, and operational work.

  • Fidji Simo is on medical leave for a neuroimmune condition, and Greg Brockman is stepping in on product and the super app roadmap.
  • Brad Lightcap is moving to special projects while Denise Dresser takes over most commercial duties.
  • A leaked cap table put employees’ stake at 16% of equity, worth $135 billion, amid talk of a possible $1 trillion IPO.

Research & Analysis

GLM-5.1 pushes longer-horizon agent work

Z.ai’s GLM-5.1 is positioned as a flagship model for agentic engineering built for long tasks. The company says it delivers state-of-the-art performance on SWE-Bench Pro and can stay effective across hundreds of rounds and thousands of tool calls.

  • Z.ai says the model hits state-of-the-art SWE-Bench Pro performance for agentic engineering work.
  • It is built to keep going across hundreds of rounds without losing effectiveness.
  • The system is designed for thousands of tool calls while breaking down problems, running experiments, and reading results.

Meta AI makes RL cheaper for MLE agents

Meta AI’s SandMLE is a framework for building small but structurally realistic machine-learning engineering environments. It makes on-policy RL practical for ML engineering agents by cutting execution cost by more than 13x.

  • SandMLE builds small but realistic MLE environments for agent training.
  • That setup makes on-policy RL practical for ML engineering agents.
  • The framework cuts execution cost by more than 13x compared with the prior setup.

Cursor redesigns MoE inference with warp decode

Cursor’s warp decode is a kernel design for MoE inference that reorganizes computation around output neurons instead of experts. The company says it delivers about 1.8x higher throughput and improved numerical accuracy on Blackwell GPUs.

  • Warp decode reorganizes inference around output neurons instead of experts.
  • Cursor says the design delivers about 1.8x higher throughput on Blackwell GPUs.
  • The company also says it improves numerical accuracy in the process.

Trending AI Tools

  • TorchTPU: A stack for running PyTorch natively on TPUs at Google scale.

  • VibeSonic: A private AI voice toolkit for dictation and voice workflows.

Quick Hits
