Friday, April 17, 2026

OpenAI Recasts Codex As Agent

Today’s Overview

Good morning. OpenAI and Perplexity are both pushing agents deeper into real computers, while the research side is asking a harder question: can these systems actually do science? There’s also fresh movement around AI tools and some notable industry tension in the quick hits. Let’s dive in.

Top Stories

OpenAI Turns Codex Into a Background Agent

OpenAI is recasting Codex as an always-on agent that can control the cursor and work across macOS and Windows apps. The update also adds an in-app browser, more plugins, and support for longer-running automations with persistent workflow memory. OpenAI paired that with GPT-Rosalind for life sciences and a new Agents SDK built for sandboxed, secure task execution.

  • Codex now acts as an always-on agent that can control the cursor and interact with desktop apps.
  • The app adds an in-app browser and more than 100 plugins to support parallel tasks and long-running automations.
  • OpenAI also introduced GPT-Rosalind and a new Agents SDK for life sciences workflows and secure task execution.

Perplexity Brings Agents to Your Desktop

Perplexity has launched Personal Computer, extending its agentic stack to local machines. The system can work across files, native apps, and the web to handle repetitive workflows instead of just reading information. It is currently limited to Perplexity Max subscribers and runs in a sandbox with auditable, reversible actions.

  • Personal Computer can work across local files, native apps, and the web to carry out tasks on a desktop machine.
  • Perplexity is targeting messy, repetitive workflows like reorganizing downloads or comparing documents against live web data.
  • The rollout is limited to Perplexity Max subscribers and uses a secure sandbox with auditable, reversible actions.

Research & Analysis

Allen AI Benchmarks Test Science Agents

Allen Institute researchers argue that evidence for agentic scientific discovery is still weak. To pressure-test the claims, they introduced ScienceWorld and DiscoveryWorld, two open benchmarks meant to measure whether agents can reproduce classic discoveries or make new ones at more advanced levels. The goal is to move the conversation from hype to measurable capability.

  • The researchers say the evidence for agentic scientific discovery is still weak.
  • ScienceWorld is meant to test whether agents can reproduce classic discoveries at roughly an elementary school level.
  • DiscoveryWorld pushes harder, asking whether agents can make new discoveries at college or PhD level.

Trending AI Tools

  • HY-World 2.0: Tencent’s Hunyuan team open-sourced a world model for editable 3D scenes with physics-aware movement.

  • DataGrout AI: An enterprise AI platform focused on agentic AI and MCP integration.

  • Astra: A tool for creating AI agents that never see your data.

Quick Hits
