OpenAI launches GPT-5.4 with AI agents that surpass human performance on operating-system tasks
OpenAI introduced the GPT-5.4 family, enabling AI agents to control desktop environments through mouse and keyboard actions. In independent testing on the OSWorld-Verified benchmark, the models achieved a 75% success rate, exceeding the average human score of 72.4%. The Pro version also set new records on the FrontierMath benchmark and topped the Short-Story Creative Writing Benchmark. A new Tool Search feature dynamically retrieves API definitions, cutting token usage by about 47% for complex workflows. The rollout is planned for the second quarter of 2026 after a brief prototype phase.