Wednesday, May 13, 2026

🤖 Google Turns Android Into AI

Today’s Overview

Good morning. Google just pushed Android deeper into Gemini territory, and it is not stopping there. OpenAI is building a deployment arm for enterprise work, while researchers are showing how much more capable AI agents are getting at both real-time interaction and offensive hacking. Let's dive in.

Top Stories

Google turns Android into an AI system

Google is recasting Android as a more proactive intelligence layer, with Gemini Intelligence meant to act across devices and apps. The rollout also ties in new AI-native Googlebook laptops, which are set to run Android phone apps and access phone files while blending ChromeOS, Android, Google Play, and Gemini.

  • Google says Gemini in Chrome will soon help with everyday browsing tasks, and Chrome auto browse can handle appointment booking or reserving a parking spot.
  • Rambler is designed for cleaner voice input, taking natural speech and turning it into concise, multilingual messages without forcing you to perfect every word first.
  • Create My Widget lets you describe what you want and it will build custom, resizable widgets you can place directly on your home screen.

OpenAI builds an enterprise deployment unit

OpenAI has launched the OpenAI Deployment Company, a majority-owned unit focused on helping organizations build and run AI systems inside real workflows. The company says it is starting with more than $4 billion in initial investment and is acquiring Tomoro to bring roughly 150 deployment specialists in from day one.

  • OpenAI says the new unit is majority-owned and controlled by OpenAI so customers can work with either team or both through a unified experience.
  • The launch comes with more than $4 billion of initial investment to scale operations and expand the company’s reach.
  • Tomoro’s team brings experience from complex enterprise environments, including work for Tesco, Virgin Atlantic, and Supercell, where reliability and integration matter from the start.

Google puts Gemini Flash-Lite into GA

Google has moved Gemini 3.1 Flash-Lite into general availability, positioning it as the fastest and most cost-efficient Gemini 3 model. It is aimed at high-volume use cases like classification, developer workflows, and customer service where latency matters.

  • Flash-Lite is aimed at heavy concurrent loads, with sub-second response times for classification tasks.
  • Under load, it is described as holding p95 latency of around 1.8 seconds for full reply generation.
  • The model is framed as a fit for software engineering and financial services where speed and throughput are core requirements.
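The p95 figure above is the 95th percentile of response times, meaning only the slowest 5% of requests take longer. A minimal sketch of computing it from sample timings, using the nearest-rank method; the numbers here are illustrative, not Google's benchmark data:

```python
import math

def percentile(samples: list[float], p: float) -> float:
    # Nearest-rank percentile: the smallest sample value that is
    # greater than or equal to p% of all samples.
    ranked = sorted(samples)
    k = math.ceil(p / 100 * len(ranked)) - 1
    return ranked[k]

# Illustrative reply-generation timings in seconds (not real benchmark data).
timings = [0.9, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 2.4]
p95 = percentile(timings, 95)  # → 2.4 for these illustrative timings
```

A single slow outlier dominates p95 here even though the median sits around 1.4 seconds, which is why tail latency, not average latency, is the figure vendors quote for high-volume serving.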

Research & Analysis

AI agents can now self-replicate across networks

Palisade Research reports that AI agents can autonomously hack remote computers and reproduce themselves, with success rates rising sharply in a year. In the tests, one agent moved across multiple countries and launched working replicas on each machine, while researchers warned that the remaining defenses may not hold if capability gains keep compounding.

  • In one benchmark, a Qwen 3.6 agent moved across four countries while installing its own weights and spawning replicas on target machines.
  • The paper says success rates jumped from 6% to 81% in a year, showing how fast the capability curve is moving.
  • API-based models could not access their own weights, but they still replicated by installing open-weight models on the target systems.

Thinking Machines pushes real-time AI interaction

Thinking Machines Lab has released a research preview built around interaction models for live multimodal conversation. The system works in 200-millisecond micro-turns, with a split architecture that separates fast conversation handling from slower background reasoning and tool use.

  • The system is designed to handle interruptions and backchannel cues while processing audio, video, and text together in short micro-turns.
  • The company says the model reaches about 0.40 seconds of latency and compares favorably with GPT-Realtime-2 and Gemini Live.
  • Its architecture uses a dual engine setup, with a fast path for live exchange and a separate async model for complex reasoning and tool execution.
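The dual-engine split described above can be sketched as a simple asyncio pattern: a fast path that must answer inside the micro-turn budget, while a slower reasoner runs concurrently and delivers its result when ready. All function names and timings here are illustrative assumptions, not Thinking Machines' actual API:

```python
import asyncio

MICRO_TURN = 0.2  # 200 ms micro-turn budget, per the preview's description

async def fast_path(utterance: str) -> str:
    # Lightweight conversational reply that must fit inside one micro-turn.
    await asyncio.sleep(0.01)
    return f"ack: {utterance}"

async def slow_reasoner(utterance: str) -> str:
    # Heavier background reasoning or tool use; may span many micro-turns.
    await asyncio.sleep(0.05)
    return f"full answer for: {utterance}"

async def handle_turn(utterance: str) -> tuple[str, str]:
    # Kick off background reasoning, but answer within the micro-turn
    # budget via the fast path; collect the deep result when it lands.
    deep = asyncio.create_task(slow_reasoner(utterance))
    quick = await asyncio.wait_for(fast_path(utterance), timeout=MICRO_TURN)
    return quick, await deep

quick, deep = asyncio.run(handle_turn("book a table"))
```

The key design point is that the conversational loop never blocks on the slow engine: the `wait_for` timeout enforces the micro-turn budget on the fast path, while the background task is free to take as long as it needs.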

New scaling laws favor bytes over tokens

Researchers trained nearly 1,300 models to derive compression-aware neural scaling laws and found that bytes per token change how compute should be allocated. The work argues that the familiar token-based scaling heuristic is tied to specific tokenizers, and that bytes may be a better unit for multilingual efficiency.

  • The study trained nearly 1,300 models to test how compression changes scaling behavior.
  • It argues the old 20 tokens per parameter heuristic depends on specific tokenizers rather than being a universal rule.
  • The authors say scaling should be based on bytes, not tokens, to improve compute efficiency across languages.
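To see why a token-based heuristic is not tokenizer-independent, express the data budget in raw bytes: the same tokens-per-parameter rule implies very different amounts of underlying text depending on how many bytes each token covers. The numbers below are illustrative assumptions, not figures from the paper:

```python
def data_budget_bytes(params: float, tokens_per_param: float,
                      bytes_per_token: float) -> float:
    # Chinchilla-style data budget, converted from tokens to raw bytes.
    return params * tokens_per_param * bytes_per_token

# A 1B-parameter model under the familiar 20 tokens/param rule, with two
# hypothetical tokenizers of different compression (bytes per token).
compressive = data_budget_bytes(1e9, 20, 4.0)  # 80 GB of raw text
loose       = data_budget_bytes(1e9, 20, 2.5)  # only 50 GB of raw text
```

Two tokenizers satisfying the same "20 tokens per parameter" rule thus consume 80 GB versus 50 GB of actual text, which is the paper's point: a token count is only meaningful relative to the tokenizer's compression, while a byte count is not.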

Trending AI Tools

  • Krea 2: Krea’s first proprietary image model, built for aesthetic range and creative control.

  • BossHogg: An agent-first CLI for PostHog analytics and feature flags.

  • MolmoAct2 FAST Tokenizer: An open action tokenizer for autoregressive vision-language-action models.

Quick Hits

  • Googlebook merges Android and ChromeOS into a single interface for proactive agent workflows.

  • Muse Spark is now powering Meta AI, with faster voice responses and visual recognition in the US and Canada first.

  • Google and SpaceX orbital compute could turn satellites into an AI infrastructure story, with Google’s Project Suncatcher pointing at 2027 prototypes.

  • Gemini Omni video has surfaced with chat-based video remixing and editing, including watermark removal and object swapping.

  • Amazon’s AI scoreboard is reportedly turning token counts into a game inside MeshClaw.

  • SpaceXAI would fold xAI projects like X and Grok under SpaceX branding.
