Wednesday, May 20, 2026

Gemini 3.5 Flash Takes Center Stage

Today’s Overview

Good morning. Gemini just got a lot busier. Google is pushing 3.5 Flash across Search, Android Studio, and enterprise tools, while also showing off a bigger agent strategy built around Gemini, smart glasses, and always-on assistants. Let’s dive in.

Top Stories

Google launches Gemini 3.5 Flash

Google introduced Gemini 3.5 Flash as the first model in the 3.5 family, with a focus on agents, coding, and long-horizon tasks. The company is also rolling it out across consumer, developer, and enterprise products, making this more than a model announcement.

  • Google says 3.5 Flash is its strongest agentic and coding model yet, built to handle complex long-horizon work.
  • The model is already available through the Gemini app, Search, and developer tools including Google Antigravity, the Gemini API, and Android Studio.
  • Google says the model reaches 4x the output tokens per second of other frontier models.

Google turns Gemini into its agent layer

At I/O, Google showed a broader Gemini push that stretches from a new Omni model to Search redesigns and a 24/7 personal agent. The message is clear: Gemini is being positioned as the action layer across Google’s products, not just a chatbot. The company also teased related tools and features like Science, Intelligent Eyewear, and SynthID watermarking.

  • Gemini Omni can take text, images, audio, or video as inputs and turn them into video outputs.
  • Gemini Spark is designed as a 24/7 personal agent running on Google Cloud virtual machines.
  • Google says the Search upgrade is its biggest redesign in a generation, adding cross-modal inputs and generative UI.

Andrej Karpathy joins Anthropic

Andrej Karpathy says he has joined Anthropic and will work on automating the AI training pipeline with Claude. The move brings one of the field’s most visible researchers into Anthropic’s pre-training effort. Karpathy also said he plans to return to his education work later on.

  • Karpathy is joining the pre-training team under Nick Joseph while also leading the internal effort to use Claude in Anthropic’s training pipeline.
  • His background includes co-founding OpenAI in 2015 and leading Tesla’s Autopilot until 2022.
  • He left OpenAI for a second time in 2024 to start an AI education startup and says he plans to resume that work in time.

Research & Analysis

METR maps frontier agent risk

METR’s first Frontier Risk Report draws a useful line between what today’s agents can do and where they still break down. The headline finding is that top-lab agents can handle long engineering stretches, but brittle verification remains a major weakness. That split matters for anyone thinking about deployment, oversight, or automation thresholds.

  • The report says current agents can complete multi-week engineering work without constant human intervention in some cases.
  • A key limitation is hard-to-verify tasks, where agents still struggle to prove their own output is correct.
  • The findings help frame where autonomy is real and where it is still too fragile for trust.

OlmoEarth trims compute by 3x

OlmoEarth v1.1 is pitched as a more efficient model family for planet-scale remote sensing. The release keeps performance close to the original version while cutting compute costs sharply, which makes large mapping workloads more practical. The gains come from methodological changes that reduce how much work the model has to do on each token sequence.

  • The models are tuned for remote sensing data rather than generic language tasks.
  • The efficiency gains come from optimizing token sequence lengths, which lowers computational cost during processing.
  • The release is meant to make planet-scale mapping more affordable for developers and scientific research.

Why language models keep mode-hopping

Researchers describe a pre-training pattern where models swing between imitation and more adaptive behavior. They call it mode-hopping, and they argue it reflects a competition for capacity across training windows rather than something ordinary optimization fixes. The work points to practical uses in checkpoint selection, data curation, and predicting model behavior.

  • The researchers argue the effect is driven by competition for model capacity across different training windows.
  • They say standard optimization does not correct the behavior once the switching pattern appears.
  • The work suggests mode-hopping could help with checkpoint selection and data curation during pre-training.

Trending AI Tools

  • Starchild-1 and Agora-1: Odyssey’s world-model demos span real-time multimodal generation and multiplayer AI-generated worlds.

  • SynthID for ChatGPT images: OpenAI is adding Google’s watermarking scheme plus a public verification tool for image provenance.

  • Insights by Omnia: Generates step-by-step action plans aimed at improving AI visibility.

Quick Hits
