Google just collapsed two of its most popular Gemini features into a single, continuous experience. Gemini Live, the real-time voice and camera assistant, can now generate and edit images on the fly while you are pointing your phone at the world and talking. No switching apps, no breaking the conversation to type a prompt, no pasting screenshots back and forth.

The flow is intentionally simple: open the Gemini app, tap the Live button, share your camera, and describe what you want to see. The model then produces or modifies an image inside the same session you are already speaking in. Google is pitching three early use cases for the rollout: visualizing room decor changes, working through math problems, and spinning up shareable memes.

What is actually new here

Image generation and Gemini Live have both existed for a while, but as separate surfaces. Live started as audio-only and then gained camera and screen sharing, letting you talk with Gemini about whatever is in front of you. In less than a year, it went from an audio-only interface to a multimodal conversational experience where you can discuss images, files, and YouTube videos, with continuous video sharing as the next frontier.

Meanwhile, image creation moved into the main Gemini app through Nano Banana, the name for Gemini's native image generation capabilities. It lets the model generate and process images conversationally from text, images, video, or a combination, so you can create, edit, and iterate on visuals. What changed is that this image pipeline now runs inside the live, streaming voice channel.
