Most frontier AI labs have kept vision and coding separate: you pick a vision model for image tasks, a coding model for agents, and you stitch them together yourself. Alibaba's Qwen team breaks that pattern with Qwen3.7-Plus, a multimodal agent model that combines visual perception, graphical user interface (GUI) control, and code generation within a single autonomous agent loop. The result is a model that can look at your screen, decide what to click, write the code to act on it, run that code, check the output, and loop until the task is done, all without switching models or pipelines.

One model, two worlds

Unlike most frontier models, which split vision and agent capabilities into distinct offerings, Qwen3.7-Plus is designed to perceive, reason, code, and act across both GUI and command-line (CLI) environments simultaneously. That distinction matters in practice. Alibaba built Qwen3.7-Plus on top of the Qwen3.7-Max foundation and added what the language-only Max doesn't have: eyes. The model can read your screen, understand what's on it, navigate graphical interfaces, automate browsers, and operate desktop applications from screenshots.

The model can understand visual interfaces, perceive on-screen content, and perform both GUI interactions and CLI operations. It also uses feedback from the environment to guide code generation, application manipulation, testing, validation, and iterative optimization. By integrating the full workflow of "see, think, write, act, and verify" into a unified agent loop, it enables end-to-end automation of complex software tasks from initial understanding to final delivery.

The agentic loop is powered by five concrete abilities on top of image and video understanding:

  • Deep reasoning: multi-step planning before acting
  • Self-programming: the model writes and revises its own code
  • Tool invocation: it calls external functions or APIs
  • Verification and testing: it runs outputs and checks results
  • Autonomous iteration: it loops until the task is done
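
To make the "see, think, write, act, and verify" cycle concrete, here is a minimal Python sketch of such a loop. It is an illustration only and not Qwen3.7-Plus's API or implementation: every name in it (`run_agent_loop`, `Step`, `demo`, `run_expression`) is hypothetical, the "model" is replaced by canned attempts, and the "screen" is a plain string standing in for text read from a screenshot.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One pass through the loop, kept so later passes can build on it."""
    observation: str
    plan: str
    code: str
    result: str
    passed: bool


def run_agent_loop(
    observe: Callable[[], str],
    plan: Callable[[str, List[Step]], str],
    write_code: Callable[[str, List[Step]], str],
    tools: Dict[str, Callable[[str], str]],
    verify: Callable[[str], bool],
    max_iters: int = 5,
) -> List[Step]:
    """Hypothetical agent loop: repeat until verification passes or the budget runs out."""
    history: List[Step] = []
    for _ in range(max_iters):
        observation = observe()                # see: read the current screen state
        thought = plan(observation, history)   # think: plan the next move
        code = write_code(thought, history)    # write: produce or revise code
        result = tools["run"](code)            # act: invoke a tool to run it
        passed = verify(result)                # verify: check the outcome
        history.append(Step(observation, thought, code, result, passed))
        if passed:
            break                              # task done, stop iterating
    return history


def demo() -> List[Step]:
    # Canned stand-ins for model output: a wrong first attempt, then a revision.
    attempts = ["6 * 6", "6 * 7"]
    screen_text = "Target: 42"  # assumption: text already extracted from a screenshot

    def run_expression(code: str) -> str:
        # Toy tool that evaluates an arithmetic expression with builtins disabled.
        # This is not a sandbox; never eval untrusted model output in real systems.
        return str(eval(code, {"__builtins__": {}}))

    return run_agent_loop(
        observe=lambda: screen_text,
        plan=lambda obs, hist: f"attempt {len(hist) + 1}: reach {obs.split(': ')[1]}",
        write_code=lambda thought, hist: attempts[len(hist)],
        tools={"run": run_expression},
        verify=lambda result: result == "42",
    )


if __name__ == "__main__":
    for step in demo():
        print(step.code, "->", step.result, "passed" if step.passed else "failed")
```

In this toy run the first attempt evaluates to 36 and fails verification, so the loop goes around once more and stops when the revised attempt yields 42. A real agent would replace each callable with a model call, a screenshot capture, or a sandboxed executor, but the control flow would follow the same shape.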