Google has unveiled an upgraded version of its AI model, Gemini 2.5 Flash, featuring a new “thinking budget” control. It lets developers set how much reasoning the model performs on a given task, trading off quality, cost, and response time.
The original Gemini 2.5 model, launched in March 2025, was lauded for its advanced reasoning capabilities. However, the intensive computing demands of such models prompted Google to introduce the option to limit or disable reasoning to conserve resources.
Tulsee Doshi, Gemini’s Director of Product Management, emphasized that different queries require different amounts of reasoning. For instance, answering a factual question like “How many provinces does Canada have?” takes far less reasoning than solving a complex engineering problem. The “thinking budget” lets developers match the amount of reasoning the model spends to the difficulty of each query.
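As a rough illustration of how a developer might apply this, the Python sketch below picks a budget for each kind of query and places it in a request body. The function names, the task categories, and the budget values are hypothetical. The `thinkingConfig` and `thinkingBudget` field names and the 0 to 24,576 token range are assumptions about the Gemini API's request format and should be checked against Google's current documentation. A budget of 0 is taken here to mean that reasoning is switched off.

```python
import json

# Assumed limits for the budget, in tokens; verify against current docs.
MIN_BUDGET = 0
MAX_BUDGET = 24576

# Hypothetical mapping from a coarse task category to a reasoning budget.
BUDGETS = {
    "factual": 0,       # simple lookups: no extended reasoning
    "moderate": 1024,   # some multi-step reasoning
    "complex": 8192,    # e.g. engineering or math problems
}


def choose_thinking_budget(task_kind: str) -> int:
    """Return a budget for the task, clamped to the assumed valid range."""
    budget = BUDGETS.get(task_kind, BUDGETS["moderate"])
    return max(MIN_BUDGET, min(budget, MAX_BUDGET))


def build_request(prompt: str, task_kind: str) -> dict:
    """Build a request body (assumed shape) carrying the chosen budget."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {
                "thinkingBudget": choose_thinking_budget(task_kind)
            }
        },
    }


if __name__ == "__main__":
    request = build_request("How many provinces does Canada have?", "factual")
    print(json.dumps(request, indent=2))
```

The sketch only builds the payload and makes no network call. In practice the category would come from the application's own knowledge of the request, and the budget values would be tuned against measured quality, latency, and cost.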
This development aligns with a broader trend in the AI industry, where companies such as OpenAI and China’s DeepSeek are also working to make powerful reasoning models more efficient. With the “thinking budget,” Google gives developers fine-grained control over the computational resources allocated to each task.
Gemini 2.5 Flash is currently available in preview as Google continues to refine its performance while managing computational costs. The release is part of Google’s broader strategy to make AI more accessible and efficient across a range of applications.