Google has launched a new image generation and editing model named Nano Banana Pro, officially called Gemini 3 Pro Image. The model is built on the company’s recently released Gemini 3 Pro large language model and introduces a range of new capabilities for creating and modifying images with a high degree of control and accuracy. The release targets a broad audience, from casual users and students to professional creatives, developers, and enterprise customers.
According to Google’s announcement, Nano Banana Pro draws on the advanced reasoning and real-world knowledge of its underlying Gemini 3 Pro model to generate more contextually rich and accurate visuals. Users can create detailed infographics, diagrams, and educational explainers from content they provide or from real-world facts the model already knows. The model can also connect to Google Search to incorporate real-time information, such as visualizing a weather forecast in a comic-book style or turning a recipe into a step-by-step infographic.
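For developers, this kind of content-grounded generation maps onto a single Gemini API call. The sketch below uses the google-genai Python SDK; the model identifier is a placeholder for illustration (Google’s announcement does not specify the exact API model ID), and the prompt simply supplies the facts the infographic should visualize.

```python
# Minimal sketch: generate an infographic from supplied facts with the
# Gemini API (google-genai Python SDK). The model ID below is assumed
# for illustration; check Google's documentation for the real one.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

facts = (
    "Water boils at 100 C at sea level, at roughly 95 C at 1,500 m, "
    "and at about 90 C at 3,000 m of elevation."
)

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",  # assumed identifier
    contents=[
        "Create a clean, labeled infographic that explains the following: "
        + facts
    ],
    config=types.GenerateContentConfig(
        response_modalities=["TEXT", "IMAGE"],  # ask for an image back
    ),
)

# Save the first image part returned by the model.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        with open("infographic.png", "wb") as f:
            f.write(part.inline_data.data)
        break
```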
One of the most significant advancements highlighted by Google is the model’s ability to render clear and legible text directly within images. This feature supports multiple languages, a capability attributed to Gemini 3 Pro’s enhanced multilingual reasoning. Users can create posters with taglines, mockups with detailed text, or designs with a wide variety of fonts, textures, and calligraphy. The model can also translate existing text within an image, allowing for the localization of content for international audiences. For example, a user could prompt the model to translate the text on beverage cans from English to Korean while keeping the rest of the design unchanged.
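The beverage-can example translates directly into an editing request: pass the source image alongside an instruction and ask for an image back. This is a minimal sketch under the same assumptions as above; the model ID and file names are placeholders.

```python
# Minimal sketch of a localization edit: translate the text printed on a
# product photo while leaving the rest of the design untouched.
from PIL import Image
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")
photo = Image.open("beverage_cans.png")  # hypothetical input image

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",  # assumed identifier
    contents=[
        photo,
        "Translate all visible text on the cans from English to Korean. "
        "Keep the layout, colors, and logo placement exactly the same.",
    ],
    config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
)

for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        with open("cans_korean.png", "wb") as f:
            f.write(part.inline_data.data)
```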
New Creative Controls and Capabilities
Nano Banana Pro introduces what Google describes as “studio-quality creative controls,” giving users more precise command over the final image. These controls include:
- Advanced Editing: Users can select and modify specific parts of an image, adjust camera angles, change the depth of field to shift focus, and apply sophisticated color grading.
- Lighting Transformation: The model can alter the lighting of a scene, for instance, changing a daytime photo to a nighttime scene or creating dramatic lighting effects like chiaroscuro.
- High-Resolution Output: Images can be generated in various aspect ratios and are available in 2K and 4K resolutions, making them suitable for a range of platforms from social media to print.
- Image Blending and Consistency: The model can combine multiple source images into a single, coherent composition. According to API documentation noted by developer Simon Willison, it can blend up to 14 reference images. It can also keep up to five people visually consistent across different scenes, a feature useful for storyboarding or campaign visuals.
Early developer tests support these claims. Willison demonstrated the model’s ability to follow a complex, multi-step editing prompt on an image of a pancake skull, successfully adding specific garnishes and changing the background as instructed.
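A multi-image request like this follows the same API pattern: reference images are passed as additional parts of the request, up to the 14-image limit noted in the API documentation. The sketch below is illustrative only; the model ID and file names are placeholders.

```python
# Minimal sketch: blend several reference photos into one composition
# while keeping the same two people recognizable across the scene.
from PIL import Image
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Placeholder reference images (the API accepts image parts alongside text).
references = [Image.open(p) for p in (
    "person_a.png", "person_b.png", "cafe_interior.png", "rainy_street.png"
)]

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",  # assumed identifier
    contents=references + [
        "Compose a single storyboard frame: the two people from the first "
        "two photos sitting in the cafe from the third image, seen through "
        "the rainy window of the fourth. Keep both faces consistent."
    ],
    config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
)

for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        with open("storyboard_frame.png", "wb") as f:
            f.write(part.inline_data.data)
```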
Availability Across Google’s Ecosystem
Google is integrating Nano Banana Pro across its suite of products and services. For consumers, the model is rolling out globally in the Gemini app for users who select the “Thinking” model (Gemini 3 Pro) to create images. Free-tier users receive a limited number of generations before reverting to the previous Nano Banana model, while subscribers to Google AI Plus, Pro, and Ultra receive higher usage quotas.
The model is also being made available to professionals and enterprises:
- Google Ads: The image generation tools are being upgraded to Nano Banana Pro.
- Google Workspace: The model is rolling out to customers in Google Slides and Vids.
- Developers: It is available through the Gemini API, Google AI Studio, and Vertex AI for enterprise use (a minimal client sketch follows this list).
- Creatives: Google AI Ultra subscribers can access it in Flow, an AI filmmaking tool.
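Whichever surface developers choose, the same google-genai SDK can target either the consumer Gemini API or Vertex AI; only the client construction differs. The project ID and region below are placeholders.

```python
# The google-genai SDK can talk to either endpoint; the choice mainly
# affects authentication and billing, not the generate_content call itself.
from google import genai

# Gemini API (API-key based), e.g. for prototyping from Google AI Studio.
api_client = genai.Client(api_key="YOUR_API_KEY")

# Vertex AI (project-based auth), e.g. for enterprise deployments.
vertex_client = genai.Client(
    vertexai=True,
    project="your-gcp-project",   # placeholder project ID
    location="us-central1",       # placeholder region
)
```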
Pricing for developers using the API is tiered. According to reports from VentureBeat and Simon Willison, a 4K image costs around 24 cents, while a 1K or 2K image costs about 13.4 cents; image inputs are priced separately. For high-volume use cases, this is more expensive than some competing models, but the premium may be justified for users who need high resolution or specific enterprise features.
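Using the approximate per-image figures reported above, and ignoring separately priced image inputs, a rough batch cost estimate looks like this; the numbers are the article’s approximations, not an official rate card.

```python
# Back-of-the-envelope cost estimate using the approximate per-image
# output prices reported above (USD); input-image charges not included.
PRICE_PER_IMAGE = {"1K": 0.134, "2K": 0.134, "4K": 0.24}

def estimate_cost(counts: dict[str, int]) -> float:
    """Total output-image cost for a batch, keyed by resolution tier."""
    return sum(PRICE_PER_IMAGE[tier] * n for tier, n in counts.items())

# Example: 500 social-media images at 2K plus 50 print assets at 4K.
print(f"${estimate_cost({'2K': 500, '4K': 50}):.2f}")  # -> $79.00
```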
Identifying AI-Generated Content with SynthID
In its announcement, Google emphasized the importance of transparency for AI-generated content. All images created with Nano Banana Pro are embedded with SynthID, an imperceptible digital watermark. The company has also released a verification tool directly within the Gemini app, allowing users to upload an image and ask if it was generated by Google AI.
Simon Willison tested this feature by generating an image, editing out the visible watermark, and uploading it to the Gemini app. The app correctly identified that “all or part of this image was created with Google AI.”
While the invisible SynthID watermark is applied to all generated images, Google will maintain a visible watermark (a “sparkle” icon) on images generated by free and Google AI Pro users. Recognizing the need for a clean canvas for professional work, Google is removing this visible watermark for Google AI Ultra subscribers and for images created within the Google AI Studio developer tool.
Early Reception and Known Limitations
The release has drawn strongly positive reactions from the developer and AI communities. VentureBeat reported that users hailed the model’s capabilities as “absolutely bonkers,” with many sharing examples of complex infographics, medical illustrations, and detailed product mockups created from a single prompt. Its performance in rendering accurate text and structured layouts has drawn particular praise. Developer Deedy Das called it “by far the best image model I’ve ever seen,” highlighting its ability to perform “Photoshop-like editing.”
However, Google and other testers have also noted current limitations. In a company blog post, Google acknowledged that rendering very small text or fine details can still be imperfect, and that users should verify the factual accuracy of data-driven visuals such as diagrams. Furthermore, complex edits can sometimes produce unnatural results, and character consistency, while improved, may still occasionally slip. AI researcher Lisan al Gaib demonstrated a key limitation in logical reasoning by showing the model failing to correctly generate or solve a Sudoku puzzle, underscoring that it is a visual tool, not an artificial general intelligence.
Sources: Google Blog, Google Blog, 9to5Google, Simon Willison, VentureBeat