Official Verified

nano-banana-pro

Generate and edit images using Google's Nano Banana Pro (Gemini 3 Pro Image) API. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when the user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports both text-to-image generation and image-to-image editing with configurable resolution (1K by default, or 2K or 4K for higher resolution). DO NOT read the image file first; use this skill directly with the --input-image parameter.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/pauldelavallaz/morfeo-nano-banana-pro


Nano Banana Pro Image Generation & Editing

Generate new images or edit existing ones using Google's Nano Banana Pro API (Gemini 3 Pro Image).

API Technical Specification

Endpoints & Authentication

Google AI Studio (Public Preview):

POST https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent?key=${API_KEY}

Vertex AI (Enterprise):

POST https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/google/models/gemini-3-pro-image-preview:predict
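As a rough illustration of the Google AI Studio endpoint above, the sketch below builds the URL and a JSON body for a text-to-image call without sending anything. The body field names (`contents`, `parts`, `generationConfig`, `responseModalities`, `imageConfig`) are assumptions based on the general shape of Gemini `generateContent` requests; this page does not confirm them, and `build_generate_request` is a hypothetical helper name.

```python
import json

API_BASE = "https://generativelanguage.googleapis.com/v1beta/models"
MODEL_ID = "gemini-3-pro-image-preview"


def build_generate_request(prompt, api_key, aspect_ratio="1:1", image_size="1K"):
    """Return (url, json_body) for a text-to-image call; nothing is sent."""
    url = f"{API_BASE}/{MODEL_ID}:generateContent?key={api_key}"
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        # Assumed field names; verify against the current API reference.
        "generationConfig": {
            "responseModalities": ["TEXT", "IMAGE"],
            "imageConfig": {"aspectRatio": aspect_ratio, "imageSize": image_size},
        },
    }
    return url, json.dumps(body)
```

The returned pair can be passed to any HTTP client as a POST with a `Content-Type: application/json` header.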

Model IDs

API: gemini-3-pro-image-preview
Internal SDK: nanobanana-pro-001

Parameters

Parameter	Values	Description
`aspect_ratio`	`1:1`, `4:3`, `3:4`, `16:9`, `9:16`	Output aspect ratio
`output_mime_type`	`image/png`, `image/jpeg`	Output format
`reference_images`	Array (max 14)	Reference images for consistency
`reference_type`	`CHARACTER`, `STYLE`, `SUBJECT`	How to use reference
`person_generation`	`ALLOW_ADULT`, `DONT_ALLOW`, `FILTER_SENSITIVE`	Person generation policy
`image_size`	`1K`, `2K`, `4K`	Output resolution
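The allowed values in the table can be checked before a request is made. The following is a minimal sketch that encodes only what the table lists; `validate_params` is a hypothetical helper, not part of the skill or the API.

```python
# Allowed values copied from the parameter table above.
ALLOWED = {
    "aspect_ratio": {"1:1", "4:3", "3:4", "16:9", "9:16"},
    "output_mime_type": {"image/png", "image/jpeg"},
    "reference_type": {"CHARACTER", "STYLE", "SUBJECT"},
    "person_generation": {"ALLOW_ADULT", "DONT_ALLOW", "FILTER_SENSITIVE"},
    "image_size": {"1K", "2K", "4K"},
}
MAX_REFERENCE_IMAGES = 14


def validate_params(params):
    """Return a list of problems; an empty list means the params look valid."""
    problems = []
    for name, value in params.items():
        if name == "reference_images":
            if not isinstance(value, (list, tuple)):
                problems.append("reference_images must be an array")
            elif len(value) > MAX_REFERENCE_IMAGES:
                problems.append(
                    f"reference_images has {len(value)} entries (max {MAX_REFERENCE_IMAGES})"
                )
        elif name in ALLOWED:
            if not isinstance(value, str) or value not in ALLOWED[name]:
                problems.append(f"{name}={value!r} is not one of {sorted(ALLOWED[name])}")
        else:
            problems.append(f"unknown parameter: {name}")
    return problems
```

Returning a list rather than raising on the first error lets a caller report every invalid parameter at once.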

Reference Types

STYLE: Transfer visual style, color palette, mood from reference
CHARACTER: Maintain facial features, traits consistency across images
SUBJECT: Keep the subject/product consistent (use for product photography!)

Advanced Capabilities

Text Rendering: Native text rendering without spelling errors
In-context Editing: Send existing image + modification prompt (automatic in-painting)
High Resolution: Native upscale to 4K via upscale: true
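In-context editing means sending the existing image together with the modification prompt. A minimal sketch of how the request parts might be assembled is shown below, assuming the image travels as base64 inline data; the `inlineData`, `mimeType`, and `data` field names are assumptions not confirmed by this page, and `build_edit_parts` is a hypothetical helper.

```python
import base64
import mimetypes
from pathlib import Path


def build_edit_parts(prompt, image_path):
    """Return a `parts` list: the edit prompt followed by the source image."""
    path = Path(image_path)
    # Fall back to PNG when the extension is not recognised.
    mime_type = mimetypes.guess_type(path.name)[0] or "image/png"
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return [
        {"text": prompt},
        # Assumed field names for inline media; verify against the API reference.
        {"inlineData": {"mimeType": mime_type, "data": encoded}},
    ]
```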

Usage

Run the script using its absolute path (do NOT cd to the skill directory first):

Generate new image:

uv run ~/.clawdbot/skills/nano-banana-pro/scripts/generate_image.py \
  --prompt "your image description" \
  --filename "output-name.png" \
  [--resolution 1K|2K|4K] \
  [--api-key KEY]

Edit existing image:

uv run ~/.clawdbot/skills/nano-banana-pro/scripts/generate_image.py \
  --prompt "editing instructions" \
  --filename "output-name.png" \
  --input-image "path/to/input.png" \
  [--resolution 1K|2K|4K]

With reference image (product/style/character consistency):

uv run ~/.clawdbot/skills/nano-banana-pro/scripts/generate_image.py \
  --prompt "your description" \
  --filename "output-name.png" \
  --reference-image "path/to/reference.jpg" \
  --reference-type SUBJECT|STYLE|CHARACTER \
  [--resolution 1K|2K|4K]

Important: Always run from the user's current working directory so images are saved where the user is working, not in the skill directory.
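The three invocations above differ only in their optional flags, so they can be assembled programmatically. The sketch below uses only the flags shown in this section; `build_command` and `run_in_cwd` are hypothetical helper names, and the script path is the one given above.

```python
import os
import subprocess

SCRIPT = os.path.expanduser(
    "~/.clawdbot/skills/nano-banana-pro/scripts/generate_image.py"
)


def build_command(prompt, filename, input_image=None, reference_image=None,
                  reference_type=None, resolution=None):
    """Build the argv list for generate_image.py from the documented flags."""
    cmd = ["uv", "run", SCRIPT, "--prompt", prompt, "--filename", filename]
    if input_image:
        cmd += ["--input-image", input_image]
    if reference_image:
        cmd += ["--reference-image", reference_image]
        if reference_type:
            cmd += ["--reference-type", reference_type]
    if resolution:
        cmd += ["--resolution", resolution]
    return cmd


def run_in_cwd(cmd):
    """Run from the caller's working directory so the image is saved there."""
    return subprocess.run(cmd, cwd=os.getcwd(), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains spaces or quotes.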

Resolution Options

1K (default) - ~1024px resolution
2K - ~2048px resolution (recommended for most uses)
4K - ~4096px resolution (high quality)
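One way to choose among these tiers is to pick the smallest one whose approximate size covers the size you need. The helper below is an illustrative heuristic, not part of the skill; `pick_resolution` is a hypothetical name and the pixel values are the approximations listed above.

```python
# Approximate pixel size for each tier, per the list above.
RESOLUTION_PX = {"1K": 1024, "2K": 2048, "4K": 4096}


def pick_resolution(target_px):
    """Return the smallest tier covering target_px, or "4K" if none does."""
    for tier, px in sorted(RESOLUTION_PX.items(), key=lambda item: item[1]):
        if px >= target_px:
            return tier
    return "4K"
```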


Metadata

Author: @pauldelavallaz

Stars: 1217

Updated: 2026-02-20


Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-pauldelavallaz-morfeo-nano-banana-pro": {
      "enabled": true,
      "auto_update": true
    }
  }
}
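If clawhub.json already contains other plugins, pasting the snippet by hand risks overwriting them. The sketch below merges the entry instead, assuming the file is plain JSON with a top-level "plugins" object as in the snippet above; `enable_plugin` is a hypothetical helper, not a ClawHub command.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-pauldelavallaz-morfeo-nano-banana-pro"


def enable_plugin(config_path):
    """Add the plugin entry to clawhub.json, keeping any existing settings."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```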

Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.

Related Skills

morpheus-fashion-design

Generate professional advertising images with AI models holding/wearing products.

✅ USE WHEN:
- Need a person/model in the image WITH a product
- Creating fashion ads, product campaigns, commercial photography
- Want consistent model face across multiple shots
- Need professional lighting/camera simulation
- Input: product image + model reference (or catalog)

❌ DON'T USE WHEN:
- Just editing/modifying an existing image → use nano-banana-pro
- Product-only shot without a person → use nano-banana-pro
- Already have the hero image, need variations → use multishot-ugc
- Need video, not image → use veed-ugc after generating image
- URL-based product fetch with brand profile → use ad-ready instead

OUTPUT: Single high-quality PNG image (2K-4K resolution)

pauldelavallaz 1217

veed-ugc

Generate UGC-style promotional videos with AI lip-sync. Takes an image (person with product from Morpheus/Ad-Ready) and a script (pure dialogue), creates a video of the person speaking. Uses ElevenLabs for voice synthesis.

pauldelavallaz 1217

ugc-manual

Generate lip-sync video from image + user's own audio recording.

✅ USE WHEN:
- User provides their OWN audio file (voice recording)
- Want to sync image to specific audio/voice
- User recorded the script themselves
- Need exact audio timing preserved

❌ DON'T USE WHEN:
- User provides text script (not audio) → use veed-ugc
- Need AI to generate the voice → use veed-ugc
- Don't have audio file yet → use veed-ugc with script

INPUT: Image + audio file (user's recording)

OUTPUT: MP4 video with lip-sync to provided audio

KEY DIFFERENCE:
veed-ugc = script → AI voice → video
ugc-manual = user audio → video (no voice generation)

pauldelavallaz 1217

sora

Generate videos using OpenAI's Sora API. Use when the user asks to generate, create, or make videos from text prompts or reference images. Supports image-to-video generation with automatic resizing.

pauldelavallaz 1217

sora

Generate videos from text prompts or reference images using OpenAI Sora.

✅ USE WHEN:
- Need AI-generated video from text description
- Want image-to-video (animate a still image)
- Creating cinematic/artistic video content
- Need motion/animation without lip-sync

❌ DON'T USE WHEN:
- Need lip-sync (person speaking) → use veed-ugc or ugc-manual
- Just need image generation → use nano-banana-pro or morpheus
- Editing existing videos → use Remotion
- Need UGC-style talking head → use veed-ugc

INPUT: Text prompt + optional reference image

OUTPUT: MP4 video (various resolutions/durations)

pauldelavallaz 1217