What This Skill Does

The gemini-image-proxy skill provides a streamlined interface to image generation models such as Gemini 3 Pro Image. It is built on the OpenAI Python SDK, so it avoids the configuration usually associated with proprietary Google AI endpoints. It supports two functions: generating new images from text prompts and modifying existing local images from natural language instructions. Because it relies on the OpenAI SDK, the integration is portable and lightweight, and it runs in resource-constrained environments such as free-tier Fly.io instances or restricted containers.

Installation

To get started, make sure Python 3.10 or newer is installed. Install the required dependency with pip: python3 -m pip install openai. Then export your API key with export GOOGLE_PROXY_API_KEY="your_api_key" and your endpoint with export GOOGLE_PROXY_BASE_URL="https://example.com/v1". Both variables must be set before the script runs so that it can authenticate and route requests to the model provider.
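
The two environment variables can be checked before any request is sent. The following is a minimal sketch, not the skill's actual script: the helper names load_proxy_config and make_client are hypothetical, and it assumes the proxy accepts the OpenAI SDK's standard api_key and base_url client arguments.

```python
import os


def load_proxy_config(env=None):
    """Return (api_key, base_url) from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    required = ("GOOGLE_PROXY_API_KEY", "GOOGLE_PROXY_BASE_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["GOOGLE_PROXY_API_KEY"], env["GOOGLE_PROXY_BASE_URL"]


def make_client(env=None):
    """Build an OpenAI SDK client pointed at the proxy (hypothetical helper)."""
    # Imported lazily so the config check works even before `openai` is installed.
    from openai import OpenAI

    api_key, base_url = load_proxy_config(env)
    return OpenAI(api_key=api_key, base_url=base_url)
```

Failing early with the names of the missing variables tends to be easier to debug than an authentication error returned by the endpoint.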

Use Cases

This skill is ideal for rapid prototyping, content creation, and automated image post-processing. Use it to generate photorealistic assets for web design, create custom icons, or iterate on existing visual content. Its ability to perform edits (such as changing the lighting, art style, or background of an image) makes it a powerful tool for developers integrating AI-driven visual workflows into their own applications without needing to manage heavy dependencies like Pillow or specialized Google-specific client libraries.

Example Prompts

"Generate a high-resolution, photorealistic image of a futuristic cyberpunk city skyline at night with neon blue and purple lights."
"Edit this portrait.png file to change the background from a simple office wall to a lush tropical forest while maintaining the subject's lighting."
"Create a minimalist vector-style icon of a cup of coffee with a steam swirl, white background, suitable for a mobile app user interface."
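
A generation prompt like the ones above could be sent through the OpenAI SDK roughly as follows. This is a sketch under assumptions rather than the skill's own code: it assumes the proxy exposes the SDK's Images endpoint (client.images.generate) and returns base64 data, and the model ID "gemini-3-pro-image" and the function name generate_image are placeholders.

```python
import base64


def generate_image(client, prompt, model="gemini-3-pro-image", out_path="output.png"):
    """Request one image for `prompt` and write the decoded bytes to `out_path`."""
    # `client` is expected to be an openai.OpenAI instance configured with the
    # proxy base URL. The default model ID is a placeholder, not a confirmed name.
    response = client.images.generate(model=model, prompt=prompt)
    b64_data = response.data[0].b64_json
    if not b64_data:
        # Some endpoints return a URL instead of inline data; not handled here.
        raise RuntimeError("Response contained no base64 image data")
    with open(out_path, "wb") as fh:
        fh.write(base64.b64decode(b64_data))
    return out_path
```

Passing the client in as an argument keeps the function independent of how credentials are loaded and makes it easy to exercise with a stub.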

Tips & Limitations

To achieve the best results, use descriptive adjectives in your prompts. While the script supports PNG, JPG, JPEG, GIF, and WEBP, larger file sizes may increase processing time. Ensure your environment variables are correctly set before execution to avoid authentication errors. Note that the model defaults to Gemini 3 Pro Image, but you can modify the source script to leverage other supported models like Imagen 4.0 or Gemini 2.5 Flash if your specific use case requires different latency or quality characteristics.
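
The supported formats can be checked with the standard library alone, before an edit request is made. The sketch below is illustrative only: the function name describe_image is hypothetical, and how the actual script validates or encodes input files is not stated here.

```python
from pathlib import Path

# Extensions listed as supported, mapped to their standard MIME types.
MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def describe_image(path):
    """Return (mime_type, size_in_bytes) for a supported image file."""
    file_path = Path(path)
    suffix = file_path.suffix.lower()
    if suffix not in MIME_TYPES:
        raise ValueError(f"Unsupported image format: {suffix or '(none)'}")
    # The size is reported so callers can warn about large, slow uploads.
    return MIME_TYPES[suffix], file_path.stat().st_size
```

The check is by file extension only; it does not inspect the file contents.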
