openclaw-media-gen
Generate images & videos with AIsa. Gemini 3 Pro Image (image) + Qwen Wan 2.6 (video) via one API key.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/aisadocs/openclaw-aisa-llm-image-video
What This Skill Does
The OpenClaw Media Gen skill is a single gateway to two generation models served through the AIsa API: Google's Gemini 3 Pro for images and Alibaba's Qwen Wan 2.6 (Tongyi Wanxiang) for video. One AIsa API key covers both models, so the same setup can turn a text prompt into a still image or into a short video clip.
Installation
To integrate this skill into your OpenClaw environment, execute the following command in your terminal:
clawhub install openclaw/skills/skills/aisadocs/openclaw-aisa-llm-image-video
Set AISA_API_KEY as an environment variable before running any generation task so that the agent is authorized to call the AIsa endpoints.
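A script that wraps this skill can fail fast when the key is missing instead of discovering the problem halfway through a generation task. The following is a minimal sketch in Python; only the variable name AISA_API_KEY comes from this page, while the function name and the error class are hypothetical.

```python
import os


class MissingApiKeyError(RuntimeError):
    """Raised when AISA_API_KEY is absent or blank (hypothetical helper class)."""


def load_aisa_api_key(env=None):
    """Return the AIsa API key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("AISA_API_KEY", "").strip()
    if not key:
        raise MissingApiKeyError(
            "AISA_API_KEY is not set; export it before running generation tasks."
        )
    return key
```

Calling load_aisa_api_key() once at startup keeps the authorization check in one place.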
Use Cases
This skill is designed for creators, developers, and researchers who need rapid visual prototyping. Use cases include:
- Marketing & Content Creation: Generate consistent social media assets and short-form video clips from simple text prompts.
- Concept Art: Quickly iterate on character designs or environmental textures using the Gemini image model.
- Cinematic Storyboarding: Turn static concepts into short video sequences using the Wan 2.6 video model, allowing for deeper exploration of movement and composition.
- Automated Workflow Integration: Embed media generation directly into your CLI-based development or automation pipelines.
Example Prompts
- "Generate a hyper-realistic photograph of a futuristic coffee shop in a neon-lit Tokyo street, cinematic lighting, 8k resolution."
- "Create a 5-second video from this image URL [https://url.com/image.jpg]: camera movement, slow zoom in on the subject, dramatic atmospheric fog, movie grade."
- "Draft a visual concept for a steampunk-style flying machine, then synthesize a short 5-second video showcasing it soaring through clouds."
Tips & Limitations
- Asynchronous Processing: Video generation is an asynchronous task. Always save the task_id returned from the initial call and use the video-status command to check progress before expecting a result.
- Image-to-Video Requirements: The video model performs best with high-quality source images. Ensure the provided img_url is publicly accessible and clear.
- Cost Management: Be mindful of your API quota. Both Gemini 3 Pro and Qwen Wan 2.6 are premium models; frequent generation will consume your AIsa credits rapidly.
- Error Handling: Always ensure network connectivity to the AIsa API endpoints (https://api.aisa.one) to avoid task initiation failures.
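The asynchronous pattern described in the first tip can be sketched as a polling loop. This page does not document the status response format, so the sketch below makes two loudly labeled assumptions: the caller supplies a hypothetical fetch_status callable (for example, a wrapper around the video-status command), and it returns a dict whose "status" value is one of the assumed names used in the code.

```python
import time


def wait_for_video(task_id, fetch_status, interval=5.0, timeout=600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(task_id) until the task finishes or `timeout` elapses.

    `fetch_status` is a hypothetical callable supplied by the caller. It is
    assumed to return a dict with a "status" key; the status names checked
    below are assumptions, not values documented by AIsa.
    """
    deadline = clock() + timeout
    while True:
        result = fetch_status(task_id)
        status = str(result.get("status", "")).lower()
        if status in ("succeeded", "completed"):
            return result
        if status in ("failed", "canceled"):
            raise RuntimeError(f"video task {task_id} ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(
                f"video task {task_id} still {status!r} after {timeout}s"
            )
        # Wait before the next check so the quota is not spent on status calls.
        sleep(interval)
```

Injecting sleep and clock keeps the loop testable without real waiting; in normal use the defaults are enough.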
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-aisadocs-openclaw-aisa-llm-image-video": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: external-api, file-write
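If clawhub.json already lists other plugins, pasting the snippet by hand risks overwriting them. As an alternative, the entry can be merged programmatically. The sketch below is in Python and assumes the file is plain JSON at a path you supply (its location is not stated on this page); the function name is hypothetical, and the plugin identifier and the two flags are taken from the snippet above.

```python
import json
from pathlib import Path

# Plugin identifier copied from the clawhub.json snippet on this page.
PLUGIN_ID = "official-aisadocs-openclaw-aisa-llm-image-video"


def enable_plugin(config_path, plugin_id=PLUGIN_ID):
    """Add or update the plugin entry in a clawhub.json file, keeping other keys.

    `config_path` is an assumption: pass wherever your clawhub.json lives.
    """
    path = Path(config_path)
    # Start from the existing config if present, otherwise from an empty one.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    plugins = config.setdefault("plugins", {})
    entry = plugins.setdefault(plugin_id, {})
    entry.update({"enabled": True, "auto_update": True})
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Entries for other plugins are left untouched, and running the function twice produces the same file.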
Related Skills
aisa-tavily
AI-optimized web search via AIsa's Tavily API proxy. Returns concise, relevant results for AI agents through AIsa's unified API gateway.
openclaw-search
Intelligent search for agents. Multi-source retrieval with confidence scoring - web, academic, and Tavily in one unified API.
Twitter Command Center (Search + Post)
Searches and reads X (Twitter): profiles, timelines, mentions, followers, tweet search, trends, lists, communities, and Spaces. Publishes posts after the user completes OAuth in the browser. Use when the user asks about Twitter/X data, social listening, or posting without sharing account passwords.
stock-rumors
Scan M&A, insider, analyst, social, and regulatory rumor signals through AIsa. Use when the user asks about early market signals, rumors, insider activity, analyst changes, or takeover chatter.