wan-t2i
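Alibaba Cloud DashScope Wan2.6 text-to-image tool. Generates images with the Wan2.6-t2i model on Alibaba Cloud's Bailian platform. Triggered when the user needs AI image generation, text-to-image, or an image generated from text. Requires the DASHSCOPE_API_KEY environment variable (already configured in the system).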
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/baokui/wan-text2image
What This Skill Does
The wan-t2i skill is a powerful integration for OpenClaw that enables high-quality text-to-image generation powered by Alibaba Cloud's Wan2.6-t2i model. This model, hosted on the DashScope platform, is designed to interpret natural language prompts and synthesize detailed, aesthetically pleasing images. Whether you are looking for realistic landscape photography, character illustrations, or conceptual art, this tool translates your creative vision into visual assets directly within your workspace.
Installation
To integrate this tool into your environment, use the OpenClaw repository manager. Execute the following command in your terminal:
clawhub install openclaw/skills/skills/baokui/wan-text2image
Ensure that your DASHSCOPE_API_KEY is correctly set as an environment variable, as the tool relies on this key to authenticate with Alibaba Cloud's backend services.
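A quick way to confirm the key is visible before invoking the skill is a small check like the one below. This is only a sketch: the helper name get_dashscope_api_key is hypothetical and not part of the skill; the only detail taken from this page is the DASHSCOPE_API_KEY variable name.

```python
import os


def get_dashscope_api_key(env=None):
    """Return the DashScope API key, or raise a clear error if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    This helper is illustrative only and is not shipped with the skill.
    """
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set; export it before using wan-t2i."
        )
    return key
```

Failing early with an explicit message tends to be easier to debug than an authentication error returned by the remote service.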
Use Cases
This skill is ideal for professionals and hobbyists who need rapid visual prototyping. Use it when you need to generate images for slide decks, social media content, creative writing visualization, or UI/UX design mood boards. Because it supports customizable aspect ratios, it is equally effective for creating standard square images, portrait mobile wallpapers, or landscape hero images for web design.
Example Prompts
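- "请帮我生成一张图片,画面是一只在赛博朋克城市街道中奔跑的机械猫,色彩绚丽。" (Please generate an image of a mechanical cat running through a cyberpunk city street, with vivid colors.)
- "生成一张风景图片:静谧的深山湖泊,倒映着雪山,分辨率设置为720*1280。" (Generate a landscape image: a tranquil mountain lake reflecting snowy peaks, with the resolution set to 720*1280.)
- "画一个穿着简约白色衬衫的女性在书店看书,背景模糊,不要出现多余的人,使用默认尺寸。" (Draw a woman in a simple white shirt reading in a bookstore, with a blurred background and no extra people, using the default size.)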
Tips & Limitations
To get the best results, use descriptive Chinese prompts, as the Wan2.6 model is optimized for Chinese input. When specifying a size, use one of the required formats: 1280*1280, 720*1280, or 1280*720. The output is generated by a remote API, so a stable internet connection is required. The tool automatically handles prompt extension, so keep your input focused on the key visual elements. Because the tool sends prompts to an external service, do not include sensitive personal information in them.
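The size rule above can be checked locally before a request is sent. The following is a minimal sketch, assuming the three sizes listed on this page are the only accepted values; the helper name normalize_size and its tolerance for "x" or "×" as a separator are illustrative choices, not documented behavior of the skill.

```python
import re

# Sizes listed in the tips above, written as WIDTH*HEIGHT.
ALLOWED_SIZES = ("1280*1280", "720*1280", "1280*720")


def normalize_size(size: str) -> str:
    """Convert inputs such as '720x1280' or ' 1280 * 720 ' to the WIDTH*HEIGHT form.

    Raises ValueError when the string cannot be parsed or the size is not allowed.
    """
    match = re.fullmatch(r"\s*(\d+)\s*[*xX×]\s*(\d+)\s*", size)
    if match is None:
        raise ValueError(f"unrecognized size {size!r}; expected WIDTH*HEIGHT")
    canonical = f"{int(match.group(1))}*{int(match.group(2))}"
    if canonical not in ALLOWED_SIZES:
        allowed = ", ".join(ALLOWED_SIZES)
        raise ValueError(f"unsupported size {canonical}; use one of: {allowed}")
    return canonical
```

For example, normalize_size("720x1280") returns "720*1280", while a string with the separator missing, such as "7201280", is rejected instead of being passed along.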
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-baokui-wan-text2image": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags: AI
Flags: network-access, external-api, code-execution
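If clawhub.json already contains other plugins, the fragment shown earlier needs to be merged rather than pasted over the file. The sketch below shows one way to do that, assuming clawhub.json is a plain JSON file with a top-level "plugins" object as in that fragment; the function name enable_plugin is hypothetical and is not a clawhub command.

```python
import json
from pathlib import Path

# Plugin key taken from the clawhub.json fragment on this page.
PLUGIN_ID = "official-baokui-wan-text2image"


def enable_plugin(config_path):
    """Add or update the wan-text2image entry while keeping other plugins intact."""
    path = Path(config_path)
    # Start from the existing file if there is one, otherwise from an empty config.
    if path.exists():
        config = json.loads(path.read_text(encoding="utf-8"))
    else:
        config = {}
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return config
```

Calling enable_plugin("clawhub.json") rewrites the file in place, so keeping a backup copy first is a reasonable precaution.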
Related Skills
pdf-process-mineru
PDF document parsing tool based on local MinerU, supports converting PDF to Markdown, JSON, and other machine-readable formats.
Pdf Ocr Layout
Skill by baokui
llm-video-generator
Generate videos from text descriptions using ZhipuAI CogVideoX-3 model. Supports text-to-video, image-to-video, and first/last frame-to-video generation. Automatically handles long videos (over 5s) by chaining multiple generation calls with last-frame continuation. Use when the user asks to create/generate a video from text, make a video, text-to-video, 文生视频, 生成视频, 做个视频, or any request involving converting text/images into a video. Supports configuring video content, style, resolution (up to 4K), frame rate (30/60fps), audio, and duration.
glm-v-model
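ZhipuAI GLM-4V/4.6V vision model skill. Used for image and video understanding, multimodal dialogue, chart analysis, and similar tasks. Use this skill when the user mentions: image understanding, image recognition, vision model, GLM-4V, GLM-4.6V, multimodal analysis, image captioning, chart analysis, or video understanding.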