Official Verified

wavespeed-wan-26

Generate videos using Alibaba's Wan 2.6 model via WaveSpeed AI. Supports text-to-video and image-to-video generation with up to 15 seconds duration at 720p or 1080p. Features audio-guided generation, prompt expansion, multi-shot mode, and configurable seeds. Use when the user wants to create videos from text prompts or animate images.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/chengzeyi/wavespeed-wan-26

WaveSpeedAI Wan 2.6 Video Generation

Generate videos using Alibaba's Wan 2.6 model via the WaveSpeed AI platform. Supports both text-to-video and image-to-video generation with up to 15 seconds of video at up to 1080p resolution.

Authentication

export WAVESPEED_API_KEY="your-api-key"

Get your API key at wavespeed.ai/accesskey.
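
A minimal sketch of a fail-fast check, assuming a Node.js environment where the key is supplied through the WAVESPEED_API_KEY variable shown above. The helper name requireApiKey is hypothetical and is not part of the wavespeed SDK.

```javascript
// Hypothetical helper (not part of the wavespeed SDK): fail fast when the
// key is missing instead of waiting for an authentication error later.
// `env` defaults to process.env and can be replaced in tests.
function requireApiKey(env = process.env) {
  const key = env.WAVESPEED_API_KEY;
  if (typeof key !== "string" || key.trim() === "") {
    throw new Error("WAVESPEED_API_KEY is not set; export it before running.");
  }
  return key;
}
```

Calling requireApiKey() once at startup surfaces a missing key before any generation request is made.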

Quick Start

Text-to-Video

import wavespeed from 'wavespeed';

const output_url = (await wavespeed.run(
  "alibaba/wan-2.6/text-to-video",
  { prompt: "A golden retriever running through a field of sunflowers at sunset" }
))["outputs"][0];
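
The call above indexes directly into the result. As a sketch only, assuming the result has the { outputs: [...] } shape implied by that example, a small guard can report an empty result more clearly. The helper name firstOutputUrl is hypothetical.

```javascript
// Hypothetical helper: extract the first output URL from a run() result,
// assuming the { outputs: [...] } shape used in the Quick Start example.
function firstOutputUrl(result) {
  const outputs = result && result.outputs;
  if (!Array.isArray(outputs) || outputs.length === 0) {
    throw new Error("No outputs returned by the model");
  }
  return outputs[0];
}
```

It would wrap the call as firstOutputUrl(await wavespeed.run(...)) in place of the ["outputs"][0] lookup.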

Image-to-Video

The image parameter accepts an image URL. If you have a local file, upload it first with wavespeed.upload() to get a URL.

import wavespeed from 'wavespeed';

// Upload a local image to get a URL
const imageUrl = await wavespeed.upload("/path/to/photo.png");

const output_url = (await wavespeed.run(
  "alibaba/wan-2.6/image-to-video",
  {
    image: imageUrl,
    prompt: "The person in the photo slowly turns and smiles"
  }
))["outputs"][0];

You can also pass an existing image URL directly:

const output_url = (await wavespeed.run(
  "alibaba/wan-2.6/image-to-video",
  {
    image: "https://example.com/photo.jpg",
    prompt: "The person in the photo slowly turns and smiles"
  }
))["outputs"][0];
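
The two cases above (local file versus existing URL) can be folded into one step. The following is a sketch under assumptions: isHttpUrl and resolveImage are hypothetical helpers, and the upload function is passed in so the sketch does not depend on the SDK. In practice you might pass (path) => wavespeed.upload(path).

```javascript
// Hypothetical helper: true when the value is an absolute http(s) URL.
function isHttpUrl(value) {
  try {
    const { protocol } = new URL(value);
    return protocol === "http:" || protocol === "https:";
  } catch {
    return false; // not parseable as an absolute URL, so treat it as a local path
  }
}

// Hypothetical helper: return the value unchanged when it is already a URL,
// otherwise hand it to the injected `upload` function to obtain one.
async function resolveImage(value, upload) {
  return isHttpUrl(value) ? value : upload(value);
}
```

The resolved value can then be used as the image parameter of the image-to-video call.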

API Endpoints

Text-to-Video

Model ID: alibaba/wan-2.6/text-to-video

Generate videos from text prompts.

Parameters

Parameter	Type	Required	Default	Description
`prompt`	string	Yes	--	Text description of the video to generate
`negative_prompt`	string	No	--	Text description of what to avoid in the video
`audio`	string	No	--	Audio URL to guide generation
`size`	string	No	`1280*720`	Output size in pixels (width*height). One of: `1280*720`, `720*1280`, `1920*1080`, `1080*1920`
`duration`	integer	No	`5`	Video duration in seconds. One of: `5`, `10`, `15`
`shot_type`	string	No	`single`	Shot type. One of: `single`, `multi`
`enable_prompt_expansion`	boolean	No	`false`	Enable prompt optimizer for enhanced prompts
`seed`	integer	No	`-1`	Random seed (-1 for random). Range: -1 to 2147483647
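
The constraints in the table can be checked on the client before a request is sent. This is a minimal sketch, not part of the wavespeed SDK: validateTextToVideoInput is a hypothetical helper, and the size values are written in the width*height form used by the documented default.

```javascript
// Allowed values follow the parameter table; sizes use the width*height
// form of the documented default (1280*720).
const SIZES = ["1280*720", "720*1280", "1920*1080", "1080*1920"];
const DURATIONS = [5, 10, 15];
const SHOT_TYPES = ["single", "multi"];
const MAX_SEED = 2147483647;

// Hypothetical helper: returns a list of problems, empty when the input is valid.
function validateTextToVideoInput(input) {
  const errors = [];
  if (typeof input.prompt !== "string" || input.prompt.length === 0) {
    errors.push("prompt is required");
  }
  if (input.size !== undefined && !SIZES.includes(input.size)) {
    errors.push(`size must be one of: ${SIZES.join(", ")}`);
  }
  if (input.duration !== undefined && !DURATIONS.includes(input.duration)) {
    errors.push(`duration must be one of: ${DURATIONS.join(", ")}`);
  }
  if (input.shot_type !== undefined && !SHOT_TYPES.includes(input.shot_type)) {
    errors.push(`shot_type must be one of: ${SHOT_TYPES.join(", ")}`);
  }
  if (
    input.seed !== undefined &&
    (!Number.isInteger(input.seed) || input.seed < -1 || input.seed > MAX_SEED)
  ) {
    errors.push(`seed must be an integer from -1 to ${MAX_SEED}`);
  }
  return errors;
}
```

An empty array means the input satisfies the documented constraints; the API itself remains the final authority on what it accepts.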

Example

import wavespeed from 'wavespeed';

const output_url = (await wavespeed.run(
  "alibaba/wan-2.6/text-to-video",
  {
    prompt: "A timelapse of a city skyline transitioning from day to night, cinematic",
    negative_prompt: "blurry, low quality, distorted",
    size: "1920*1080",
    duration: 10,
    shot_type: "single",
    seed: 42
  }
))["outputs"][0];

Image-to-Video

Model ID: alibaba/wan-2.6/image-to-video

Animate a source image into a video using a text prompt.

Parameters

Read Full Documentation on GitHub

Metadata

Author: @chengzeyi

Stars: 3840

Updated: 2026-04-06

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-chengzeyi-wavespeed-wan-26": {
      "enabled": true,
      "auto_update": true
    }
  }
}
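
If clawhub.json already contains other plugins, the entry has to be merged rather than pasted over the file. As a sketch only, assuming the file is a JSON object with a top-level "plugins" map as in the snippet above, a hypothetical enablePlugin helper could do the merge on the parsed object:

```javascript
// Hypothetical helper: return a copy of a parsed clawhub.json object with the
// plugin entry enabled, leaving any other keys and plugins untouched.
function enablePlugin(config, pluginId) {
  return {
    ...config,
    plugins: {
      ...(config.plugins || {}),
      [pluginId]: { enabled: true, auto_update: true },
    },
  };
}
```

The result can be written back with JSON.stringify(updated, null, 2).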

Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.
