Official | Verified | media | Safety 4/5

llm-video-generator

Generate videos from text descriptions using the ZhipuAI CogVideoX-3 model. Supports text-to-video, image-to-video, and first/last-frame-to-video generation. Videos longer than 5 seconds are handled automatically by chaining multiple generation calls, each continuing from the last frame of the previous one. Use when the user asks to create or generate a video from text, make a video, text-to-video, 文生视频 (text-to-video), 生成视频 (generate a video), 做个视频 (make a video), or makes any request that involves converting text or images into a video. Configurable options include video content, style, resolution (up to 4K), frame rate (30 or 60 fps), audio, and duration.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/baokui/llm-video-generator

What This Skill Does

The llm-video-generator is an OpenClaw AI agent skill that turns text or image inputs into video using the ZhipuAI CogVideoX-3 model. It supports text-to-video, image-to-video, and first/last-frame-to-video generation. For long-form requests, the skill splits the work into 5-second segments, starts each segment from the last frame of the previous one to keep the visuals stable, and concatenates the segments into a single video file.
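
The segmenting step described above can be sketched in a few lines. This is a minimal illustration, not the skill's actual code: the function name `plan_segments`, the dictionary keys, and the choice to make the final segment shorter when the duration is not a multiple of 5 are all assumptions.

```python
import math

SEGMENT_SECONDS = 5  # segment length stated in the skill description


def plan_segments(total_seconds: float) -> list[dict]:
    """Split a requested duration into consecutive segments of at most 5 s.

    Every segment after the first is flagged to start from the last frame
    of the previous one (hypothetical representation of continuation).
    """
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    count = math.ceil(total_seconds / SEGMENT_SECONDS)
    segments = []
    for index in range(count):
        start = index * SEGMENT_SECONDS
        end = min(start + SEGMENT_SECONDS, total_seconds)
        segments.append({
            "index": index,
            "start": start,
            "end": end,
            "continue_from_previous": index > 0,
        })
    return segments
```

Under these assumptions, a 15-second request yields three segments, and only the second and third are marked as continuations.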

Installation

To add this skill to your OpenClaw environment, run the following command in your terminal:

clawhub install openclaw/skills/skills/baokui/llm-video-generator

Make sure your environment has the necessary permissions: the skill runs its internal scripts, including the video concatenation and frame extraction tools, with /opt/anaconda3/bin/python3.
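
Because the skill depends on a fixed interpreter path, it may be worth confirming that the path exists and is executable before running the skill. The check below is a small sketch using only the Python standard library; the helper name `interpreter_available` is hypothetical and is not part of the skill.

```python
import os

# Interpreter path named in the installation notes.
SKILL_PYTHON = "/opt/anaconda3/bin/python3"


def interpreter_available(path: str = SKILL_PYTHON) -> bool:
    """Return True if `path` is an existing file the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    status = "found" if interpreter_available() else "missing or not executable"
    print(f"{SKILL_PYTHON}: {status}")
```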

Use Cases

  • Marketing & Social Media: Quickly generate short promotional clips or B-roll for social media content from simple text scripts.
  • Concept Visualization: Turn static images or storyboards into animated sequences to pitch design or film ideas.
  • Education & Training: Create visual aids for complex topics where a static image is insufficient to show the progression of a process.
  • Artistic Exploration: Generate surreal or high-fidelity cinematic clips based on artistic prompts or style descriptions.

Example Prompts

  1. "Make a 15-second cinematic video of a futuristic cyberpunk city at night with rain falling, 1080p, 30fps."
  2. "Generate a video showing a flower blooming in a desert, use this image [path/to/image.jpg] as the starting frame."
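  3. "做个视频:一只可爱的小猫在草地上追逐蝴蝶,风格要温馨,时长10秒。" (English: "Make a video: a cute kitten chasing butterflies on the grass, in a warm and cozy style, 10 seconds long.")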

Tips & Limitations

  • Patience is Key: High-definition video generation is resource-intensive. Always review the estimated time provided by the agent before starting.
  • Consistency: When generating segments for long videos, ensure your prompts for subsequent segments explicitly describe the state of the characters/objects from the end of the previous segment to maintain continuity.
  • Resolution: While 4K is supported, it significantly increases processing time. Use 1080p for draft versions to iterate faster.
  • Limitations: The model generates 5-second chunks; avoid requesting single-shot videos longer than 30 seconds to maintain optimal coherence.
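
The consistency tip can be illustrated with a small helper that carries the end state of each segment into the next segment's prompt. This is only a sketch: `build_segment_prompts` and the "Continuing from:" wording are hypothetical, and nothing in the listing says the skill or the model recognizes a particular prompt format.

```python
def build_segment_prompts(base_prompt: str, end_states: list[str]) -> list[str]:
    """Return one prompt per segment.

    end_states[i] describes the scene at the end of segment i; it is
    prepended as context to the prompt for segment i + 1.
    """
    prompts = [base_prompt]
    for state in end_states:
        prompts.append(f"Continuing from: {state}. {base_prompt}")
    return prompts


if __name__ == "__main__":
    for prompt in build_segment_prompts(
        "A kitten chases a butterfly across a meadow, warm style",
        ["the kitten is mid-leap with the butterfly just above its paws"],
    ):
        print(prompt)
```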

Metadata

Author: @baokui
Stars: 4473
Views: 1
Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-baokui-llm-video-generator": {
      "enabled": true,
      "auto_update": true
    }
  }
}
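
If clawhub.json already contains other plugins, pasting the fragment by hand risks overwriting them. The following sketch merges the entry into an existing file instead. It assumes the file is plain JSON with a top-level "plugins" object, as the fragment above suggests; the helper name `enable_plugin` is hypothetical, and the file location is left to the caller.

```python
import json
from pathlib import Path

PLUGIN_KEY = "official-baokui-llm-video-generator"


def enable_plugin(config_path: str) -> dict:
    """Add the plugin entry to the JSON config at config_path, keeping other keys."""
    path = Path(config_path)
    # Start from the existing config if there is one, otherwise from an empty object.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_KEY] = {"enabled": True, "auto_update": True}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `enable_plugin("clawhub.json")` in the directory that holds the config adds or refreshes this one entry and leaves any other plugin entries as they were.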

Tags (AI)

#video #generative-ai #creative #multimedia

Safety Score: 4/5

Flags: external-api, file-read, file-write, code-execution