Official | Verified | media | Safety 4/5

siliconflow-media

SiliconFlow multimodal service supporting image generation (FLUX/Qwen), video generation (Wan), TTS speech synthesis, and ASR speech recognition. Paid for with vouchers.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/axdlee/siliconflow-media

What This Skill Does

The SiliconFlow Media skill is a multimodal AI interface for running generative media workflows directly within OpenClaw. It acts as a single bridge to the SiliconFlow API and provides tools for image generation (FLUX and Qwen models), video synthesis (the Wan-AI suite), text-to-speech (TTS) conversion, and automatic speech recognition (ASR). Because requests are paid for with pre-allocated vouchers, users can add AI-generated media to their automation pipelines without handling payment manually.

Installation

To add this skill to your OpenClaw environment, run the following command in your terminal:

clawhub install openclaw/skills/skills/axdlee/siliconflow-media

Before running any of the skill's scripts, set SILICONFLOW_API_KEY in your environment variables so that your requests can be authenticated.
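A missing key is easier to diagnose before a request is sent than after it fails. The following is a minimal sketch of such a pre-flight check, not part of the skill itself: the helper name require_api_key is hypothetical, and the only detail taken from this page is the variable name SILICONFLOW_API_KEY.

```python
import os


def require_api_key(env=None):
    """Return the SiliconFlow API key, or raise if it is missing or blank.

    `env` defaults to the process environment; passing a plain dict
    makes the check easy to exercise without touching real variables.
    (Hypothetical helper, not something the skill ships.)
    """
    env = os.environ if env is None else env
    key = env.get("SILICONFLOW_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SILICONFLOW_API_KEY is not set; export it before running the skill's scripts"
        )
    return key
```

Calling require_api_key() at the start of a wrapper script stops the run early with a readable message instead of an authentication error from the API.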

Use Cases

This skill suits content creators and developers who want to produce media programmatically. You can use it to generate custom marketing assets, automate video narration with different voice synthesis models, transcribe meeting recordings or audio clips with speech-to-text models, or run batch image generation tasks. It is particularly useful in multi-step automation sequences, for example converting text inputs into audio files and then merging them into generated video clips.

Example Prompts

  1. "Generate a high-quality image of a futuristic city using the FLUX model and save it as city_concept.png."
  2. "Convert this text file into an MP3 audio clip using the Fish Speech model."
  3. "Transcribe the audio file recording.mp3 using the SenseVoice model and save the output."

Tips & Limitations

  • Performance: Image generation is fast (5-10 seconds), but video generation is resource-intensive and can take up to 5 minutes per request. Allow for this delay when planning larger media tasks.
  • File Handling: Every script outputs a 'MEDIA:' log line, which the OpenClaw agent uses to attach the resulting file to your chat interface automatically.
  • Cost: All operations are charged against your existing voucher balance (currently 3000+), so no manual payment step is needed.
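If you run the scripts outside the agent and want to collect the generated files yourself, the 'MEDIA:' log line mentioned above can be picked out of the script output. The sketch below assumes the line has the form 'MEDIA:' followed by a file path; this page does not document the exact format, so treat both that layout and the helper name extract_media_paths as assumptions.

```python
def extract_media_paths(output: str) -> list[str]:
    """Collect the text after 'MEDIA:' from every line that starts with it.

    Assumption: each such line is 'MEDIA:' followed by a file path.
    Lines with nothing after the prefix are skipped.
    """
    prefix = "MEDIA:"
    paths = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            path = line[len(prefix):].strip()
            if path:
                paths.append(path)
    return paths


# Hypothetical script output, for illustration only.
sample_output = "Generating image...\nMEDIA: /tmp/city_concept.png\nDone\n"
print(extract_media_paths(sample_output))
```

Keeping the parsing in one small function means only the prefix handling needs to change if the real line format turns out to differ.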

Metadata

Author: @axdlee
Stars: 4473
Views: 0
Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-axdlee-siliconflow-media": {
      "enabled": true,
      "auto_update": true
    }
  }
}
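If clawhub.json already lists other plugins, pasting the snippet over the file would discard them. The following sketch merges the entry into an existing configuration instead. It assumes clawhub.json is plain JSON with a top-level "plugins" object as shown above; the function name enable_plugin and the "some-other-plugin" entry are hypothetical.

```python
import json

# Plugin key and settings copied from the configuration snippet above.
PLUGIN_ID = "official-axdlee-siliconflow-media"


def enable_plugin(config: dict) -> dict:
    """Return a copy of `config` with the siliconflow-media entry added.

    Existing plugin entries are preserved; the input dict is not modified.
    """
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


# Hypothetical existing configuration, for illustration only.
existing = json.loads('{"plugins": {"some-other-plugin": {"enabled": true}}}')
print(json.dumps(enable_plugin(existing), indent=2))
```

To apply this to a real file, load it with json.load, pass the result through enable_plugin, and write it back with json.dump.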

Tags (AI)

#generative-ai #multimedia #tts #computer-vision
Safety Score: 4/5

Flags: file-write, file-read, external-api