Official, Verified, media, Safety 4/5

qwen-audio

High-performance audio library with text-to-speech (TTS) and speech-to-text (STT).

What This Skill Does

Qwen-Audio is an audio library integrated into the OpenClaw ecosystem that provides both Text-to-Speech (TTS) and Speech-to-Text (STT) processing. Its core feature is the creation of customized, reusable voice profiles. Using the VoiceDesign model, you can synthesize audio that matches a specific emotional tone, gender, or professional requirement, which makes the skill suited to content creation, accessibility features, and personalized assistant interactions. Voice data is managed locally in structured directories, which keeps processing efficient and the data private.

Installation

To integrate this skill into your OpenClaw environment, execute the following command in your terminal:

clawhub install openclaw/skills/skills/darknoah/qwen-audio

Ensure that you have Python 3.10 or higher installed on your system. Before initializing any audio tasks, navigate to the skill root and verify that all prerequisites listed in ./references/env-check-list.md are satisfied to avoid runtime configuration errors.
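The two checks above can be scripted. The following is a minimal sketch, not part of the skill itself: the function name check_environment is hypothetical, and it only confirms that the checklist file exists, not that each prerequisite in it is satisfied.

```python
import sys
from pathlib import Path

MIN_PYTHON = (3, 10)  # minimum version stated in the installation notes


def check_environment(skill_root: Path) -> list:
    """Return human-readable problems; an empty list means the basics look fine."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, found "
            f"{sys.version_info.major}.{sys.version_info.minor}"
        )
    # Path taken from the installation notes, relative to the skill root.
    checklist = skill_root / "references" / "env-check-list.md"
    if not checklist.is_file():
        problems.append(f"missing checklist: {checklist}")
    return problems


if __name__ == "__main__":
    # Assumption: the script is run from the skill root.
    for problem in check_environment(Path.cwd()):
        print("WARNING:", problem)
```

If the script prints nothing, the Python version is sufficient and the checklist file is present; the items inside the checklist still need to be reviewed by hand.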

Use Cases

  • Content Creation: Convert written scripts, blog posts, or long-form documents into natural-sounding audio files for podcasting or accessibility.
  • Custom Voice Branding: Create distinct, branded voice identities for automated customer support or interactive agents.
  • Meeting Transcription: Use the STT capabilities to transcribe audio recordings into clean, formatted text logs for documentation.
  • Interactive AI: Add a voice interface to your custom OpenClaw agents, allowing them to communicate verbally with users.

Example Prompts

  1. "I want to create a new voice for my assistant. I'm looking for a warm, professional female voice. Can you help me set that up?"
  2. "List all the available voice profiles currently stored in the qwen-audio skill."
  3. "Convert this text document into an audio file using my 'broadcast-pro' voice profile."

Tips & Limitations

  • Pre-check requirement: Always run the voice list command before attempting TTS generation. If no voices exist, you must create one first; the agent will guide you through this if you ask.
  • Performance: For best results, use reference audio files that are clean and free of background noise, since they are the basis for the quality of the synthesized voice.
  • Resource Management: Since voices are stored in local directories, monitor your storage if you create a high volume of unique profiles.
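Both the pre-check and the storage tip can be approximated from outside the skill by inspecting the voice directory directly. This is a rough sketch under assumptions: the ./voices path and the one-directory-per-profile layout are hypothetical, since the listing does not state where profiles are stored, and the skill's own voice list command remains the authoritative check.

```python
from pathlib import Path


def profile_sizes(voices_dir: Path) -> dict:
    """Map each profile directory name to its total size in bytes."""
    sizes = {}
    if not voices_dir.is_dir():
        return sizes
    for profile in sorted(voices_dir.iterdir()):
        if profile.is_dir():
            # Sum every regular file under the profile, including subfolders.
            sizes[profile.name] = sum(
                f.stat().st_size for f in profile.rglob("*") if f.is_file()
            )
    return sizes


if __name__ == "__main__":
    # Hypothetical location; check the skill's documentation for the real path.
    sizes = profile_sizes(Path("./voices"))
    if not sizes:
        print("No voice profiles found; create one before running TTS.")
    for name, size in sizes.items():
        print(f"{name}: {size / 1_000_000:.1f} MB")
```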

Metadata

Author: @darknoah
Stars: 3376
Views: 0
Updated: 2026-03-24

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-darknoah-qwen-audio": {
      "enabled": true,
      "auto_update": true
    }
  }
}
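If clawhub.json already contains other plugins, pasting the snippet by hand risks overwriting them. The sketch below merges the entry instead; the helper name enable_plugin is hypothetical, and the assumption that clawhub.json is plain JSON with a top-level "plugins" object comes only from the snippet above.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-darknoah-qwen-audio"  # key taken from the snippet above


def enable_plugin(config_path: Path) -> dict:
    """Add the plugin entry to clawhub.json, preserving existing settings."""
    if config_path.exists():
        # Note: an empty or malformed file raises json.JSONDecodeError here.
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Assumption: clawhub.json lives in the current working directory.
    enable_plugin(Path("clawhub.json"))
```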

Tags (AI)

#audio #tts #stt #voice-synthesis #ai-audio
Safety Score: 4/5

Flags: file-read, file-write, code-execution