Official | Verified | media | Safety 3/5

qwen-audio-lab

Hybrid text-to-speech, reusable voice cloning, and narrated audio generation for macOS plus Aliyun Qwen. Use when the user wants to convert text into speech, clone and reuse a voice from a reference recording, generate narration files from plain text or text files, or create PPT speaker-note voiceovers.

What This Skill Does

The qwen-audio-lab skill provides text-to-speech and voice tooling for macOS users and developers who use Aliyun Qwen services. It takes a hybrid approach to speech synthesis: the mac-say engine generates speech locally and quickly on macOS, while the qwen-tts engine produces higher-fidelity, customizable AI voices. With it, users can convert text into natural-sounding narration, clone a voice for consistent branding or personal use, and design new voices from descriptive text prompts. It supports a range of media workflows, including turning long documents into audio files, creating narrated slideshows, and supplying text-to-speech for interactive applications.

Installation

To add this skill to your OpenClaw environment, execute the following command in your terminal:

clawhub install openclaw/skills/skills/aliyx/qwen-audio-lab

Set DASHSCOPE_API_KEY in your environment variables to enable the Qwen synthesis features. If the API key is absent, the skill falls back to local macOS speech synthesis.
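The fallback behavior described above can be pictured with a minimal sketch. The function name choose_engine is hypothetical and is not part of the skill; the sketch only assumes that the presence of a non-empty DASHSCOPE_API_KEY decides between the qwen-tts and mac-say engines, which may differ from how the skill actually makes the choice.

```python
import os


def choose_engine(env=None):
    """Pick an engine name from the environment.

    Hypothetical helper: assumes a non-empty DASHSCOPE_API_KEY selects
    qwen-tts and anything else falls back to mac-say.
    """
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    return "qwen-tts" if key else "mac-say"


if __name__ == "__main__":
    # Reports which engine this sketch would select in the current shell.
    print(choose_engine())
```

Passing the environment as a parameter keeps the check easy to exercise without touching the real shell variables.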

Use Cases

  1. Automated Narration: Quickly convert lengthy project documentation or research scripts into high-quality audio narration for presentations.
  2. Voice Branding: Clone a specific voice from a clean reference audio sample to maintain a consistent tone across various media projects.
  3. Dynamic Presentation Audio: Automatically generate synchronized speaker-note voiceovers for PowerPoint presentations.
  4. Accessible Communication: Transform text-heavy communications into spoken audio for accessibility or multitasking efficiency.

Example Prompts

  • "Can you narrate the contents of my file at /documents/script.txt using the Cherry voice?"
  • "I have an audio file of a speech here: /audio/sample.mp3. Please clone this voice and name it 'project-lead-voice'."
  • "Create a voice that sounds like a calm, professional male radio host for my upcoming documentary, save it as 'radio-doc-01', and generate a preview."

Tips & Limitations

  • Quality Control: For voice cloning, always provide a clean, high-quality audio sample without background noise to ensure the highest output fidelity.
  • Ethical Use: When cloning third-party voices, ensure you have explicit consent to avoid violating personal or intellectual property rights.
  • Storage: By default, generated files and local state are saved in ~/.openclaw/data/qwen-audio-lab/. You can change these paths using the QWEN_AUDIO_OUTPUT_DIR and QWEN_AUDIO_STATE_DIR variables.
  • Performance: Use mac-say for quick local feedback where speed is prioritized over naturalness, and reserve qwen-tts for high-quality professional output.
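As a rough illustration of the storage tip above, the sketch below resolves the two directories. It assumes QWEN_AUDIO_OUTPUT_DIR and QWEN_AUDIO_STATE_DIR are read as environment variables and that both default to the same ~/.openclaw/data/qwen-audio-lab/ directory; the helper name resolve_dirs is hypothetical, and the skill may lay out its default directories differently.

```python
import os
from pathlib import Path

# Default location taken from the tip above; the shared default for both
# directories is an assumption of this sketch.
DEFAULT_DIR = "~/.openclaw/data/qwen-audio-lab"


def resolve_dirs(env=None):
    """Return (output_dir, state_dir), honoring overrides when they are set."""
    env = os.environ if env is None else env

    def pick(name):
        # An unset or empty variable falls back to the default path.
        raw = env.get(name) or DEFAULT_DIR
        return Path(os.path.expanduser(raw))

    return pick("QWEN_AUDIO_OUTPUT_DIR"), pick("QWEN_AUDIO_STATE_DIR")


if __name__ == "__main__":
    output_dir, state_dir = resolve_dirs()
    print("output:", output_dir)
    print("state: ", state_dir)
```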

Metadata

Author: @aliyx
Stars: 4473
Views: 2
Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-aliyx-qwen-audio-lab": {
      "enabled": true,
      "auto_update": true
    }
  }
}
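If clawhub.json already lists other plugins, pasting the fragment verbatim would overwrite them. A minimal sketch of merging the entry instead is shown here; the helper enable_plugin and the placeholder entry some-other-plugin are hypothetical, and the only detail taken from the fragment above is the plugin key and its two settings.

```python
import json

# Plugin key copied from the configuration fragment above.
PLUGIN_ID = "official-aliyx-qwen-audio-lab"


def enable_plugin(config):
    """Return a copy of config with the plugin entry added or replaced."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


# "some-other-plugin" is a made-up placeholder for an existing entry.
existing = {"plugins": {"some-other-plugin": {"enabled": True}}}
print(json.dumps(enable_plugin(existing), indent=2))
```

The function copies the top-level and plugins dictionaries before modifying them, so the configuration passed in is left unchanged.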

Tags (AI)

#tts #voice-cloning #macos #qwen #narration

Safety Score: 3/5

Flags: file-write, file-read, external-api, code-execution