What This Skill Does

The qwen-audio-lab skill provides text-to-speech for macOS users and developers who use Aliyun Qwen services. It offers two engines: mac-say for fast local macOS voice generation, and qwen-tts for higher-fidelity, customizable AI voices. With it, users can convert text into human-like narration, clone a specific voice profile for consistent branding or personal use, and design new voices from descriptive text prompts. Typical workflows include turning long documents into audio files, creating narrated slideshows, and supplying natural-sounding text-to-speech for interactive applications.

Installation

To add this skill to your OpenClaw environment, execute the following command in your terminal:

clawhub install openclaw/skills/skills/aliyx/qwen-audio-lab

Set the DASHSCOPE_API_KEY environment variable to enable the Qwen synthesis features. If the API key is absent, the skill falls back to local macOS speech synthesis.
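
The fallback described above can be pictured with a small Python sketch. It is an illustration only, not the skill's actual implementation: the function name choose_engine is hypothetical, and treating an empty or whitespace-only key as absent is an assumption. Only the engine names and the DASHSCOPE_API_KEY variable come from this page.

```python
import os
from typing import Mapping, Optional


def choose_engine(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick a speech engine name based on whether an API key is configured.

    Hypothetical helper: the real skill may decide differently.
    """
    if env is None:
        env = os.environ
    # Assumption: an empty or whitespace-only key counts as "absent".
    api_key = env.get("DASHSCOPE_API_KEY", "").strip()
    return "qwen-tts" if api_key else "mac-say"


if __name__ == "__main__":
    # Prints the engine this sketch would select in the current shell.
    print(choose_engine())
```

Running the script in a shell before and after exporting DASHSCOPE_API_KEY is a quick way to check which engine this sketch would select.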

Use Cases

Automated Narration: Quickly convert lengthy project documentation or research scripts into high-quality audio narration for presentations.
Voice Branding: Clone a specific voice from a clean reference audio sample to maintain a consistent tone across various media projects.
Dynamic Presentation Audio: Automatically generate synchronized speaker-note voiceovers for PowerPoint presentations.
Accessible Communication: Transform text-heavy communications into spoken audio for accessibility or multitasking efficiency.

Example Prompts

"Can you narrate the contents of my file at /documents/script.txt using the Cherry voice?"
"I have an audio file of a speech here: /audio/sample.mp3. Please clone this voice and name it 'project-lead-voice'."
"Create a voice that sounds like a calm, professional male radio host for my upcoming documentary, save it as 'radio-doc-01', and generate a preview."

Tips & Limitations

Quality Control: For voice cloning, always provide a clean, high-quality audio sample without background noise to ensure the highest output fidelity.
Ethical Use: When cloning third-party voices, ensure you have explicit consent to avoid violating personal or intellectual property rights.
Storage: By default, generated files and local state are saved in ~/.openclaw/data/qwen-audio-lab/. You can change these paths with the QWEN_AUDIO_OUTPUT_DIR and QWEN_AUDIO_STATE_DIR environment variables.
Performance: Use mac-say for quick local feedback where speed is prioritized over naturalness, and reserve qwen-tts for high-quality professional output.
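
The storage override from the tips above could be resolved along the following lines. This is a minimal sketch under stated assumptions: the helper name resolve_dirs is hypothetical, and it assumes that both directories default to the same ~/.openclaw/data/qwen-audio-lab/ root and that an empty variable is treated as unset. The skill itself may lay out its files differently.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple

# Default location taken from this page; a shared root for both
# directories is an assumption of this sketch.
DEFAULT_DATA_DIR = Path("~/.openclaw/data/qwen-audio-lab")


def resolve_dirs(env: Optional[Mapping[str, str]] = None) -> Tuple[Path, Path]:
    """Return (output_dir, state_dir), honoring the override variables."""
    if env is None:
        env = os.environ
    default = DEFAULT_DATA_DIR.expanduser()
    # An unset or empty variable falls back to the default directory.
    output_dir = Path(env.get("QWEN_AUDIO_OUTPUT_DIR") or default).expanduser()
    state_dir = Path(env.get("QWEN_AUDIO_STATE_DIR") or default).expanduser()
    return output_dir, state_dir


if __name__ == "__main__":
    out_dir, state_dir = resolve_dirs()
    print(f"output: {out_dir}")
    print(f"state:  {state_dir}")
```

Passing a plain dictionary instead of the real environment keeps the helper easy to check without touching shell settings.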
