qwen-audio-lab
Hybrid text-to-speech, reusable voice cloning, and narrated audio generation for macOS plus Aliyun Qwen. Use when the user wants to convert text into speech, clone and reuse a voice from a reference recording, generate narration files from plain text or text files, or create PPT speaker-note voiceovers.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/aliyx/qwen-audio-lab
What This Skill Does
The qwen-audio-lab skill provides speech synthesis for macOS users and developers who use Aliyun Qwen services. It takes a hybrid approach: fast local macOS voice generation via the mac-say engine, and high-fidelity, customizable AI voices through the qwen-tts engine. With it, users can convert text into human-like narration, clone a voice profile for consistent branding or personal use, and design new voices from descriptive text prompts. It covers a range of media workflows, including turning long documents into audio files, creating narrated slideshows, and providing natural-sounding text-to-speech for interactive applications.
Installation
To add this skill to your OpenClaw environment, execute the following command in your terminal:
clawhub install openclaw/skills/skills/aliyx/qwen-audio-lab
Set the DASHSCOPE_API_KEY environment variable to enable the advanced Qwen synthesis features. If the API key is absent, the skill falls back to local macOS speech synthesis.
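The fallback behavior described above can be pictured with a small sketch. This is not the skill's actual code: the function name pick_engine is hypothetical, and treating a blank key as absent is an assumption. Only the DASHSCOPE_API_KEY variable and the two engine names come from this page.

```python
import os


def pick_engine(env=None):
    """Return the engine name to use (hypothetical helper, not the skill's API).

    Assumption: an unset or blank DASHSCOPE_API_KEY means "no key".
    """
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    # With a key, prefer the cloud engine; otherwise fall back to local macOS speech.
    return "qwen-tts" if key else "mac-say"


if __name__ == "__main__":
    print(pick_engine())
```

Passing a plain dict instead of the real environment makes the selection easy to check without touching your shell configuration.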
Use Cases
- Automated Narration: Quickly convert lengthy project documentation or research scripts into high-quality audio narration for presentations.
- Voice Branding: Clone a specific voice from a clean reference audio sample to maintain a consistent tone across various media projects.
- Dynamic Presentation Audio: Automatically generate synchronized speaker-note voiceovers for PowerPoint presentations.
- Accessible Communication: Transform text-heavy communications into spoken audio for accessibility or multitasking efficiency.
Example Prompts
- "Can you narrate the contents of my file at /documents/script.txt using the Cherry voice?"
- "I have an audio file of a speech here: /audio/sample.mp3. Please clone this voice and name it 'project-lead-voice'."
- "Create a voice that sounds like a calm, professional male radio host for my upcoming documentary, save it as 'radio-doc-01', and generate a preview."
Tips & Limitations
- Quality Control: For voice cloning, always provide a clean, high-quality audio sample without background noise to ensure the highest output fidelity.
- Ethical Use: When cloning third-party voices, ensure you have explicit consent to avoid violating personal or intellectual property rights.
- Storage: By default, generated files and local state are saved in ~/.openclaw/data/qwen-audio-lab/. You can change these paths using the QWEN_AUDIO_OUTPUT_DIR and QWEN_AUDIO_STATE_DIR variables.
- Performance: Use mac-say for quick local feedback where speed is prioritized over naturalness, and reserve qwen-tts for high-quality professional output.
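As a rough illustration of the storage tip, the sketch below resolves the output and state directories from the two environment variables and falls back to the default location when they are unset. The helper name resolve_dirs is hypothetical, and the assumption that both variables share the same default directory is taken from the wording above rather than from the skill's source.

```python
import os
from pathlib import Path

# Default location named on this page; "~" is expanded to the user's home directory.
DEFAULT_DIR = Path("~/.openclaw/data/qwen-audio-lab").expanduser()


def resolve_dirs(env=None):
    """Return (output_dir, state_dir) as Path objects (hypothetical helper)."""
    env = os.environ if env is None else env

    def pick(name):
        # Assumption: a blank value is treated the same as an unset variable.
        value = env.get(name, "").strip()
        return Path(value).expanduser() if value else DEFAULT_DIR

    return pick("QWEN_AUDIO_OUTPUT_DIR"), pick("QWEN_AUDIO_STATE_DIR")


if __name__ == "__main__":
    output_dir, state_dir = resolve_dirs()
    print("output:", output_dir)
    print("state:", state_dir)
```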
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-aliyx-qwen-audio-lab": {
      "enabled": true,
      "auto_update": true
    }
  }
}
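If your clawhub.json already lists other plugins, pasting the snippet over the file would discard them. The sketch below merges the entry into existing JSON text instead. The helper name enable_plugin is hypothetical, and the only file layout assumed is the top-level "plugins" object shown above.

```python
import json

PLUGIN_ID = "official-aliyx-qwen-audio-lab"


def enable_plugin(config_text):
    """Return updated JSON text with the plugin entry added (hypothetical helper)."""
    # Treat an empty file as an empty configuration.
    config = json.loads(config_text) if config_text.strip() else {}
    plugins = config.setdefault("plugins", {})
    # Same values as the snippet on this page; other plugin entries are left untouched.
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    existing = '{"plugins": {"some-other-plugin": {"enabled": true}}}'
    print(enable_plugin(existing))
```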
Flags: file-write, file-read, external-api, code-execution