qwen3-tts
High-quality text-to-speech using Qwen3-TTS. Features 10 built-in speakers with emotional instruct control, voice cloning (from 3s of audio), natural-language voice design, 10+ languages, persistent named voices, and audio delivery via Telegram/WhatsApp as native voice messages. Auto-detects the available hardware (CUDA, ROCm, Intel XPU, or CPU-only).
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/damustermann/claw-qwen3-tts
Qwen3-TTS Skill
You have access to a powerful text-to-speech system that can generate human-quality speech with 10 built-in speakers, design new voices from descriptions, clone existing voices from audio samples, and send audio via Telegram/WhatsApp as native voice messages.
First-Time Setup
If the skill is not yet installed (no ~/clawd/skills/qwen3-tts directory), run:
bash <(curl -fsSL https://raw.githubusercontent.com/daMustermann/claw-qwen3-tts/main/install.sh)
Or if already cloned but not set up (no .venv/ directory):
bash ~/clawd/skills/qwen3-tts/install.sh
This auto-detects the GPU (CUDA, ROCm, Intel XPU, or CPU-only), creates a Python venv, and installs all dependencies. It takes 5–15 minutes on first run.
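The two checks above (is the skill directory present, and does it contain `.venv/`) can be sketched as a small helper. This is only an illustration: `setup_state` and the `SKILL_DIR` variable are hypothetical names, not part of the skill; the paths and install commands are the ones given above.

```shell
# SKILL_DIR defaults to the install path named in this document.
SKILL_DIR="${SKILL_DIR:-$HOME/clawd/skills/qwen3-tts}"

# setup_state (hypothetical helper) prints one of:
#   missing      the skill directory does not exist
#   needs-setup  the directory exists but has no .venv/
#   ready        the directory and its .venv/ both exist
setup_state() {
  dir="${1:-$SKILL_DIR}"
  if [ ! -d "$dir" ]; then
    echo "missing"
  elif [ ! -d "$dir/.venv" ]; then
    echo "needs-setup"
  else
    echo "ready"
  fi
}

# Usage, mapping each state to the command from the text above:
#   case "$(setup_state)" in
#     missing)     bash <(curl -fsSL https://raw.githubusercontent.com/daMustermann/claw-qwen3-tts/main/install.sh) ;;
#     needs-setup) bash "$SKILL_DIR/install.sh" ;;
#     ready)       : ;;  # nothing to do
#   esac
```

A present `.venv/` only shows that setup was started; it does not prove that every dependency installed successfully.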
Starting & Stopping the Server
Before any TTS operation, ensure the server is running:
# Start (idempotent — won't restart if already running)
bash ~/clawd/skills/qwen3-tts/scripts/start_server.sh
# Check health
bash ~/clawd/skills/qwen3-tts/scripts/health_check.sh
# Stop (when done)
bash ~/clawd/skills/qwen3-tts/scripts/stop_server.sh
The server runs at http://localhost:8880.
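Because model loading may take a while, it can help to poll the health check after starting the server instead of calling it once. The sketch below is a generic retry loop. `wait_until` is a hypothetical helper, and the usage assumes that `health_check.sh` exits with a non-zero status while the server is not ready, which this document does not state.

```shell
# wait_until ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY seconds after each failure.
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (assumes a non-zero exit code from health_check.sh means "not ready"):
#   bash ~/clawd/skills/qwen3-tts/scripts/start_server.sh
#   wait_until 30 2 bash ~/clawd/skills/qwen3-tts/scripts/health_check.sh \
#     || echo "server did not become healthy in time" >&2
```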
Available Models
| Model ID | Use Case | Notes |
|---|---|---|
| `custom-voice-1.7b` | High-quality TTS with built-in speakers (default) | Best quality, ~5 GB VRAM |
| `custom-voice-0.6b` | Fast TTS with built-in speakers | Lightweight, ~2 GB VRAM |
| `voice-design` | Design new voices from natural-language descriptions | Uses the VoiceDesign model |
| `base-1.7b` | Basic TTS (auto-corrected to `custom-voice-1.7b`) | Use `custom-voice-*` instead |
| `base-0.6b` | Basic TTS (auto-corrected to `custom-voice-0.6b`) | Use `custom-voice-*` instead |
Important: On the `/v1/audio/speech` endpoint, `base-*` and `voice-design` models are automatically corrected to the corresponding `custom-voice-*` model. Always prefer `custom-voice-1.7b` or `custom-voice-0.6b` for speech generation.
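To avoid relying on the server-side correction, a client can resolve the model ID before sending a speech request. The sketch below mirrors the mapping in the table. `resolve_speech_model` is a hypothetical helper. The text does not say which size `voice-design` is corrected to, so the fallback to `custom-voice-1.7b` (the documented default) is an assumption.

```shell
# resolve_speech_model MODEL_ID
# Prints the custom-voice-* model to use on the speech endpoint,
# or fails with status 1 for an ID that is not in the table.
resolve_speech_model() {
  case "$1" in
    custom-voice-1.7b|custom-voice-0.6b) echo "$1" ;;
    base-1.7b) echo "custom-voice-1.7b" ;;
    base-0.6b) echo "custom-voice-0.6b" ;;
    # Assumption: the target size for voice-design is not documented here,
    # so this sketch falls back to the default model.
    voice-design) echo "custom-voice-1.7b" ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# Usage:
#   model="$(resolve_speech_model base-0.6b)"   # sets model to custom-voice-0.6b
```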
Built-in Speakers
The custom-voice-* models include 10 built-in voices:
Chelsie · Ethan · Aidan · Serena · Ryan · Vivian · Claire · Lucas · Eleanor · Benjamin
You can discover speakers dynamically: curl http://localhost:8880/v1/speakers
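A request can be checked against the speaker list locally before it is sent. The following is a minimal sketch using the ten names listed above. `is_builtin_speaker` and `BUILTIN_SPEAKERS` are hypothetical names, and the comparison is case-sensitive because this document does not say how the server treats case.

```shell
# The ten built-in speakers named in this document, space-separated.
BUILTIN_SPEAKERS="Chelsie Ethan Aidan Serena Ryan Vivian Claire Lucas Eleanor Benjamin"

# is_builtin_speaker NAME
# Returns 0 if NAME exactly matches a built-in speaker, 1 otherwise.
is_builtin_speaker() {
  for s in $BUILTIN_SPEAKERS; do
    [ "$s" = "$1" ] && return 0
  done
  return 1
}

# Usage:
#   is_builtin_speaker "Ryan" || echo "not a built-in speaker" >&2
```

A hard-coded list can drift from the server, so the `/v1/speakers` endpoint remains the authoritative source.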
Capabilities
1. Generate Speech from Text
When to use: User asks to speak text, read something aloud, generate audio, do a voiceover, narrate, or say something.
curl -X POST http://localhost:8880/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "custom-voice-1.7b",
    "input": "TEXT_HERE",
    "voice": "default",
    "speaker": "Chelsie",
    "language": "en",
    "instruct": "",
    "response_format": "wav"
  }' \
  --output ~/clawd/skills/qwen3-tts/output/speech.wav
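Text substituted for `TEXT_HERE` breaks the JSON body if it contains double quotes or backslashes. The sketch below builds the same request body with those two characters escaped. `json_escape` and `speech_payload` are hypothetical helpers, and the field names and defaults are copied from the request above. It handles single-line text only; multi-line input or control characters need a real JSON encoder.

```shell
# json_escape STRING
# Escapes backslashes and double quotes so STRING can sit inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# speech_payload TEXT [SPEAKER] [INSTRUCT]
# Prints a JSON body with the same fields as the curl example above.
speech_payload() {
  text="$(json_escape "$1")"
  speaker="$(json_escape "${2:-Chelsie}")"
  instruct="$(json_escape "${3:-}")"
  printf '{"model":"custom-voice-1.7b","input":"%s","voice":"default","speaker":"%s","language":"en","instruct":"%s","response_format":"wav"}' \
    "$text" "$speaker" "$instruct"
}

# Usage:
#   curl -X POST http://localhost:8880/v1/audio/speech \
#     -H "Content-Type: application/json" \
#     -d "$(speech_payload 'She said "hello".' Ryan)" \
#     --output ~/clawd/skills/qwen3-tts/output/speech.wav
```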
Parameters:
Paste this into your clawhub.json to enable this plugin.
{
"plugins": {
"official-damustermann-claw-qwen3-tts": {
"enabled": true,
"auto_update": true
}
}
}