mlx-tts
Text-to-speech with MLX (Apple Silicon) and open-source models (default Qwen3-TTS), run locally.
Why use this skill?
Generate high-quality, free text-to-speech locally on your Mac. No API keys or cloud servers required. Fast, private, and optimized for Apple Silicon.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/guoqiao/mlx-tts
What This Skill Does
The mlx-tts skill generates speech from text locally on Apple Silicon hardware. It is built on the Apple MLX framework and defaults to the Qwen3-TTS model, so it needs no external cloud API, paid subscription, or internet connection to synthesize audio. Because synthesis happens on your own machine, your text never leaves it and there is no network round trip adding latency. This makes the skill useful if you want voice feedback in a local AI workflow, or audio output for your agent, without relying on paid services such as OpenAI or ElevenLabs.
Installation
To integrate this skill into your environment, use the OpenClaw command-line interface. Run the following command in your terminal:
clawhub install openclaw/skills/skills/guoqiao/mlx-tts
Make sure Homebrew is installed on your Mac: the installation script uses it to install the required dependencies, namely uv for Python environment management and mlx_audio for the audio synthesis itself. Once installed, the skill accepts text input and writes audio files directly to your local file system.
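The prerequisites above can be checked before running the install command. The following is a minimal sketch, not part of the skill itself: the function name `preflight` and its parameters are hypothetical, and it only checks for macOS on arm64 and for a `brew` executable on the PATH.

```python
import platform
import shutil


def preflight(system=None, machine=None, which=shutil.which, tools=("brew",)):
    """Return a list of problems; an empty list means the basics are in place.

    Hypothetical helper: the checks mirror the requirements stated in the
    text (Apple Silicon Mac, Homebrew available), nothing more.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    problems = []
    if system != "Darwin":
        problems.append(f"macOS required, found {system}")
    elif machine != "arm64":
        problems.append(f"Apple Silicon (arm64) required, found {machine}")
    for tool in tools:
        # shutil.which returns None when the executable is not on the PATH
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    issues = preflight()
    print("ready" if not issues else "\n".join(issues))
```

Note that a Python interpreter running under Rosetta may report a machine type other than arm64 even on Apple Silicon, so treat a failed check as a hint rather than a verdict.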
Use Cases
This skill is perfect for various automation scenarios where voice output is desired. Use it to build accessibility tools, generate spoken summaries of long documents, create voiceover assets for media projects, or add natural-sounding conversational feedback to your own local AI agents. Because it runs locally, it is the ideal solution for processing sensitive information that you would not want to transmit over a network.
Example Prompts
- "Convert this article into a voice file so I can listen to it while I commute."
- "Read the following text back to me with a natural voice: [Paste text here]."
- "Summarize the current system status and say it out loud using the TTS engine."
Tips & Limitations
- Performance: The quality and speed of generation depend on your Apple Silicon chip (M1, M2, M3, etc.). Using this on Intel-based Macs is not supported.
- Disk Space: Ensure you have sufficient disk space as the initial setup may download large ML model weights to your local machine.
- Audio Files: The output format is typically .ogg. If your specific application requires .wav or .mp3, you may need to chain this with a secondary audio conversion tool.
- Privacy: Since this skill runs entirely locally, it is excellent for offline environments, but remember that the audio file itself will be saved to a temporary directory on your machine, so manage your file cleanup periodically.
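As one way to handle the format conversion mentioned above, the sketch below shells out to ffmpeg. This is an assumption: the skill does not name a conversion tool, ffmpeg must be installed separately, and the helper names `build_convert_cmd` and `convert` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_cmd(src, fmt="wav"):
    """Build an ffmpeg command that converts src to fmt, next to the source file."""
    if fmt not in ("wav", "mp3"):
        raise ValueError(f"unsupported format: {fmt}")
    src = Path(src)
    dst = src.with_suffix(f".{fmt}")
    # -y overwrites an existing output; ffmpeg picks the encoder from the extension
    return ["ffmpeg", "-y", "-i", str(src), str(dst)], dst


def convert(src, fmt="wav"):
    """Run the conversion and return the path of the new file."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    cmd, dst = build_convert_cmd(src, fmt)
    subprocess.run(cmd, check=True)
    return dst
```

Calling `convert("speech.ogg", "mp3")` would write `speech.mp3` beside the input. Since the original .ogg lives in a temporary directory, copy or convert it somewhere permanent if you need to keep it.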
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-guoqiao-mlx-tts": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags: AI
Flags: file-write, file-read, code-execution
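If your clawhub.json already lists other plugins, the fragment above has to be merged in rather than pasted over the file. The following is a rough sketch of that merge; the function name `enable_plugin` is hypothetical, and the location of clawhub.json on your system is not stated in the source, so the path is left to the caller.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-guoqiao-mlx-tts"  # plugin key taken from the fragment above


def enable_plugin(config_path, plugin_id=PLUGIN_ID):
    """Add or update the plugin entry while keeping every other setting intact."""
    path = Path(config_path)
    # Start from the existing config if there is one, otherwise from an empty object
    config = json.loads(path.read_text()) if path.exists() else {}
    plugins = config.setdefault("plugins", {})
    entry = plugins.setdefault(plugin_id, {})
    entry.update({"enabled": True, "auto_update": True})
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Using `setdefault` instead of assigning a fresh dictionary means existing plugin entries, and any extra keys already set on this plugin, are preserved.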
Related Skills
mlx-audio-server
Local 24x7 OpenAI-compatible API server for STT/TTS, powered by MLX on your Mac.
mlx-stt
Speech-to-text with MLX (Apple Silicon) and open-source models (default GLM-ASR-Nano-2512), run locally.
dl
Download Video/Music from YouTube/Bilibili/X/etc.
url2pdf
Convert URL to PDF suitable for mobile reading.
uv-global
Provision and reuse a global uv environment for ad hoc Python scripts.