piper-tts
Local text-to-speech using Piper for voice message delivery. Use when the user asks for voice responses, audio messages, TTS, text-to-speech, voice notes, or wants to hear something spoken aloud. Converts text to speech locally (no cloud APIs, no usage fees, no network latency) and delivers the result as a voice message on Telegram, Discord, or any channel supporting audio.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/bewareofddog/beware-piper-tts
What This Skill Does
The piper-tts skill integrates the Piper text-to-speech engine directly into your OpenClaw agent. Because the neural voice model runs locally, your agent can generate high-quality, human-like audio responses without relying on external cloud APIs. This means no network round-trip, no per-request cost, and complete privacy, as no audio data is transmitted to third-party servers. When invoked, the skill takes text input, converts it to speech, and outputs an MP3 file that the agent then delivers as a native voice message on platforms like Telegram or Discord. It is an ideal solution for users who prefer listening to responses rather than reading them.
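The text-to-audio step described above can be sketched as follows. This is a minimal illustration, not the skill's actual code: it assumes a `piper` executable on PATH that accepts `--model` and `--output_file` and reads text from standard input, and the function names and the model path are hypothetical. Piper's command-line tool writes WAV audio, so the MP3 output mentioned above would need a separate conversion step that is not shown here.

```python
import subprocess
from pathlib import Path


def build_piper_command(model_path, output_path, binary="piper"):
    """Return the argument list for one Piper synthesis run.

    `binary` defaults to "piper"; that name is an assumption.
    """
    return [binary, "--model", str(model_path), "--output_file", str(output_path)]


def synthesize(text, model_path, output_path, binary="piper"):
    """Send `text` to Piper on stdin and return the path of the audio file."""
    command = build_piper_command(model_path, output_path, binary)
    # check=True raises CalledProcessError if Piper exits with an error.
    subprocess.run(command, input=text.encode("utf-8"), check=True)
    return Path(output_path)


if __name__ == "__main__":
    # Hypothetical model location; adjust to wherever the voice was downloaded.
    synthesize("Hello from Piper.", "voices/en_US-kusal-medium.onnx", "reply.wav")
```

Keeping the command construction in its own function makes it possible to inspect or log the exact invocation without running Piper.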
Installation
To begin, ensure you have Python 3.9+ installed on your system. Navigate to your OpenClaw root directory and execute the setup script: scripts/setup-piper.sh. This command automates the installation of the necessary Python dependencies and downloads the default en_US-kusal-medium voice model. If you wish to expand your voice library, you can install additional models by providing the voice name as an argument to the setup script, such as scripts/setup-piper.sh --voice en_US-ryan-high.
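Before running the setup script, the Python 3.9+ requirement can be checked with a short snippet like the one below. This is a hedged sketch rather than part of the skill; the function name `meets_python_requirement` is hypothetical.

```python
import sys


def meets_python_requirement(version=None, minimum=(3, 9)):
    """Return True if `version` (default: the running interpreter) is at least `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only the (major, minor) pair against the minimum.
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_python_requirement():
        print("Python version is sufficient for setup.")
    else:
        print("Python 3.9 or newer is required.")
```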
Use Cases
This skill is perfect for scenarios where accessibility and convenience are paramount. It is highly effective for delivering long-form answers while the user is commuting, summarizing complex technical data into an audio brief, or providing a more personal, interactive feel during conversational tasks. It is best used on an ad-hoc basis when a user explicitly requests audio, rather than as a forced, global response setting.
Example Prompts
- "Can you explain how this code works, but send it as a voice note so I can listen while walking?"
- "Tell me a funny joke to brighten my mood, and please use the voice message format."
- "Summarize the latest news headlines for me. I'd prefer to hear it in a British accent."
Tips & Limitations
Piper is fast, typically generating audio within one second. Even so, do not set messages.tts.auto: "always" in your configuration, as this forces every response through speech synthesis and adds processing time to each one. Instead, keep TTS usage intentional. While Piper is lightweight, it requires local disk space for voice models. If you encounter errors, ensure your system PATH is correctly configured to locate the Piper binaries. Because synthesis runs locally, its speed depends on your machine's hardware, though the skill is optimized for both Apple Silicon and Linux environments.
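When troubleshooting the PATH issue mentioned above, a quick check along these lines may help. It is a sketch under the assumption that the Piper executable is named `piper`; the helper name `find_piper_binary` is hypothetical.

```python
import shutil


def find_piper_binary(name="piper"):
    """Return the full path of `name` if it is found on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_piper_binary()
    if location is None:
        print("Piper was not found on PATH; check your shell configuration.")
    else:
        print(f"Piper found at {location}")
```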
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-bewareofddog-beware-piper-tts": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-write, file-read, code-execution