kokoro-tts
Generate spoken audio from text using the local Kokoro TTS engine. Use when the user asks to "say" something, requests a voice message, or wants text converted to speech.
Why use this skill?
Enhance your OpenClaw agent with the Kokoro TTS skill to generate natural-sounding AI speech from text. Easily customize voices and speed for interactive audio responses.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/edkief/kokoro-tts
What This Skill Does
The kokoro-tts skill lets OpenClaw convert text into natural-sounding synthetic speech using the Kokoro TTS engine, running either as a local instance or behind a remote API. Your AI agent can then reply with voice rather than text alone. The skill sends the input text to the engine and saves the result as an audio file, which OpenClaw automatically detects and delivers to the user through the chat interface.
Installation
To add this skill to your OpenClaw environment, run the following command in your terminal: clawhub install openclaw/skills/skills/edkief/kokoro-tts. Once it is installed, make sure a Kokoro TTS server is running. Configure the connection by setting the KOKORO_API_URL variable in your .env file (e.g., KOKORO_API_URL=http://localhost:8880/v1/audio/speech) so that the agent knows where to send speech requests.
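As an illustration of what the KOKORO_API_URL setting implies, here is a minimal Python sketch of a client that reads the variable and posts text to the configured endpoint. It is not the skill's actual implementation. The payload fields (model, input, voice, speed, response_format) are an assumption based on the OpenAI-style shape of the /v1/audio/speech path, and the function names are hypothetical. The sketch reads the variable from the process environment, so it assumes the .env file has already been loaded.

```python
import json
import os
import urllib.request

# Fallback mirrors the example URL from the installation notes.
DEFAULT_URL = "http://localhost:8880/v1/audio/speech"


def build_speech_request(text, voice="af_heart", speed=1.0):
    """Build (but do not send) a POST request for the configured server."""
    url = os.environ.get("KOKORO_API_URL", DEFAULT_URL)
    # ASSUMPTION: an OpenAI-style speech payload. These field names are
    # not confirmed by the skill documentation.
    payload = {
        "model": "kokoro",
        "input": text,
        "voice": voice,
        "speed": speed,
        "response_format": "mp3",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="speech.mp3", **options):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **options)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return out_path
```

With a server running, a call such as synthesize("Hello, team", voice="am_adam") would write the response body to speech.mp3. Keeping request construction separate from sending makes it possible to inspect the URL and payload without touching the network.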
Use Cases
This skill is ideal for scenarios where auditory feedback is preferred over text, such as reading long-form articles, providing hands-free notifications, or enhancing roleplay interactions. It is perfect for users who want to "hear" their AI assistant, turning a standard text chatbot into a responsive voice-based companion. It supports multiple voice profiles, allowing you to customize the agent's persona to suit different contexts, such as professional reporting or casual conversation.
Example Prompts
- "Say hello to the team and let them know the project report is ready."
- "Please read back the meeting summary using the professional voice settings."
- "Send me a voice message explaining the current weather forecast for today."
Tips & Limitations
To find the voice that best fits your use case, experiment with the available voice profiles, such as af_heart or am_adam. The speed parameter accepts values between 0.25 and 4.0; use slower speeds for accessibility and faster speeds for getting through dense material quickly. If you are using a remote API, make sure your network connection is stable, since audio generation depends on consistent communication with the server. Monitor your local storage as well: TTS requests generate new MP3 files, which may require periodic cleanup.
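The two practical limits above, the speed range and the accumulation of MP3 files, can be handled with small helpers. The following Python sketch is illustrative only: the function names are hypothetical, and the directory passed to the cleanup helper is an assumption, since the skill's output location is not documented here.

```python
import time
from pathlib import Path

# Range stated in the tips above.
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def clamp_speed(speed):
    """Pull a requested speed back into the supported 0.25-4.0 range."""
    return max(MIN_SPEED, min(MAX_SPEED, float(speed)))


def prune_old_mp3s(directory, max_age_seconds, now=None):
    """Delete *.mp3 files older than max_age_seconds; return removed names."""
    now = time.time() if now is None else now
    removed = []
    for path in sorted(Path(directory).glob("*.mp3")):
        # Age is measured from the file's last modification time.
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return removed
```

For example, prune_old_mp3s("audio_out", 24 * 3600) would remove MP3 files older than one day from a hypothetical audio_out directory while leaving other file types untouched.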
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-edkief-kokoro-tts": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags: AI
Flags: network-access, file-write, external-api