What This Skill Does

The TTS (Text-to-Speech) skill for OpenClaw lets your AI agent turn any text string into spoken audio. It calls a speech synthesis provider to generate an MP3 file: Hume AI is the primary provider, chosen for premium, expressive vocal synthesis, and OpenAI is available as a fallback. Once the file is generated, the agent returns its absolute path, which can then be sent directly to the user's interface for a more natural, conversational experience.

Installation

To integrate this capability into your agent, use the OpenClaw command-line interface. Run the following command in your terminal:

clawhub install openclaw/skills/skills/amstko/tts

Configure your environment variables before running the skill. You need a valid HUME_API_KEY and HUME_SECRET_KEY for the primary service (Hume AI), or an OPENAI_API_KEY for the OpenAI fallback (legacy support). Store these keys securely in your environment so that the skill's scripts can reach the required API endpoints.
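
As a quick sanity check before invoking the skill, the credential rule above can be sketched in Python. This is only an illustration: pick_provider is a hypothetical helper, not part of the skill, and it assumes that Hume AI is preferred whenever both of its keys are present.

```python
import os
from typing import Mapping, Optional


def pick_provider(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return "hume", "openai", or None depending on which keys are set.

    Hypothetical helper for illustration; the skill's own scripts may
    resolve credentials differently.
    """
    # Hume AI needs both its API key and its secret key.
    if env.get("HUME_API_KEY") and env.get("HUME_SECRET_KEY"):
        return "hume"
    # Otherwise fall back to OpenAI if its key is available.
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return None


if __name__ == "__main__":
    provider = pick_provider()
    if provider is None:
        print("No TTS credentials found: set HUME_API_KEY and "
              "HUME_SECRET_KEY, or OPENAI_API_KEY.")
    else:
        print(f"TTS credentials found for: {provider}")
```

Running the script in the same shell that launches the agent shows whether the keys are visible to child processes.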

Use Cases

This skill is ideal for scenarios where auditory feedback enhances the user experience. Use it when:

Users request an audio explanation of a complex topic while driving or commuting.
You want to provide a "viva voce" or "out loud" reading of a generated document or code snippet.
The agent needs to deliver empathetic or emotionally resonant messages that text alone cannot convey.
You are building accessibility-focused features for users with visual impairments.

Example Prompts

"Could you read that summary out loud to me?"
"I need a voice reply for this notification, please use the TTS skill."
"Please generate an audio file of this response so I can listen to it viva voce."

Tips & Limitations

Audio generation depends on the availability and latency of external APIs. For the best results, we recommend the Hume AI voice 9e1f9e4f-691a-4bb0-b87c-e306a4c838ef, which is optimized for natural intonation. Always invoke the message tool after the script finishes so that the resulting media is surfaced to the user. Be mindful of API rate limits and costs when generating audio in high volume. Finally, make sure the system has write permission for the output directory, because the skill writes audio files locally before sending them.
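
The write-permission point can be checked ahead of time with a short Python sketch. The helper name prepare_output_path and the default filename speech.mp3 are assumptions made for this example; the skill chooses its own output location and filenames.

```python
import os
import tempfile


def prepare_output_path(directory: str, filename: str = "speech.mp3") -> str:
    """Create the directory if needed, confirm it is writable, and
    return the absolute path where an audio file could be written.

    Hypothetical helper for illustration only.
    """
    os.makedirs(directory, exist_ok=True)
    # os.access reports whether the current user may write here.
    if not os.access(directory, os.W_OK):
        raise PermissionError(f"Cannot write to output directory: {directory}")
    return os.path.abspath(os.path.join(directory, filename))


if __name__ == "__main__":
    # Example run against a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        print(prepare_output_path(os.path.join(tmp, "tts-out")))
```

Returning an absolute path mirrors what the skill hands back to the agent, so the same value could be passed on to the message tool.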
