elevenlabs-speech
Text-to-Speech and Speech-to-Text using ElevenLabs AI. Use when the user wants to convert text to speech, transcribe voice messages, or work with voice in multiple languages. Supports high-quality AI voices and accurate transcription.
Why use this skill?
Integrate ElevenLabs TTS and STT into OpenClaw. Convert text to natural voice, transcribe audio messages, and add professional speech capabilities to your AI agent.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/jeffpignataro/miranda-elevenlabs-speech
What This Skill Does
The elevenlabs-speech skill gives the OpenClaw AI agent a voice-processing layer built on the ElevenLabs API. It handles both Text-to-Speech (TTS) for natural voice generation and Speech-to-Text (STT) via Scribe for audio transcription. Whether you need to voice your agent's responses for Telegram or transcribe incoming voice notes from a team, the skill exposes controls over voice identity, stability, and language.
Installation
To integrate this capability into your environment, run the following command in your terminal:
clawhub install openclaw/skills/skills/jeffpignataro/miranda-elevenlabs-speech
Have your ElevenLabs API key ready. Export it as the environment variable ELEVENLABS_API_KEY or add it to your .env file so the client can authenticate with the service automatically.
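As a minimal sketch of the key lookup described above, a client could read the variable and fail early when it is missing. The helper name `load_api_key` is hypothetical and not part of the skill, and .env parsing is left out; only the variable name ELEVENLABS_API_KEY comes from this page.

```python
import os


def load_api_key(env=None):
    """Return the ElevenLabs API key, or raise if it is not configured.

    `env` defaults to the process environment; passing a dict makes the
    function easy to exercise without touching real settings.
    """
    env = os.environ if env is None else env
    key = env.get("ELEVENLABS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ELEVENLABS_API_KEY is not set; export it or add it to your .env file"
        )
    return key
```

Failing at startup with a clear message tends to be easier to debug than an authentication error returned later by the service.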
Use Cases
- Automated Communication: Convert agent text responses into natural-sounding voice files for seamless integration with messaging platforms like Telegram.
- Voice Note Transcription: Automatically parse voice messages sent by users or team members, turning audio input into actionable text for the AI agent.
- Multilingual Support: Use the eleven_multilingual_v2 model to bridge communication barriers by synthesizing speech in various languages.
- Accessibility: Provide auditory feedback for users who prefer listening to content over reading, improving overall agent accessibility.
Example Prompts
- "Convert my last message to an audio file using the Rachel voice and send it to me as a voice note on Telegram."
- "Transcribe this audio file located at /downloads/voice_note.ogg and summarize the key action items for me."
- "Say 'System update complete' using an authoritative voice model like Arnold."
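For the transcription prompt above, the underlying call might look roughly like the following sketch. It only builds the request and does not send it. The endpoint path, the form field names `file` and `model_id`, and the model name `scribe_v1` reflect the public ElevenLabs API as best understood here and should be checked against the current API reference; the function name `build_stt_request` is hypothetical and not taken from the skill.

```python
import urllib.request
import uuid

# Assumed endpoint; verify against the ElevenLabs API reference.
STT_URL = "https://api.elevenlabs.io/v1/speech-to-text"


def build_stt_request(audio_bytes, filename, api_key, model_id="scribe_v1"):
    """Build (but do not send) a multipart/form-data transcription request."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        # Plain form field carrying the transcription model name.
        f'--{boundary}\r\nContent-Disposition: form-data; name="model_id"'
        f'\r\n\r\n{model_id}\r\n'.encode(),
        # File field header, followed by the raw audio bytes.
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode(),
        audio_bytes,
        # Closing boundary.
        f'\r\n--{boundary}--\r\n'.encode(),
    ])
    return urllib.request.Request(
        STT_URL,
        data=body,
        headers={
            "xi-api-key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

Passing the request to `urllib.request.urlopen` would perform the upload; the agent could then summarize the text returned in the response.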
Tips & Limitations
To get the best results, experiment with the stability and similarity_boost settings. Lower stability values are better for emotional, expressive speech, while higher values give a more consistent, professional tone. Transcription quality via Scribe depends on audio clarity; background noise can reduce the accuracy of STT results. Monitor your ElevenLabs usage, as speech generation consumes characters/credits according to your subscription plan.
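To illustrate where stability and similarity_boost fit, here is a minimal sketch that builds a text-to-speech request without sending it. It assumes the public ElevenLabs REST endpoint and `xi-api-key` header; the function name `build_tts_request` and the default values are illustrative choices, not settings taken from the skill, and the voice ID must be a real ID from your account rather than a display name such as Rachel.

```python
import json
import urllib.request

# Assumed base URL; verify against the ElevenLabs API reference.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(text, voice_id, api_key, stability=0.5,
                      similarity_boost=0.75,
                      model_id="eleven_multilingual_v2"):
    """Build (but do not send) a text-to-speech request."""
    for name, value in (("stability", stability),
                        ("similarity_boost", similarity_boost)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    payload = {
        "text": text,
        "model_id": model_id,
        "voice_settings": {
            # Lower stability: more expressive; higher: more consistent.
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return urllib.request.Request(
        f"{API_BASE}/text-to-speech/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` should return the generated audio in the response body, and each call counts against your character quota.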
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-jeffpignataro-miranda-elevenlabs-speech": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: file-write, file-read, external-api