Official | Verified | media | Safety 4/5

elevenlabs-speech

Text-to-Speech and Speech-to-Text using ElevenLabs AI. Use when the user wants to convert text to speech, transcribe voice messages, or work with voice in multiple languages. Supports high-quality AI voices and accurate transcription.

Why use this skill?

Integrate ElevenLabs TTS and STT into OpenClaw. Convert text to natural voice, transcribe audio messages, and add professional speech capabilities to your AI agent.

What This Skill Does

The elevenlabs-speech skill adds voice processing to the OpenClaw AI agent through the ElevenLabs API. It covers both Text-to-Speech (TTS) for natural voice generation and Speech-to-Text (STT), via Scribe, for audio transcription. Whether you need to voice your agent's responses for Telegram or transcribe incoming voice notes from a team, the skill exposes controls for voice identity, stability, and language support.

Installation

To integrate this capability into your environment, run the following command in your terminal:

clawhub install openclaw/skills/skills/jeffpignataro/miranda-elevenlabs-speech

Have your ElevenLabs API key ready. Export it as the environment variable ELEVENLABS_API_KEY or add it to your .env file so the client can authenticate with the service automatically.
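
As a rough illustration of the key lookup described above, the sketch below checks the environment first and then falls back to a .env file. The helper name load_api_key, the precedence order, and the simple KEY=value parsing are assumptions made for this example; the skill's own lookup may work differently.

```python
import os
from pathlib import Path


def load_api_key(env_file=".env", name="ELEVENLABS_API_KEY"):
    """Return the API key from the environment, falling back to a .env file.

    Hypothetical helper: the skill's real lookup logic may differ.
    """
    value = os.environ.get(name)
    if value:
        return value

    path = Path(env_file)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blanks, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, raw = line.partition("=")
            if key.strip() == name:
                # Drop optional surrounding quotes around the value.
                return raw.strip().strip("'\"")

    raise RuntimeError(f"{name} is not set in the environment or in {env_file}")
```

Failing loudly when the key is missing tends to be easier to debug than an authentication error returned later by the service.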

Use Cases

  • Automated Communication: Convert agent text responses into natural-sounding voice files for seamless integration with messaging platforms like Telegram.
  • Voice Note Transcription: Automatically parse voice messages sent by users or team members, turning audio input into actionable text for the AI agent.
  • Multilingual Support: Utilize the eleven_multilingual_v2 model to bridge communication barriers by synthesizing speech in various languages.
  • Accessibility: Provide auditory feedback for users who prefer listening to content over reading, improving overall agent accessibility.

Example Prompts

  1. "Convert my last message to an audio file using the Rachel voice and send it to me as a voice note on Telegram."
  2. "Transcribe this audio file located at /downloads/voice_note.ogg and summarize the key action items for me."
  3. "Say 'System update complete' using an authoritative voice model like Arnold."

Tips & Limitations

To get the best results, experiment with the stability and similarity_boost settings. Lower stability values suit emotional, expressive speech, while higher values give a consistent, professional tone. Transcription quality via Scribe depends on audio clarity; background noise can reduce the accuracy of STT results. Monitor your ElevenLabs usage, as speech generation consumes characters/credits according to your subscription plan.
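
To make the stability and similarity_boost settings concrete, here is a minimal sketch that builds (but does not send) a text-to-speech request. It assumes the public ElevenLabs REST endpoint and the xi-api-key header; the function name build_tts_request, the default values, and the 0.0 to 1.0 range check are illustrative choices, and the skill itself may call the service differently.

```python
import json
import urllib.request

# Assumption: the public ElevenLabs REST base URL.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(text, voice_id, api_key, stability=0.5,
                      similarity_boost=0.75,
                      model_id="eleven_multilingual_v2"):
    """Build a text-to-speech request without sending it (hypothetical helper)."""
    for label, value in (("stability", stability),
                         ("similarity_boost", similarity_boost)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{label} must be between 0.0 and 1.0, got {value}")

    payload = {
        "text": text,
        "model_id": model_id,
        "voice_settings": {
            # Lower stability: more expressive; higher: more consistent.
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return urllib.request.Request(
        f"{API_BASE}/text-to-speech/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it and consume characters from your plan, so it can be worth inspecting the payload first. The voice_id argument is the identifier of a voice in your ElevenLabs account, not its display name.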

Metadata

Stars: 1947
Views: 1
Updated: 2026-03-04

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-jeffpignataro-miranda-elevenlabs-speech": {
      "enabled": true,
      "auto_update": true
    }
  }
}
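
If clawhub.json already contains other plugins, pasting by hand risks overwriting them. The sketch below merges the entry shown above into an existing file instead. The helper name enable_plugin and the assumption that clawhub.json sits in the current directory are illustrative; only the plugin key and its two fields come from the snippet above.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-jeffpignataro-miranda-elevenlabs-speech"


def enable_plugin(config_path="clawhub.json"):
    """Add the plugin entry to clawhub.json, keeping existing settings.

    Hypothetical helper: creates the file if it does not exist yet.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.is_file() else {}
    # Reuse the existing "plugins" object so other entries are preserved.
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```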

Tags (AI)

#tts #speech-to-text #audio #elevenlabs #voice
Safety Score: 4/5

Flags: file-write, file-read, external-api