Official Verified media Safety 4/5

Deepdub TTS

Generate speech audio using Deepdub and attach it as a MEDIA file (Telegram-compatible).

Why use this skill?

Convert text to natural-sounding speech in OpenClaw with the Deepdub TTS skill. It is suited to Telegram audio integration and to accessibility use cases.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/yuval-deepdub/deepdub-tts

What This Skill Does

The Deepdub TTS skill lets OpenClaw turn plain text into natural-sounding speech. It calls the Deepdub API to generate an audio file and attaches the result as a MEDIA file, which keeps the output compatible with messaging platforms such as Telegram. Your agent can then reply with audio instead of text, whether it is answering in a conversation or producing content on demand.

Installation

To use this skill, make sure Python 3.9 or higher is installed. Install the dependency with the recommended package manager, uv: uv pip install deepdub. Then install the skill from the OpenClaw hub: clawhub install openclaw/skills/skills/yuval-deepdub/deepdub-tts.

Before launching, set the two required environment variables: DEEPDUB_API_KEY and DEEPDUB_VOICE_PROMPT_ID. Optionally, set DEEPDUB_LOCALE (defaults to en-US) and OPENCLAW_MEDIA_DIR if you want generated audio files stored in a custom location.
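The environment variables above can be validated before the agent starts. The following is a minimal sketch, not the skill's actual startup code: the DeepdubConfig class and load_config function are hypothetical names introduced for illustration, and only the four variable names and the en-US default come from this page.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

# Variable names as listed on this page.
REQUIRED_VARS = ("DEEPDUB_API_KEY", "DEEPDUB_VOICE_PROMPT_ID")


@dataclass(frozen=True)
class DeepdubConfig:
    """Hypothetical container for the skill's settings (illustration only)."""
    api_key: str
    voice_prompt_id: str
    locale: str
    media_dir: Optional[Path]


def load_config(env: Mapping[str, str] = os.environ) -> DeepdubConfig:
    """Read settings from the environment and fail early if a required one is unset."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    media_dir = env.get("OPENCLAW_MEDIA_DIR")
    return DeepdubConfig(
        api_key=env["DEEPDUB_API_KEY"],
        voice_prompt_id=env["DEEPDUB_VOICE_PROMPT_ID"],
        # The page documents en-US as the default locale.
        locale=env.get("DEEPDUB_LOCALE", "en-US"),
        # None means "no custom location was requested".
        media_dir=Path(media_dir) if media_dir else None,
    )
```

Calling load_config() with no arguments reads the real process environment; passing a plain dict makes the check easy to exercise in tests without touching real credentials.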

Use Cases

  • Automated Customer Support: Send personalized audio greetings or troubleshooting steps to users via Telegram.
  • Content Creation: Automatically convert research summaries or news articles into podcasts or audio clips.
  • Accessibility: Ensure your agent-based services remain inclusive by providing audio versions of text-based information for visually impaired users.
  • Interactive Storytelling: Build immersive roleplay scenarios where the agent speaks responses rather than simply typing them.

Example Prompts

  1. "Deepdub, please convert this message to audio: 'The system update is complete and all services are back online.'"
  2. "Read the following text using the default voice prompt: [Paste long text here]."
  3. "Send a voice note to the Telegram channel saying: 'Don't forget to review the project roadmap before our 3 PM meeting.'"

Tips & Limitations

  • Voice Quality: Output quality depends heavily on the voice prompt selected by DEEPDUB_VOICE_PROMPT_ID. Try different voice prompts to find the tone that best fits your agent's personality.
  • Cost: Be aware that frequent API calls to Deepdub may incur costs based on your subscription tier.
  • File Size: Audio files can become large; ensure your OPENCLAW_MEDIA_DIR has sufficient storage and implement cleanup scripts if your agent generates content at high volume.
  • Performance: Generation time depends on network latency to the Deepdub API. For real-time applications, consider pre-generating static responses.
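The cleanup suggested under File Size could look like the sketch below. It is an illustration under stated assumptions: prune_old_audio is a hypothetical helper, and the file extensions are guesses, since this page does not say which audio format the skill writes. Adjust both to match what actually appears in your OPENCLAW_MEDIA_DIR.

```python
import time
from pathlib import Path
from typing import List, Optional, Tuple


def prune_old_audio(
    media_dir: Path,
    max_age_seconds: float,
    now: Optional[float] = None,
    # Assumed extensions; the page does not specify the output format.
    suffixes: Tuple[str, ...] = (".mp3", ".wav", ".ogg"),
) -> List[Path]:
    """Delete audio files older than max_age_seconds and return the removed paths."""
    now = time.time() if now is None else now
    removed: List[Path] = []
    for path in media_dir.iterdir():
        # Skip subdirectories and anything that is not an audio file.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        # Age is measured from the file's last modification time.
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed
```

For example, prune_old_audio(Path(os.environ["OPENCLAW_MEDIA_DIR"]), 24 * 3600) run from a scheduled job would drop audio older than one day (with os imported). Filtering by extension keeps the helper from touching unrelated files that share the directory.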

Metadata

Stars: 879
Views: 0
Updated: 2026-02-11

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-yuval-deepdub-deepdub-tts": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags

#tts #deepdub #audio #telegram
Safety Score: 4/5

Flags: external-api, file-write, network-access