Official, Verified, media, Safety 5/5

voice-ai-tts

High-quality voice synthesis with 9 personas, 11 languages, and streaming, using the Voice.ai API.

Why use this skill?

Integrate Voice.ai into OpenClaw for professional text-to-speech. Features 9 personas, 11 languages, and real-time streaming audio capabilities.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/gizmogremlin/openclaw-skill-voice-ai-voices

What This Skill Does

The voice-ai-tts skill adds text-to-speech synthesis to OpenClaw, powered by the Voice.ai API. It converts text into speech directly from the terminal or the OpenClaw environment, with a choice of 9 personas and 11 languages. Two output modes are supported: standard file-based synthesis and real-time streaming, which reduces latency when generating longer passages of text. Because the skill is pre-integrated into the OpenClaw framework, you drive it with simple chat-based commands rather than handling the API or external audio tools yourself.

Installation

Installation requires no external npm dependencies and little configuration. The skill bundles its own Node.js SDK and CLI tools, so it is ready to use as soon as it is installed from the ClawHub repository. The only setup step is to set one environment variable, VOICE_AI_API_KEY, to a key obtained from the official Voice.ai dashboard. Once the key is configured, the skill registers its commands with OpenClaw automatically, making /tts and /voices available for immediate use.
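
When the skill is driven from an automated script, it can help to confirm first that the key is visible to the process. The following is a minimal Python sketch, not part of the skill itself. The helper name voice_ai_key_status is hypothetical; only the variable name VOICE_AI_API_KEY comes from the description above.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "VOICE_AI_API_KEY"  # variable name taken from the skill description


def voice_ai_key_status(env: Optional[Mapping[str, str]] = None) -> str:
    """Return a short status message for the API key (hypothetical helper).

    The key's value is never returned or printed, only its length.
    """
    if env is None:
        env = os.environ
    value = env.get(ENV_VAR, "")
    if not value.strip():
        return f"{ENV_VAR} is not set; export it before using /tts or /voices."
    return f"{ENV_VAR} is set ({len(value)} characters)."


if __name__ == "__main__":
    print(voice_ai_key_status())
```

Passing the environment as a parameter keeps the check easy to test without touching the real process environment.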

Use Cases

This skill suits developers, content creators, and accessibility-focused users. Use it to generate voice-overs for video projects, add spoken feedback to automated scripts, or give conversational AI agents a human-like voice. Streaming makes it particularly effective for real-time applications that need audio to start quickly, such as reading back search results or providing summaries during a long-running task.

Example Prompts

  1. "/tts --voice ellie Good evening, the system update is complete and all services are running normally."
  2. "/tts --stream This is an experiment to see how quickly the audio can start playing while the remainder of the long text is being processed by the backend."
  3. "/voices"
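
The prompts above share one shape: the command, optional flags, then the text to speak. A script that composes such prompts could build them as sketched below in Python. The function name build_tts_prompt is hypothetical, and only the --voice and --stream flags shown in the examples are used. Whether both flags can be combined in a single prompt is not stated on this page, so treat the combined form as an assumption.

```python
from typing import Optional


def build_tts_prompt(text: str, voice: Optional[str] = None, stream: bool = False) -> str:
    """Compose a /tts prompt string in the shape of the examples (hypothetical helper)."""
    # Collapse newlines and repeated whitespace so the prompt stays on one line.
    text = " ".join(text.split())
    if not text:
        raise ValueError("text must not be empty")

    parts = ["/tts"]
    if voice:
        parts += ["--voice", voice]  # e.g. "ellie", as in the first example
    if stream:
        parts.append("--stream")     # assumption: may be combined with --voice
    parts.append(text)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_tts_prompt("Good evening, the system update is complete.", voice="ellie"))
```

The helper only assembles the string; sending it to OpenClaw is left to whatever chat or CLI interface you already use.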

Metadata

Stars: 2387
Views: 0
Updated: 2026-03-09

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-gizmogremlin-openclaw-skill-voice-ai-voices": {
      "enabled": true,
      "auto_update": true
    }
  }
}
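
If clawhub.json already lists other plugins, pasting the snippet over the file would drop them. A merge along the following lines avoids that. This is a Python sketch under the assumption that the file is plain JSON with a top-level "plugins" object, as the snippet above suggests. The function name enable_voice_plugin and the "some-other-plugin" entry are hypothetical.

```python
import json
from copy import deepcopy

# Plugin key copied from the configuration snippet above.
PLUGIN_ID = "official-gizmogremlin-openclaw-skill-voice-ai-voices"


def enable_voice_plugin(config: dict) -> dict:
    """Return a copy of config with the voice plugin enabled (hypothetical helper).

    Existing plugins and any extra settings on this plugin are preserved.
    """
    merged = deepcopy(config)
    plugins = merged.setdefault("plugins", {})
    entry = plugins.setdefault(PLUGIN_ID, {})
    entry.update({"enabled": True, "auto_update": True})
    return merged


if __name__ == "__main__":
    existing = {"plugins": {"some-other-plugin": {"enabled": True}}}
    print(json.dumps(enable_voice_plugin(existing), indent=2))
```

Working on a deep copy leaves the loaded configuration untouched until you decide to write the merged result back to disk.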

Tags

#tts #voice #speech #voice-ai #audio #streaming #multilingual
Safety Score: 5/5

Flags: network-access, file-read, file-write, external-api