Official Verified media Safety 5/5

voice-ai-tts

High-quality voice synthesis with 9 personas, 11 languages, and streaming using Voice.ai API.

Why use this skill?

Integrate Voice.ai into OpenClaw for high-quality, multilingual text-to-speech synthesis with 9 personas and real-time streaming capabilities.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/gizmogremlin/voice-ai-voices
What This Skill Does

The voice-ai-tts skill connects the OpenClaw agent ecosystem to the Voice.ai text-to-speech engine. It synthesizes natural-sounding speech from text using a choice of 9 voice personas and 11 languages. Two modes are available: standard file-based synthesis for offline use, and a real-time streaming mode that delivers audio chunks as they are generated, which reduces latency in conversational interactions. Because it uses the official Voice.ai API, the output is suitable for creative projects, accessibility tools, and automated notifications.

Installation

The skill installs through the OpenClaw ecosystem. Because it bundles its own Node.js SDK and CLI dependencies, no manual npm install is required. Run the following command in your terminal to install the skill in your environment:

clawhub install openclaw/skills/skills/gizmogremlin/voice-ai-voices

Once installed, obtain an API key from the Voice.ai dashboard and export it as an environment variable (for example, in your shell configuration file) so the skill can authorize requests:

export VOICE_AI_API_KEY="your-api-key-here"
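
Before invoking the skill, it can help to confirm that the variable is actually visible to the current shell. The following is a minimal sketch, not part of the skill itself: check_voice_ai_key is a hypothetical helper name, and the only thing taken from this page is the variable name VOICE_AI_API_KEY. It deliberately never prints the key.

```shell
# Hypothetical preflight helper (an assumption, not shipped with the skill).
# Returns 0 if VOICE_AI_API_KEY is set and non-empty, 1 otherwise.
check_voice_ai_key() {
  if [ -z "${VOICE_AI_API_KEY:-}" ]; then
    # Report the problem on stderr without echoing any secret value.
    echo "VOICE_AI_API_KEY is not set" >&2
    return 1
  fi
  echo "VOICE_AI_API_KEY is set"
}
```

A typical use would be running check_voice_ai_key in a new terminal session after editing your shell configuration; a non-zero exit status suggests the export line was not loaded.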

Use Cases

This skill fits several professional and creative workflows. Use it to generate natural-sounding voice-overs for video content or podcasts directly through your agent interface. It can improve application accessibility by converting documentation or status updates into audio. Developers can use the streaming mode for real-time interactive avatars or AI assistants that need low-latency audio feedback. It also works as a prototyping tool for game developers who need placeholder voice lines generated on the fly.

Example Prompts

  1. "/tts --voice ellie Could you please read the meeting notes aloud for me?"
  2. "/tts --stream Welcome to our demo; I am using the streaming feature to generate this audio in real-time."
  3. "/voices"

Tips & Limitations

To get the most out of voice-ai-tts, experiment with the available voice personas to match the tone of your content; some are optimized for clarity, while others focus on emotional range. For long-form text, use the --stream flag to reduce the wait before audio playback begins. This skill requires an active internet connection to reach the Voice.ai servers, and generation speed may vary with network latency. Because the skill writes to the local file system, make sure your directory permissions allow creating new .mp3 files.
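
The file-permission point above can be checked ahead of time. This is a minimal sketch under stated assumptions: can_write_audio is a hypothetical helper, and the page does not say which directory the skill writes to, so the target directory is passed in as an argument (defaulting to the current one). It creates and removes a small probe file rather than an actual .mp3.

```shell
# Hypothetical helper (an assumption, not part of the skill).
# Succeeds if a new file can be created in the given directory.
can_write_audio() {
  dir="${1:-.}"
  probe="$dir/.tts-write-test.$$"
  # Run the redirection in a subshell so a failure cannot abort a POSIX sh.
  if ( : > "$probe" ) 2>/dev/null; then
    rm -f "$probe"
    return 0
  fi
  echo "cannot create files in $dir" >&2
  return 1
}
```

For example, can_write_audio "$PWD" returns a zero exit status when the current directory accepts new files, and a non-zero status with a short message on stderr when it does not.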

Metadata

Stars: 2387
Views: 5
Updated: 2026-03-09
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-gizmogremlin-voice-ai-voices": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags

#tts #voice #speech #voice-ai #audio #streaming #multilingual
Safety Score: 5/5

Flags: network-access, file-write, file-read, external-api

Related Skills

narrator-ai-cli

Create AI-narrated film/drama commentary videos via CLI. Two workflow paths (Original & Adapted narration), 100+ movies, 146 BGM tracks, 63 dubbing voices in 11 languages, 90+ narration templates. Use when creating narration videos, film commentary, short drama dubbing, or video production.

4myhime 4473

narrator-ai-cli

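AI film-commentary video generation skill (AI解说大师 CLI Skill). Triggered when the user needs to create movie commentary videos, short-drama commentary, derivative film and TV edits, AI-dubbed narration videos, film commentary, video narration, drama dubbing, or movie narration. Includes 93 built-in movie sources, 146 BGM tracks, 63 dubbing voices (11 languages), and 90+ narration templates. The narrator-ai-cli command-line tool automates the full workflow: search and pick a film → choose a template → choose BGM → choose a voice → generate the script → render the video. CLI client for Narrator AI (AI解说大师) video narration API. Use when user needs to create AI narration videos, manage narration tasks, browse dubbing/BGM/material resources, or automate video production.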

4myhime 4473

podcast-agent

Search articles on any topic, generate a two-host dialogue script, and synthesize podcast audio via TTS. Turn long reads into listenable content.

besty0121 4473

ym-mediatoolkit

Streaming video processing toolkit: compression, cover extraction, and audio conversion, without downloading the full video.

370299455cx-web 4473

video-producer

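One-click short-video generation skill, v2.2. Calls video-director for shot planning, then generates AI assets, TTS voice-over, and video rendering, and outputs a complete MP4.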

a1024708231 4473