Official Verified media Safety 4/5

zai-tts

Text-to-speech conversion through the GLM-TTS service, using the `uvx zai-tts` command to generate audio from text. Use when (1) the user requests audio or voice output with the "tts" trigger or keyword; (2) content needs to be spoken rather than read (multitasking, accessibility, podcasts, driving, cooking); or (3) speech should use a pre-cloned voice.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/al-one/zai-tts

What This Skill Does

The zai-tts skill lets OpenClaw agents convert text into natural-sounding audio files using the GLM-TTS service. Agents can go beyond text-only interaction to give audible responses, narrate content, and communicate accessibly. The skill supports adjustable speaking speed and volume, along with a choice of pre-cloned or system-default voices.

Installation

To integrate zai-tts into your OpenClaw environment, run the following command in your terminal:

clawhub install openclaw/skills/skills/al-one/zai-tts

Before using the skill, configure your authentication credentials. Obtain your ZAI_AUDIO_USERID and ZAI_AUDIO_TOKEN by logging into audio.z.ai, opening your browser's developer tools (F12), and inspecting the localStorage['auth-storage'] value. Export both as environment variables so the skill can authorize its service requests.
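
The credential setup described above could look like the following in a POSIX shell. This is a minimal sketch: the placeholder values stand in for whatever you copy from localStorage['auth-storage'], and check_zai_credentials is a hypothetical helper written for illustration, not part of zai-tts.

```shell
# Placeholder values (assumptions): replace with the ones from localStorage['auth-storage'].
export ZAI_AUDIO_USERID="your-user-id"
export ZAI_AUDIO_TOKEN="your-token"

# Hypothetical helper, not part of zai-tts: reports each credential
# variable that is unset or empty and returns non-zero if any is missing.
check_zai_credentials() {
  missing=0
  for name in ZAI_AUDIO_USERID ZAI_AUDIO_TOKEN; do
    # Indirect lookup of the variable named in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

check_zai_credentials && echo "credentials present"
```

Adding the two export lines to a shell profile keeps the variables available in new terminal sessions; the check only confirms that they are set, not that the token is still valid.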

Use Cases

This skill suits scenarios that call for spoken output. Use it when users explicitly request voice output, or when accessibility needs call for text to be read aloud. It also fits multitasking workflows, such as generating podcasts from long-form articles, creating voiceovers for presentations, or providing spoken instructions for hands-busy activities like cooking or driving. Converting text to speech lets your agent deliver information in a more accessible format.

Example Prompts

  1. "Convert this article into a podcast episode and save it as episode_01.wav, using the Chloe voice for a professional tone."
  2. "Read the summary of the meeting notes out loud to me, but increase the speaking speed to 1.5 so I can listen quickly."
  3. "Create an audio guide for this text file, but use the Ethan voice and set the volume to 2 for better clarity in noisy environments."

Tips & Limitations

Make sure your authentication token is valid; if the tool fails, log into audio.z.ai again and refresh the values from localStorage. Run `uvx zai-tts -l` periodically to list the available voices, since newly cloned voices appear there once they have been processed on the web platform. High-volume processing may consume significant system resources or API usage limits, so batch your text requests where possible. The skill depends on network connectivity to the GLM-TTS service, so make sure your firewall allows outgoing requests to the audio.z.ai endpoints.
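
The voice-listing tip above can be wrapped in a small preflight check. The sketch below uses only the `uvx zai-tts -l` command mentioned in this page; list_zai_voices is a hypothetical wrapper name, and the checks it performs (uvx on PATH, both credential variables set) are assumptions about what is worth verifying first.

```shell
# Hypothetical wrapper, not part of zai-tts: lists voices only when
# uvx is installed and both credential variables are set.
list_zai_voices() {
  if ! command -v uvx >/dev/null 2>&1; then
    echo "uvx not found on PATH" >&2
    return 127
  fi
  if [ -z "${ZAI_AUDIO_USERID:-}" ] || [ -z "${ZAI_AUDIO_TOKEN:-}" ]; then
    echo "ZAI_AUDIO_USERID and ZAI_AUDIO_TOKEN must be set" >&2
    return 1
  fi
  # The only zai-tts flag taken from this page: -l lists available voices.
  uvx zai-tts -l
}
```

Calling list_zai_voices after cloning a new voice on the web platform shows whether it has finished processing. If the call fails even though both checks pass, an expired token or a blocked connection to audio.z.ai is a likely cause.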

Metadata

Author: @al-one
Stars: 4473
Views: 0
Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-al-one-zai-tts": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags (AI)

#tts #audio #speech #accessibility #narration
Safety Score: 4/5

Flags: file-write, file-read, external-api