Official Verified media Safety 4/5

lh-edge-tts

Text-to-speech conversion using Python edge-tts for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.

Why use this skill?

Convert text to high-quality neural speech with the OpenClaw lh-edge-tts skill. Supports multiple languages, custom pitch, speed, and subtitle generation.

What This Skill Does

The lh-edge-tts skill uses Microsoft Edge's neural text-to-speech service to convert text into natural-sounding audio, giving OpenClaw users a way to generate audio output programmatically. Because it builds directly on the edge-tts Python library, it supports neural voices across multiple languages; control over speech rate, pitch, and volume; and synchronized subtitle files (SRT or VTT). The skill lets a text-based agent deliver its output as speech.
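
As a rough illustration of the kind of call the skill presumably wraps, here is a minimal sketch that uses the edge-tts Python library directly. The function names `signed` and `synthesize` are hypothetical, the default voice is just an example, and nothing here is taken from the skill's own source. Subtitle generation is left out of the sketch.

```python
def signed(value: int, unit: str) -> str:
    """Format an offset with an explicit sign, e.g. +20%, -10%, +5Hz."""
    return f"{value:+d}{unit}"


async def synthesize(text: str, out_path: str, voice: str = "en-US-AriaNeural",
                     rate_pct: int = 0, pitch_hz: int = 0, volume_pct: int = 0) -> None:
    """Hypothetical wrapper: speak `text` with the given voice and save the audio."""
    # Imported lazily so the helper above works without the package installed.
    import edge_tts  # pip install edge-tts

    communicate = edge_tts.Communicate(
        text,
        voice,
        rate=signed(rate_pct, "%"),
        volume=signed(volume_pct, "%"),
        pitch=signed(pitch_hz, "Hz"),
    )
    # Sends the request to the online service and writes the audio to out_path.
    await communicate.save(out_path)


# Usage (requires network access):
#   import asyncio
#   asyncio.run(synthesize("Hello from edge-tts.", "hello.mp3", rate_pct=-10))
```

The rate, volume, and pitch arguments are signed strings, which is why the helper always emits a leading `+` or `-`.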

Installation

To integrate this skill into your agent, use the OpenClaw command-line interface. Execute the following command in your terminal:

clawhub install openclaw/skills/skills/liuhedev/lh-edge-tts

Make sure Python 3 is installed and that the dependencies listed in the source repository are met. Once installed, the agent recognizes the 'tts' trigger and processes matching requests automatically.

Use Cases

This skill is ideal for several scenarios:

  1. Accessibility: Converting long articles, documents, or chat responses into audio for visually impaired users or those who prefer auditory learning.
  2. Multitasking: Enabling users to consume AI-generated information while driving, cooking, or exercising without needing to look at a screen.
  3. Content Creation: Generating voiceovers for video projects or presentations by outputting high-quality audio files alongside subtitle files.
  4. Language Learning: Using natural-sounding neural voices to practice listening comprehension in various languages.

Example Prompts

  1. "tts Read the latest technical documentation summary to me using the English Aria voice at a slightly slower speed."
  2. "tts Convert this story into an audio file using the Chinese Yunyang voice and save it to my downloads folder."
  3. "tts Please read back the summary of the meeting notes, but use a faster speed so I can review it quickly while I drive."
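
The prompts above name voices informally ("English Aria", "Chinese Yunyang"), while edge-tts identifies voices by short names such as `en-US-AriaNeural`. The sketch below shows one way such a request could be matched against the voice list. `find_voices` and `fetch_voices` are hypothetical helpers, `SAMPLE` is a hand-written stand-in for the live list, and how the skill itself resolves voice names is not documented here.

```python
def find_voices(voices, locale_prefix, name_fragment=""):
    """Return ShortName values whose locale and name match, case-insensitively."""
    locale_prefix = locale_prefix.lower()
    name_fragment = name_fragment.lower()
    return sorted(
        v["ShortName"]
        for v in voices
        if v["Locale"].lower().startswith(locale_prefix)
        and name_fragment in v["ShortName"].lower()
    )


async def fetch_voices():
    """Fetch the live voice list (requires network access and edge-tts)."""
    import edge_tts  # pip install edge-tts
    return await edge_tts.list_voices()


# Hand-written sample in the same shape as the entries edge-tts returns.
SAMPLE = [
    {"ShortName": "en-US-AriaNeural", "Locale": "en-US", "Gender": "Female"},
    {"ShortName": "en-GB-SoniaNeural", "Locale": "en-GB", "Gender": "Female"},
    {"ShortName": "zh-CN-YunyangNeural", "Locale": "zh-CN", "Gender": "Male"},
]

print(find_voices(SAMPLE, "en", "aria"))  # ['en-US-AriaNeural']
print(find_voices(SAMPLE, "zh-CN"))       # ['zh-CN-YunyangNeural']
```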

Tips & Limitations

  • Rate Tuning: Use the percentage-based syntax (e.g., +20%) to adjust speed. Avoid going beyond +50%, as clarity may degrade.
  • Voice Selection: Use the --list-voices option to see the currently available neural voices, as Microsoft updates its voice library periodically.
  • Network Dependence: This skill requires an active internet connection to communicate with the edge-tts service endpoints; offline usage is not currently supported.
  • Performance: While latency is generally low, very long text inputs should be processed in segments to ensure stability.
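
Two of the tips above, keeping the rate within 50% and processing long text in segments, can be sketched as small helpers. Both function names are hypothetical, the 1000-character default is an arbitrary illustration rather than a documented limit, and applying the 50% bound to slow-downs as well as speed-ups is an assumption.

```python
import re
from typing import List


def clamp_rate(percent: int, limit: int = 50) -> str:
    """Bound a rate offset to +/-limit and format it as a signed percentage."""
    bounded = max(-limit, min(limit, percent))
    return f"{bounded:+d}%"


def split_text(text: str, max_chars: int = 1000) -> List[str]:
    """Group whole sentences into segments of at most max_chars characters.

    Sentences are split on '.', '!' or '?' followed by whitespace, so text
    without such punctuation (or a single very long sentence) stays whole.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    segments, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow; close the current segment.
            segments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments


print(clamp_rate(80))                                # +50%
print(split_text("One. Two. Three.", max_chars=9))   # ['One. Two.', 'Three.']
```

Each segment can then be synthesized separately and the resulting audio files played or joined in order.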

Metadata

Author: @liuhedev
Stars: 1601
Views: 1
Updated: 2026-02-27

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-liuhedev-lh-edge-tts": {
      "enabled": true,
      "auto_update": true
    }
  }
}
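
If clawhub.json already lists other plugins, pasting the snippet over the file would drop them. The sketch below merges the entry instead. `enable_plugin` and `update_config_file` are hypothetical helpers, and the assumption that clawhub.json sits in the current directory is not confirmed by the page.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-liuhedev-lh-edge-tts"


def enable_plugin(config: dict, plugin_id: str = PLUGIN_ID) -> dict:
    """Return a copy of config with the plugin entry added or updated."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[plugin_id] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


def update_config_file(path: str = "clawhub.json") -> None:
    """Read the config if it exists, merge the plugin entry, and write it back."""
    config_path = Path(path)
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    config_path.write_text(
        json.dumps(enable_plugin(config), indent=2) + "\n", encoding="utf-8"
    )
```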

Tags (AI)

#tts #audio #speech #accessibility #voiceover
Safety Score: 4/5

Flags: file-write, file-read, external-api