lh-edge-tts
Text-to-speech conversion using Python edge-tts for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.
Why use this skill?
Convert text to high-quality neural speech with the OpenClaw lh-edge-tts skill. Supports multiple languages, custom pitch, speed, and subtitle generation.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/liuhedev/lh-edge-tts
What This Skill Does
The lh-edge-tts skill uses Microsoft Edge's neural text-to-speech service to convert text into natural-sounding audio. It gives OpenClaw users a programmatic interface for generating audio output. Because it is built on the edge-tts Python library, it supports a large set of neural voices across multiple languages, control over speech rate, pitch, and volume, and generation of synchronized subtitle files (SRT or VTT). The skill connects text-based AI processing to spoken output.
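As a rough illustration of what a call into the underlying edge-tts library can look like, here is a minimal sketch. It is not the skill's actual implementation: the helper names (format_percent, format_pitch, synthesize) are hypothetical, and it assumes a reasonably recent edge-tts release in which Communicate accepts rate, volume, and pitch as signed strings such as "+20%" and "+0Hz".

```python
def format_percent(delta: int) -> str:
    """Format an integer as the signed percent string edge-tts expects, e.g. '+20%'."""
    return f"{delta:+d}%"


def format_pitch(delta_hz: int) -> str:
    """Format an integer as a signed pitch offset in hertz, e.g. '-5Hz'."""
    return f"{delta_hz:+d}Hz"


async def synthesize(text: str, out_path: str, voice: str = "en-US-AriaNeural",
                     rate: int = 0, volume: int = 0, pitch: int = 0) -> None:
    """Send text to the edge-tts service and save the returned audio (hypothetical helper)."""
    # Imported lazily so the formatting helpers above work without the package installed.
    import edge_tts

    communicate = edge_tts.Communicate(
        text,
        voice,
        rate=format_percent(rate),
        volume=format_percent(volume),
        pitch=format_pitch(pitch),
    )
    # Requires network access: the audio is produced by the remote service.
    await communicate.save(out_path)
```

With edge-tts installed and a network connection available, something like asyncio.run(synthesize("Hello from edge-tts.", "hello.mp3", rate=-10)) would be expected to write a slightly slowed-down MP3 file.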
Installation
To integrate this skill into your agent, use the OpenClaw command-line interface. Execute the following command in your terminal:
clawhub install openclaw/skills/skills/liuhedev/lh-edge-tts
Ensure that Python 3 is installed and that the dependencies listed in the source repository are met. Once installed, the agent recognizes the 'tts' trigger and processes matching requests automatically.
Use Cases
This skill is ideal for several scenarios:
- Accessibility: Converting long articles, documents, or chat responses into audio for visually impaired users or those who prefer auditory learning.
- Multitasking: Enabling users to consume AI-generated information while driving, cooking, or exercising without needing to look at a screen.
- Content Creation: Generating voiceovers for video projects or presentations by outputting high-quality audio files alongside subtitle files.
- Language Learning: Using natural-sounding neural voices to practice listening comprehension in various languages.
Example Prompts
- "tts Read the latest technical documentation summary to me using the English Aria voice at a slightly slower speed."
- "tts Convert this story into an audio file using the Chinese Yunyang voice and save it to my downloads folder."
- "tts Please read back the summary of the meeting notes, but use a faster speed so I can review it quickly while I drive."
Tips & Limitations
- Rate Tuning: Use the percentage-based syntax (e.g., +20%) to adjust speed. Avoid going beyond +50%, as clarity may degrade.
- Voice Selection: Always use the --list-voices command to see the latest available neural models, as Microsoft updates their voice library periodically.
- Network Dependence: This skill requires an active internet connection to communicate with the edge-tts service endpoints; offline usage is not currently supported.
- Performance: While latency is generally low, very long text inputs should be processed in segments to ensure stability.
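The tips above can be sketched as a few small helpers. This is an illustrative sketch under stated assumptions, not part of the skill: the function names, the 1000-character default, and the symmetric clamp (which also limits slow-downs to -50%) are assumptions, and the voice dictionaries are assumed to carry "Locale" and "ShortName" keys as returned by edge-tts's list_voices().

```python
import re
from typing import Dict, List


def clamp_rate(delta: int, limit: int = 50) -> str:
    """Clamp a rate change to +/-limit percent and format it, e.g. 80 -> '+50%'."""
    clamped = max(-limit, min(limit, delta))
    return f"{clamped:+d}%"


def voices_for_locale(voices: List[Dict[str, str]], locale: str) -> List[str]:
    """Return the short names of voices for one locale.

    `voices` is assumed to look like the result of `await edge_tts.list_voices()`.
    """
    return [v["ShortName"] for v in voices if v.get("Locale") == locale]


def split_text(text: str, max_chars: int = 1000) -> List[str]:
    """Naively split text into segments of at most max_chars at sentence ends.

    A single sentence longer than max_chars is kept whole rather than cut mid-sentence.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    segments: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current segment.
            segments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments
```

Each segment returned by split_text could then be synthesized separately and the resulting audio files concatenated. The sentence splitter only recognizes Latin punctuation followed by whitespace, so text in other scripts would need a different boundary rule.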
Metadata
Paste this into your clawhub.json to enable this plugin.
{
"plugins": {
"official-liuhedev-lh-edge-tts": {
"enabled": true,
"auto_update": true
}
}
}
Tags(AI)
Flags: file-write, file-read, external-api