voice-tts
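Generates high-quality Chinese voice messages with edge-tts and sends them. Use it when the user asks for a voice message, a voice reply, TTS, text-to-speech, or a spoken announcement. It supports multiple Chinese voices (male, female, and dialect), offers adjustable speaking rate and pitch, and is suited to sending voice messages on channels such as Feishu, Telegram, and Discord.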
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/binbin1213/ms-voice-tts
What This Skill Does
The voice-tts skill integrates Microsoft's edge-tts engine into the OpenClaw ecosystem, enabling the agent to synthesize high-quality, human-like Chinese speech from text. This skill transforms the agent from a purely text-based interface into a multimodal assistant capable of delivering audio responses. Whether you are using Feishu, Telegram, or Discord, this skill processes text input to generate opus-format audio files, which are then transmitted as native voice messages. It supports a diverse range of voices, including various gender-specific neural voices and regional accents like Northeast Mandarin and Taiwanese Mandarin, providing a professional and localized communication experience.
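As a rough illustration of the synthesis step, the sketch below assembles an edge-tts command line for a chosen voice, rate, and pitch. It relies only on flags the edge-tts CLI provides (--voice, --text, --rate, --pitch, --write-media). The helper name build_tts_command, the default voice, and the output filename are assumptions made for this example; the skill's own tts.sh may work differently, including how it arrives at opus output.

```python
from typing import List


def build_tts_command(
    text: str,
    out_path: str,
    voice: str = "zh-CN-XiaoxiaoNeural",  # assumed default, not confirmed by the skill
    rate: str = "+0%",    # e.g. "+10%" speaks faster, "-10%" slower
    pitch: str = "+0Hz",  # e.g. "+5Hz" raises the pitch
) -> List[str]:
    """Return an argv list for the edge-tts CLI (hypothetical helper)."""
    if not text.strip():
        raise ValueError("text must not be empty")
    return [
        "edge-tts",
        "--voice", voice,
        "--text", text,
        # The '=' form keeps values that start with '-' from being read as options.
        f"--rate={rate}",
        f"--pitch={pitch}",
        "--write-media", out_path,
    ]
```

Passing the returned list to subprocess.run(cmd, check=True) would run the synthesis, provided edge-tts is installed and the machine is online.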
Installation
To use this skill, ensure you have the edge-tts Python library installed. We strongly recommend using pipx to manage the installation to keep your system Python environment clean and avoid dependency conflicts.
- Install via pipx install edge-tts (or pip install --user edge-tts for Linux systems where pipx is unavailable).
- Verify the installation by running edge-tts --list-voices in your terminal.
- Ensure the script directory is accessible by the OpenClaw agent to allow for the automated triggering of tts.sh.
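The verification step can also be scripted. The following is a minimal sketch that checks whether the edge-tts executable is on PATH before the agent tries to use it; the function names find_edge_tts and install_hint are hypothetical and not part of the skill.

```python
import shutil
from typing import Optional


def find_edge_tts() -> Optional[str]:
    """Return the full path of the edge-tts executable, or None if it is not on PATH."""
    return shutil.which("edge-tts")


def install_hint(path: Optional[str]) -> str:
    """Turn the lookup result into a short status message."""
    if path is None:
        return "edge-tts not found; try: pipx install edge-tts"
    return f"edge-tts found at {path}"


print(install_hint(find_edge_tts()))
```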
Use Cases
- Proactive Notifications: Automatically send voice reminders for meetings or project deadlines in team chat channels like Feishu.
- Customer Service Automation: Provide personalized, warm human-sounding voice responses in Telegram or Discord customer support threads.
- Accessibility Enhancements: Convert long-form text reports into audio briefs for users who prefer listening to content while on the move.
- Regional Localization: Use specific dialect voices (e.g., zh-TW-HsiaoChenNeural) to better engage with specific demographic user bases.
Example Prompts
- "Send a voice message to the team channel saying 'The project review meeting will start in 5 minutes, please be prepared.'"
- "Reply to this user on Telegram with an audio message: 'Thank you for your feedback, we have received your request and will address it shortly.'"
- "Use the Xiaoxiao voice to announce this text in Discord: 'Welcome everyone, today's focus is on optimizing our AI agent workflows.'"
Tips & Limitations
- Optimization: Keep your input text between 50 and 300 characters for the best balance of generation speed and audio quality.
- File Management: The skill writes its generated files to ~/.openclaw/media/. Clean up this directory regularly to prevent disk bloat if you generate high volumes of audio.
- Network Dependency: Although edge-tts runs locally, it requires an active internet connection to communicate with Microsoft's speech services during the generation phase.
- Latency: Generation typically takes 1-3 seconds. Plan your application flow to account for this short processing delay before sending the message.
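One way to stay within the suggested 300-character ceiling is to split longer text at sentence boundaries and synthesize each piece separately. The sketch below is an assumption about how that could be done, not something the skill is documented to do; chunk_text and its punctuation set are illustrative choices.

```python
import re
from typing import List

# A sentence is a run of text plus its trailing Chinese or Western end punctuation,
# or a trailing run of text with no end punctuation at all.
_SENTENCE = re.compile(r"[^。！？!?]*[。！？!?]+|[^。！？!?]+")


def chunk_text(text: str, max_chars: int = 300) -> List[str]:
    """Group whole sentences into chunks of at most max_chars characters."""
    if max_chars < 1:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    current = ""
    for sentence in _SENTENCE.findall(text.strip()):
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Start a new chunk when this sentence would overflow the current one.
        if len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own voice message, which keeps the per-message generation delay short.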
Metadata
Paste this into your clawhub.json to enable this plugin.
{
"plugins": {
"official-binbin1213-ms-voice-tts": {
"enabled": true,
"auto_update": true
}
}
}
Tags (AI)
Flags: file-write, file-read, external-api
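If you prefer to apply the clawhub.json snippet above programmatically, the sketch below merges the plugin entry into an existing file without disturbing other keys. The location of clawhub.json is not stated here, so the path is left as a parameter, and enable_plugin is a hypothetical helper rather than part of any clawhub tooling.

```python
import json
from pathlib import Path

# Plugin key and settings copied from the clawhub.json snippet.
PLUGIN_KEY = "official-binbin1213-ms-voice-tts"


def enable_plugin(config_path: Path) -> dict:
    """Add or overwrite the plugin entry in the given clawhub.json and return the config."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_KEY] = {"enabled": True, "auto_update": True}
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return config
```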
Related Skills
skill-reviewer
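A professional tool for reviewing the code quality of a Skill. Use it when the user says "check skill", "review skill", "review {name} skill", "how well is this skill written", or "help me see what's wrong with this skill". Following Anthropic's official guidelines, it validates structure, checks the YAML front matter, assesses description quality, reviews instruction completeness, and outputs a detailed issue report with improvement suggestions.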
daily-news-brief
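Aggregates and organizes news from multiple sources, sorts it into tech, finance, AI, and agent categories, generates a Markdown digest, and can run on a schedule. Use it when the user mentions "news", "today's news", "organize the news", "tech news", "finance news", "AI news", "agent news", or "aggregate news", or needs a scheduled news digest.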