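MiniMax Multi-Character Dialogue Speech Synthesis (MiniMax 多人对话语音合成)
Generates a multi-character dialogue from the user's request, matches a voice to each character for speech synthesis, outputs both the complete long-form audio and the per-segment audio, and generates an HTML presentation page.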
Why use this skill?
Generate multi-character dialogue scripts and synthesize them into audio, with a distinct voice matched to each character and the result exported as a web page.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/hexiaochun/sutui-minimax-tts
What This Skill Does
The MiniMax multi-character dialogue speech synthesis skill (MiniMax 多人对话语音合成) is an AI agent module that turns text-based creative writing into finished audio productions. It combines natural language generation with speech synthesis models to handle multi-character dialogue creation end to end: it structures the narrative, analyzes character profiles to assign a distinct voice ID to each character, synthesizes speech with expressive emotional markers, and generates a ready-to-use HTML presentation.
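The last step of that pipeline, the HTML presentation, can be pictured with a minimal sketch. This is not the skill's actual page template: the function name `build_dialogue_page`, the `(speaker, text, audio_path)` tuple layout, and the file names are all assumptions made for illustration. It only shows one plausible way to list the full audio and each segment with a player.

```python
from html import escape


def build_dialogue_page(title, segments, full_audio="full_dialogue.mp3"):
    """Return an HTML page with one player for the full audio and one per segment.

    `segments` is a list of (speaker, text, audio_path) tuples.
    The tuple layout and the default file name are hypothetical.
    """
    rows = []
    for speaker, text, audio_path in segments:
        # Escape every value so dialogue text cannot break the markup.
        rows.append(
            "<li><strong>{}</strong>: {}<br>"
            '<audio controls src="{}"></audio></li>'.format(
                escape(speaker), escape(text), escape(audio_path, quote=True)
            )
        )
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"><title>{t}</title></head>\n'
        "<body><h1>{t}</h1>\n"
        '<audio controls src="{full}"></audio>\n'
        "<ol>\n{rows}\n</ol>\n</body></html>\n"
    ).format(
        t=escape(title),
        full=escape(full_audio, quote=True),
        rows="\n".join(rows),
    )
```

Calling `build_dialogue_page("Demo", [("Host", "Welcome!", "seg_001.mp3")])` returns a string that can be written to an `.html` file next to the audio files.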
Installation
To install this skill, use the OpenClaw CLI in your terminal:
clawhub install openclaw/skills/skills/hexiaochun/sutui-minimax-tts
Make sure pydub and FFmpeg are installed on your system. The automation scripts depend on them to merge the audio segments.
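Before running the skill, a quick check along these lines can confirm that both dependencies are visible. This is a sketch, not part of the skill: the helper name `check_audio_dependencies` is hypothetical, and it assumes FFmpeg is expected on `PATH` under the executable name `ffmpeg`.

```python
import importlib.util
import shutil


def check_audio_dependencies():
    """Report whether the pydub package and the ffmpeg executable can be found."""
    return {
        # find_spec returns None when the package is not importable.
        "pydub": importlib.util.find_spec("pydub") is not None,
        # shutil.which returns None when the executable is not on PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, found in check_audio_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If either entry is reported as missing, install it with your usual package manager before starting a synthesis run.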
Use Cases
This skill suits content creators, developers, and educators. Typical uses include:
- Educational Content: Create interactive audio-based learning scripts or historical reenactments.
- Audio Dramas & Podcasts: Rapidly prototype or produce multi-character scripted audio content with varied emotional depth.
- Marketing & User Testing: Generate lifelike user personas for conversational UI testing or immersive brand storytelling.
- Accessibility: Convert long-form text or documents into structured, multi-voice audiobooks that are easier to consume.
Example Prompts
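- "Generate a dialogue in which a programmer and a product manager argue over a requirements document at Starbucks. The two should have sharply contrasting styles. Finish by generating the page with the synthesized audio."
- "Create a relaxed, humorous interview. The host is a woman and the guest is a veteran practitioner of traditional Chinese medicine. The dialogue should be rich in emotional expression, and the result should be presented as an HTML web page."
- "Write a script for a serious business negotiation between two seasoned business professionals, spoken at a medium pace, and handle the corresponding audio synthesis for me."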
Tips & Limitations
- Quality Control: Always review the suggested voice mappings after the tool call to ensure the tone matches your character archetype.
- Text Preprocessing: The skill supports emotional markup such as breaths and laughter. Write input dialogue that leaves room for these markers, since they make the output noticeably more realistic.
- Limitations: The generation speed depends on the total word count of the script and server availability. Extremely long scripts (over 2000 words) should be broken into multiple chapters for best results. Always verify the output directory (dialogue_output/) after the process finishes to ensure all segments have been rendered correctly.
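The two limitation tips, splitting long scripts and checking the output directory, can be sketched as follows. Everything here is an assumption for illustration: the helper names, the way length is measured (each CJK character or each whitespace-delimited run counts as one unit, since the source does not define a "word" for Chinese text), and the `.mp3`/`.wav` extensions used to recognize rendered segments.

```python
import re
from pathlib import Path

# One token per CJK character, or per run of other non-space characters.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[^\s\u4e00-\u9fff]+")


def count_units(text):
    """Rough length measure; an approximation, not the skill's own counter."""
    return len(_TOKEN.findall(text))


def split_into_chapters(lines, limit=2000):
    """Group dialogue lines into chapters whose total length stays within `limit`.

    Lines are never split; a single line longer than `limit` gets its own chapter.
    """
    chapters, current, size = [], [], 0
    for line in lines:
        n = count_units(line)
        if current and size + n > limit:
            chapters.append(current)
            current, size = [], 0
        current.append(line)
        size += n
    if current:
        chapters.append(current)
    return chapters


def list_rendered_segments(output_dir="dialogue_output"):
    """Return the audio file names found in the output directory, sorted."""
    folder = Path(output_dir)
    if not folder.is_dir():
        return []
    return sorted(
        p.name for p in folder.iterdir() if p.suffix.lower() in {".mp3", ".wav"}
    )
```

Comparing the length of `list_rendered_segments()` with the number of dialogue lines submitted is a simple way to spot segments that failed to render.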
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-hexiaochun-sutui-minimax-tts": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags
Flags: file-write, file-read, external-api, code-execution
Related Skills
narrator-ai-cli
Create AI-narrated film/drama commentary videos via CLI. Two workflow paths (Original & Adapted narration), 100+ movies, 146 BGM tracks, 63 dubbing voices in 11 languages, 90+ narration templates. Use when creating narration videos, film commentary, short drama dubbing, or video production.
podcast-agent
Search articles on any topic, generate a two-host dialogue script, and synthesize podcast audio via TTS. Turn long reads into listenable content.
video-producer
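One-click short-video generation skill, v2.2. Calls video-director to plan the shots, then generates AI assets, TTS voice-over, and the rendered video, and outputs a complete MP4.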
ressemble
Text-to-Speech and Speech-to-Text integration using Resemble AI HTTP API.
AB-Agents-Vision-MiniMax
👁️ Image analysis via MiniMax VL API. Describe images, extract text from screenshots, analyze photos. Requires MiniMax Token Plan API key (free tier available).