mm-easy-voice
Simple text-to-speech skill using MiniMax Voice API. Converts text to audio with customizable voice selection. Use for generating speech audio from text.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/blue-coconut/mm-easy-voice
What This Skill Does
The mm-easy-voice skill provides a robust interface for the MiniMax Voice API, enabling OpenClaw agents to perform high-quality text-to-speech (TTS) conversion. It transforms plain text inputs into natural, expressive audio files. The skill supports a wide range of voices, allows for custom pause insertion for natural cadence, and includes advanced features like voice cloning from samples and voice design based on descriptive prompts. It is designed for seamless integration into automation workflows where audio output is required, such as creating voice-overs, automated accessibility features, or dynamic media generation.
Installation
To install this skill, run the following command in your terminal:
clawhub install openclaw/skills/skills/blue-coconut/mm-easy-voice
After installing, verify your environment by running python check_environment.py, and set the MINIMAX_VOICE_API_KEY environment variable. Installing FFmpeg is recommended for advanced audio manipulation.
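The setup requirements above can be checked by hand with a few lines of Python. This is a minimal sketch, not the skill's own check_environment.py: the function names `find_setup_problems` and `ffmpeg_available` are hypothetical, and the only facts taken from this page are the variable name MINIMAX_VOICE_API_KEY and the FFmpeg recommendation.

```python
import os
import shutil


def find_setup_problems(env=None):
    """Return a list of setup problems; an empty list means the key is present.

    Hypothetical helper for illustration; the skill ships its own
    check_environment.py, whose behavior is not shown on this page.
    """
    env = os.environ if env is None else env
    problems = []
    # The API key must be set and non-blank.
    if not env.get("MINIMAX_VOICE_API_KEY", "").strip():
        problems.append("MINIMAX_VOICE_API_KEY is not set")
    return problems


def ffmpeg_available():
    """Report whether an ffmpeg executable is on PATH (recommended, not required)."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    for problem in find_setup_problems():
        print("Problem:", problem)
    if not ffmpeg_available():
        print("Note: ffmpeg not found; advanced audio manipulation may be unavailable")
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.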
Use Cases
- Content Creation: Generate voice-overs for video projects or social media content directly from text scripts.
- Accessibility: Convert written reports, articles, or documentation into high-quality audio files for screen readers or audio-only consumption.
- Virtual Assistants: Power the voice interface of custom agents that need to respond to user queries with human-like, emotional speech.
- Multimedia Automation: Automate the assembly of audiobooks or podcasts by concatenating multiple generated audio clips into a final file.
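For the multimedia automation case, one common way to join generated clips is FFmpeg's concat demuxer, which reads a list file with one `file '...'` line per clip. The sketch below only builds that list text and the command line. The helper names and the file names in the usage lines are hypothetical, and nothing on this page confirms that the skill performs concatenation this way.

```python
def concat_list_text(clips):
    """Build the contents of an FFmpeg concat-demuxer list file.

    Each clip becomes a `file '...'` line; a single quote inside a path
    is written as '\'' so the quoted string stays valid.
    """
    lines = []
    for clip in clips:
        escaped = str(clip).replace("'", "'\\''")
        lines.append(f"file '{escaped}'\n")
    return "".join(lines)


def concat_command(list_path, output_path):
    """Return the ffmpeg argument list that joins the clips named in list_path.

    `-c copy` avoids re-encoding, which assumes every clip shares the same
    codec and parameters; `-safe 0` permits absolute paths in the list file.
    """
    return [
        "ffmpeg", "-f", "concat", "-safe", "0",
        "-i", str(list_path), "-c", "copy", str(output_path),
    ]


if __name__ == "__main__":
    # Hypothetical clip names, for illustration only.
    print(concat_list_text(["chapter1.mp3", "chapter2.mp3"]), end="")
    print(" ".join(concat_command("clips.txt", "audiobook.mp3")))
```

To run the merge, write the list text to a file and pass the argument list to `subprocess.run`. Clips produced with different voices or settings may need re-encoding instead of `-c copy`.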
Example Prompts
- "Generate an audio file titled 'intro.mp3' using the 'female-shaonv' voice that says 'Welcome to the presentation, please hold for a moment' with a 2-second pause before the final word."
- "List all available voices so I can choose a professional tone for my corporate announcement."
- "Clone the voice from 'interview.mp3' and save it as 'custom-agent-voice' for future use."
Tips & Limitations
- Text Limits: Each request is limited to 10,000 characters. Break longer scripts into smaller segments for better performance.
- Custom Pauses: Improve natural flow by inserting tags like <#1.5#> to force specific pauses where appropriate.
- Voice Selection: Always check reference/voice_catalog.md to ensure you are selecting the voice best suited for your target audience and emotional context.
- Emotion: The speech-2.8 models automatically detect and apply emotional nuance, so ensure your written text clearly conveys the intended tone.
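The text-limit and custom-pause tips can be prepared for in plain Python before any request is sent. This is a sketch under stated assumptions: `split_text` and `pause_tag` are hypothetical helpers, not part of the skill. The 10,000-character limit and the <#1.5#> tag shape come from the tips above. Whether pause tags count toward the limit is not stated here, so the sketch counts them to stay on the safe side.

```python
import re

MAX_CHARS = 10_000  # per-request limit stated in the tips above


def pause_tag(seconds):
    """Format a pause marker in the <#1.5#> style shown in the tips."""
    if seconds <= 0:
        raise ValueError("pause length must be positive")
    return f"<#{seconds:g}#>"


def split_text(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters.

    Sentences are kept whole where possible; a single sentence longer
    than the limit is cut at the limit as a fallback.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Fallback: hard-cut any sentence that cannot fit in one chunk.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    script = f"Welcome to the presentation, please hold for a {pause_tag(2)} moment."
    for index, chunk in enumerate(split_text(script), start=1):
        print(index, len(chunk), chunk)
```

Each returned chunk can then be sent as its own request, and the resulting clips joined afterwards.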
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-blue-coconut-mm-easy-voice": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: file-write, file-read, external-api, code-execution
Related Skills
mm-voice-maker
Enables voice synthesis, voice cloning, voice design, and audio post-processing using MiniMax Voice API and FFmpeg. Use when converting text to speech, creating custom voices, or processing/merging audio.
mm-music-maker
Create music with MiniMax music models (e.g., music-2.5). Use when generating songs or instrumental tracks from lyrics and style prompts, or when integrating MiniMax Music Generation API into scripts.
mm-music-expert
Create music with MiniMax music models (music-2.5+, music-2.5). Use when generating songs, instrumental tracks, or chanting from lyrics and style prompts via MiniMax Music Generation API. Guides music novices through an interactive workflow to produce professional-quality music.