Official Verified media Safety 4/5

mm-easy-voice

Simple text-to-speech skill using MiniMax Voice API. Converts text to audio with customizable voice selection. Use for generating speech audio from text.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/blue-coconut/mm-easy-voice

What This Skill Does

The mm-easy-voice skill provides a robust interface for the MiniMax Voice API, enabling OpenClaw agents to perform high-quality text-to-speech (TTS) conversion. It transforms plain text inputs into natural, expressive audio files. The skill supports a wide range of voices, allows for custom pause insertion for natural cadence, and includes advanced features like voice cloning from samples and voice design based on descriptive prompts. It is designed for seamless integration into automation workflows where audio output is required, such as creating voice-overs, automated accessibility features, or dynamic media generation.

Installation

To install this skill, run the following command in your terminal:

clawhub install openclaw/skills/skills/blue-coconut/mm-easy-voice

Before first use, run python check_environment.py to verify your setup, and set the MINIMAX_VOICE_API_KEY environment variable. Installing FFmpeg is recommended for advanced audio manipulation.
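
As a rough illustration of the kind of check involved, the sketch below tests for the API key and for FFmpeg on the PATH. It is not the skill's own check_environment.py, whose contents are not shown here; the function name preflight and its return shape are hypothetical.

```python
import os
import shutil


def preflight(env=None):
    """Return (errors, warnings) for a hypothetical pre-run check.

    This is NOT the skill's check_environment.py; it only mirrors the two
    requirements named in the text: the API key and (optionally) FFmpeg.
    """
    env = os.environ if env is None else env
    errors, warnings = [], []

    # The key is required: without it no request can be authenticated.
    if not env.get("MINIMAX_VOICE_API_KEY"):
        errors.append("MINIMAX_VOICE_API_KEY is not set")

    # FFmpeg is only recommended, so a missing binary is a warning.
    if shutil.which("ffmpeg") is None:
        warnings.append("ffmpeg not found on PATH")

    return errors, warnings


if __name__ == "__main__":
    errs, warns = preflight()
    for message in errs:
        print("ERROR:", message)
    for message in warns:
        print("WARNING:", message)
```

Treating FFmpeg as a warning rather than an error follows the text above, which recommends it without requiring it.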

Use Cases

Content Creation: Generate voice-overs for video projects or social media content directly from text scripts.
Accessibility: Convert written reports, articles, or documentation into high-quality audio files for screen readers or audio-only consumption.
Virtual Assistants: Power the voice interface of custom agents that need to respond to user queries with human-like, emotional speech.
Multimedia Automation: Automate the assembly of audiobooks or podcasts by concatenating multiple generated audio clips into a final file.

Example Prompts

"Generate an audio file titled 'intro.mp3' using the 'female-shaonv' voice that says 'Welcome to the presentation, please hold for a moment' with a 2-second pause before the final word."
"List all available voices so I can choose a professional tone for my corporate announcement."
"Clone the voice from 'interview.mp3' and save it as 'custom-agent-voice' for future use."

Tips & Limitations

Text Limits: Each request is limited to 10,000 characters. Break longer scripts into smaller segments for better performance.
Custom Pauses: Improve natural flow by inserting tags like <#1.5#> to force specific pauses where appropriate.
Voice Selection: Always check reference/voice_catalog.md to ensure you are selecting the voice best suited for your target audience and emotional context.
Emotion: The speech-2.8 models automatically detect and apply emotional nuance, so ensure your written text clearly conveys the intended tone.
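
The text-limit and pause tips can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of the skill: the helper names pause_tag and split_script are hypothetical, the limit is assumed to count characters the way Python's len() does, and the pause value is assumed to be a number of seconds with up to two decimal places.

```python
import re

MAX_CHARS = 10_000  # per-request limit stated in the tips above


def pause_tag(seconds):
    """Format a pause marker in the <#1.5#> style shown above.

    Assumption: the value is in seconds, with at most two decimals.
    """
    if seconds <= 0:
        raise ValueError("pause must be positive")
    value = f"{seconds:.2f}".rstrip("0").rstrip(".")
    return f"<#{value}#>"


def split_script(text, limit=MAX_CHARS):
    """Greedily pack whole sentences into chunks of at most `limit` chars."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks, current = [], ""
    # Split after sentence-ending punctuation that is followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # A single sentence longer than the limit is cut into fixed slices.
        pieces = [sentence[i:i + limit] for i in range(0, len(sentence), limit)]
        for piece in pieces:
            candidate = f"{current} {piece}" if current else piece
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    line = f"Welcome to the presentation, please hold for a {pause_tag(2)} moment."
    print(line)
    print(split_script("One. Two. Three.", limit=10))
```

Each returned chunk would then be sent as its own request. Note that the fallback slicing for oversized sentences is naive and could cut through a pause tag, so very long sentences are better split by hand.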

Metadata

Author: @blue-coconut

Stars: 4473

Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-blue-coconut-mm-easy-voice": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags (AI)

#tts #audio #voice-cloning #media #automation

Safety Score: 4/5

Flags: file-write, file-read, external-api, code-execution

Related Skills

mm-voice-maker

Enables voice synthesis, voice cloning, voice design, and audio post-processing using MiniMax Voice API and FFmpeg. Use when converting text to speech, creating custom voices, or processing/merging audio.

blue-coconut 4473

mm-music-maker

Create music with MiniMax music models (e.g., music-2.5). Use when generating songs or instrumental tracks from lyrics and style prompts, or when integrating MiniMax Music Generation API into scripts.

blue-coconut 4473

mm-music-expert

Create music with MiniMax music models (music-2.5+, music-2.5). Use when generating songs, instrumental tracks, or chanting from lyrics and style prompts via MiniMax Music Generation API. Guides music novices through an interactive workflow to produce professional-quality music.

blue-coconut 4473