Community Verified media Safety 4/5

sag

ElevenLabs text-to-speech with mac-style say UX.

What This Skill Does

The sag skill is a text-to-speech (TTS) utility that uses ElevenLabs for voice generation behind a macOS-style say command. It converts text to speech with selectable voices and models, and it supports expressive audio tags for more nuanced delivery. Typical uses are natural-sounding voiceovers, accessibility features, and spoken responses from AI agents.

Installation

To install the sag skill, use the following command:

clawhub install openclaw/openclaw/skills/sag

This adds the sag skill to your OpenClaw environment. The skill needs an ElevenLabs API key: set it in the ELEVENLABS_API_KEY environment variable (preferred) or in SAG_API_KEY.
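
Before the first run, it can help to confirm which of the two variables is actually set. The sketch below is a minimal check, not part of sag: sag_key_source is a hypothetical helper name, and it checks the variables in the order the text recommends. Whether sag itself applies the same precedence when both are set is an assumption.

```shell
# Hypothetical helper (not part of sag): print the name of the API key
# variable that is set, checking the preferred variable first.
sag_key_source() {
  if [ -n "${ELEVENLABS_API_KEY:-}" ]; then
    echo "ELEVENLABS_API_KEY"
  elif [ -n "${SAG_API_KEY:-}" ]; then
    echo "SAG_API_KEY"
  else
    # Neither variable is set; signal failure to the caller.
    echo "none"
    return 1
  fi
}
```

Calling sag_key_source in a shell profile or wrapper script gives an early, readable failure instead of an API error at synthesis time.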

Use Cases

Dynamic AI Responses: Generate spoken replies for AI agents, making interactions more engaging.
Content Creation: Quickly create voiceovers for videos, podcasts, or presentations.
Accessibility: Provide auditory feedback for users who prefer spoken information.
Prototyping: Test different voice styles and emotional tones for applications.
Personalized Audio: Create custom audio messages with specific voices and inflections.

Example Prompts

sag "Please read this document aloud in a calm voice."
sag speak -v "Crazy Scientist" "Initiate the experiment! [excited] It's time!"
sag voices

Tips & Limitations

API Key: Ensure your ELEVENLABS_API_KEY or SAG_API_KEY is correctly set.
Voice Customization: Use sag -v <voice_name_or_id> to select a voice. You can list available voices with sag voices.
Model Selection: Choose from different ElevenLabs models like eleven_v3 (default, expressive), eleven_multilingual_v2 (stable), or eleven_flash_v2_5 (fast) using the --model flag.
Pronunciation: For complex words or names, use respelling (e.g., "key-note") or hyphens. The --normalize option (default auto) helps with numbers, units, and URLs. Use --lang to bias language processing.
Expressive Tags: For eleven_v3, use tags like [whispers], [shouts], [sings], [laughs], [sighs], [sarcastic], [curious], [excited], [crying], [mischievously] for emotional delivery. Use [pause], [short pause], [long pause] for timing.
SSML Support: The older models (v2 and v2.5) support SSML <break> tags. eleven_v3 does not support SSML <break>; use the pause tags listed above instead.
Chat Responses: For voice replies in chat, generate an MP3 file using sag -v <voice> -o <output_path.mp3> "<message>" and then reference it as MEDIA:<output_path.mp3>.
Default Clawd Voice: Use -v Clawd or the ID lj2rcrvANS3gaWWnczSX for the default Clawd voice character.
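
The model selection and chat response tips can be combined in a small wrapper. This is a sketch under stated assumptions: sag_model_for, voice_reply, the profile names (expressive, stable, fast), and the SAG_BIN override are all hypothetical. Only the -v, -o, and --model flags, the model IDs, and the MEDIA: prefix come from the tips above, and it is assumed that the three flags can be combined in a single call.

```shell
# Hypothetical helpers; not part of sag.

# Map a plain-language need to one of the model IDs listed in the tips.
sag_model_for() {
  case "$1" in
    expressive) echo "eleven_v3" ;;
    stable)     echo "eleven_multilingual_v2" ;;
    fast)       echo "eleven_flash_v2_5" ;;
    *)          echo "unknown profile: $1" >&2; return 1 ;;
  esac
}

# Synthesize a message to an MP3 file and print the MEDIA: reference line.
# Arguments: voice, output path, message, optional profile (default: expressive).
voice_reply() {
  voice=$1 out=$2 message=$3 profile=${4:-expressive}
  model=$(sag_model_for "$profile") || return 1
  # SAG_BIN allows substituting a stub for dry runs; it defaults to the real CLI.
  # Any output from the CLI goes to stderr so stdout carries only the MEDIA: line.
  "${SAG_BIN:-sag}" -v "$voice" --model "$model" -o "$out" "$message" >&2 || return 1
  printf 'MEDIA:%s\n' "$out"
}
```

For example, voice_reply Clawd /tmp/reply.mp3 "[excited] Build finished!" fast would request the fast model and, if synthesis succeeds, print MEDIA:/tmp/reply.mp3. Printing the reference only after a successful exit avoids pointing a chat client at a file that was never written.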

Metadata

Author: @openclaw

Stars: 369848

Updated: 2026-05-08

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-openclaw-sag": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags (AI)

#tts #elevenlabs #voice #audio

Safety Score: 4/5

Flags: external-api, network-access, file-write
