Official Verified media Safety 4/5

sag

ElevenLabs text-to-speech with a macOS-style say UX.

Why use this skill?

Add sag, an ElevenLabs TTS tool, to OpenClaw. Generate expressive audio, use emotional tags, and create custom character voices from the CLI.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/steipete/sag

What This Skill Does

The sag skill is a text-to-speech (TTS) interface for OpenClaw that wraps the ElevenLabs API in a macOS-style terminal command. It lets agents generate expressive, human-like audio directly from the command line. Beyond plain speech, sag supports audio tags that convey emotion, laughter, whispering, and pauses, which makes it well suited to character-driven AI interactions.

Installation

To integrate this skill into your environment, use the OpenClaw Hub CLI:

clawhub install openclaw/skills/skills/steipete/sag

Have your ElevenLabs API key ready. Configure it by setting the environment variable ELEVENLABS_API_KEY or SAG_API_KEY. To set a default voice, set ELEVENLABS_VOICE_ID or SAG_VOICE_ID to your preferred voice identifier.
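
As a minimal sketch of that configuration, the variables can be exported in the shell before sag is called. The values below are placeholders, and sag_key_present is a hypothetical helper written for this example, not part of sag itself.

```shell
# Placeholder values: substitute your own key and voice identifier.
export ELEVENLABS_API_KEY="your-api-key"
export ELEVENLABS_VOICE_ID="your-voice-id"

# Hypothetical helper: succeeds only if one of the two key variables is set.
sag_key_present() {
  [ -n "${ELEVENLABS_API_KEY:-}" ] || [ -n "${SAG_API_KEY:-}" ]
}

if sag_key_present; then
  echo "API key found"
else
  echo "Set ELEVENLABS_API_KEY or SAG_API_KEY first" >&2
fi
```

Running a check like this at the top of an automated script surfaces a missing key before any speech request is attempted.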

Use Cases

This skill suits voice-enabled AI assistants, interactive fiction, accessibility features in terminal applications, and dynamic audio feedback in automated scripts. It is particularly effective when the agent needs to adopt a specific persona, such as a "crazy scientist" or a "calm narrator", using the provided mood tags.

Example Prompts

  1. "sag -v Clawd 'Hello human, I have completed the analysis of your request.'"
  2. "sag '[excited] I just finished the code! [short pause] check this out.'"
  3. "sag -v 'Roger' 'The weather in Tokyo is currently 22 degrees and sunny.'"
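
In a script, tagged text like the second prompt could be assembled before it is handed to sag. The sketch below only builds and prints the string; with_tag is a hypothetical helper, and the commented sag call assumes the tool is installed and an API key is configured.

```shell
# Hypothetical helper: prefix a line of text with a bracketed mood tag.
with_tag() {
  printf '[%s] %s' "$1" "$2"
}

line=$(with_tag excited 'I just finished the code!')
echo "$line"

# With sag installed and an API key configured, the line could be spoken with:
#   sag -v Clawd "$line"
```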

Tips & Limitations

For optimal results, follow these best practices:

  • Normalization: Use --normalize auto for most standard inputs, but disable it if you find it is misinterpreting specific acronyms or project names.
  • Language Bias: When generating non-English text, use the --lang flag to improve pronunciation accuracy.
  • Audio Tags: For eleven_v3, avoid traditional SSML break tags. Use the native [pause], [short pause], or [long pause] markers instead.
  • Pronunciation: If a word is consistently mispronounced, try respelling it phonetically (e.g., 'key-note') or adding hyphens to force correct syllable pacing.
  • Limitations: [phoneme] tags are not supported directly by this CLI wrapper. Before generating a long response, test a short snippet to confirm that the expressive tags interact correctly with the chosen voice model.
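
The tips on audio tags, normalization, and language can be combined in a small dry-run sketch. It rewrites SSML-style break tags to the native [pause] marker and prints the sag invocation that would be used, one argument per line, without calling the API. Both function names are hypothetical, the mapping of every break to [pause] is a simplification, and the "de" value passed to --lang is an assumed language code.

```shell
# Convert SSML-style <break .../> tags to the native [pause] marker.
# Simplification: every break maps to [pause], regardless of its duration.
ssml_to_pause() {
  printf '%s' "$1" | sed -E 's#<break[^>]*>#[pause]#g'
}

# Print the sag invocation that would be run, one argument per line,
# instead of calling the API. "de" is an assumed value for --lang.
sag_dry_run() {
  text=$(ssml_to_pause "$1")
  printf '%s\n' sag --normalize auto --lang de "$text"
}

sag_dry_run 'Guten Tag. <break time="1s"/> Wie geht es Ihnen?'
```

Printing the arguments first makes it easy to confirm the tags and flags on a short snippet before spending API quota on a long response.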

Metadata

Author: @steipete
Stars: 982
Views: 0
Updated: 2026-02-14

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-steipete-sag": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags (AI)

#tts #audio #elevenlabs #voice #cli
Safety Score: 4/5

Flags: external-api, file-write