sag
ElevenLabs text-to-speech with mac-style say UX.
Why use this skill?
Enhance OpenClaw with sag, a high-quality ElevenLabs TTS tool. Easily generate expressive audio, use emotional tags, and create custom character voices via CLI.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/steipete/sag
What This Skill Does
The sag skill is a text-to-speech (TTS) interface for OpenClaw that wraps the ElevenLabs API in a command-line interface modeled on the macOS say command. It lets agents generate expressive, human-like audio directly from the command line. Unlike basic speech generators, sag supports audio tags that convey emotion, laughter, whispering, and structural pauses, which makes it well suited to character-driven AI interactions.
Installation
To integrate this skill into your environment, use the OpenClaw Hub CLI:
clawhub install openclaw/skills/skills/steipete/sag
Ensure you have your ElevenLabs API key ready. You can configure it by setting the environment variable ELEVENLABS_API_KEY or SAG_API_KEY. For default voice settings, set ELEVENLABS_VOICE_ID or SAG_VOICE_ID to your preferred voice identifier.
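As a minimal sketch of the configuration described above, the snippet below checks that one of the two key variables is set before sag is called. The helper name sag_key_configured is hypothetical and not part of sag, and the commented export lines use placeholder values. Which variable takes precedence when both are set is not stated here, so the check only confirms that at least one is present.

```shell
# Either variable name is accepted according to the text above.
# Uncomment and replace the placeholders with real values before use.
# export ELEVENLABS_API_KEY='your-api-key'
# export ELEVENLABS_VOICE_ID='your-voice-id'   # optional default voice

# Hypothetical helper (not part of sag): succeeds if a key variable is set.
sag_key_configured() {
  [ -n "${ELEVENLABS_API_KEY:-}" ] || [ -n "${SAG_API_KEY:-}" ]
}

if ! sag_key_configured; then
  echo 'Set ELEVENLABS_API_KEY or SAG_API_KEY before calling sag.' >&2
elif command -v sag >/dev/null 2>&1; then
  # Only runs when sag is installed and a key is configured.
  sag 'Configuration looks good.'
fi
```

Checking up front gives a clear message in scripts instead of relying on whatever error the CLI reports when the key is missing.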
Use Cases
This skill is perfect for voice-enabled AI assistants, interactive fiction, accessibility features in terminal applications, and dynamic audio feedback in automated scripts. It is particularly effective when the agent needs to adopt a specific persona—such as a "crazy scientist" or a "calm narrator"—using the provided mood tags.
Example Prompts
- "sag -v Clawd 'Hello human, I have completed the analysis of your request.'"
- "sag '[excited] I just finished the code! [short pause] check this out.'"
- "sag -v 'Roger' 'The weather in Tokyo is currently 22 degrees and sunny.'"
Tips & Limitations
For optimal results, follow these best practices:
- Normalization: Use --normalize auto for most standard inputs, but disable it if you find it is misinterpreting specific acronyms or project names.
- Language Bias: When generating non-English text, use the --lang flag to improve pronunciation accuracy.
- Audio Tags: For eleven_v3, avoid traditional SSML break tags. Use the native [pause], [short pause], or [long pause] markers instead.
- Pronunciation: If a word is consistently mispronounced, try respelling it phonetically (e.g., 'key-note') or adding hyphens to force correct syllable pacing.
- Limitations: Note that [phoneme] tags are not supported directly by this CLI wrapper. Always test long responses with a shorter snippet first to ensure the expressive tags interact correctly with the chosen voice model.
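The tips above can be combined in a short script. This is a sketch under stated assumptions: join_with_pause is a hypothetical helper (not part of sag) that places a native pause marker between sentences, and the value en passed to --lang assumes the flag accepts a two-letter language code, which is not confirmed here. The sag call is guarded so the script is harmless where the CLI is not installed.

```shell
# Hypothetical helper: join the remaining arguments with a pause marker.
join_with_pause() {
  marker=$1
  shift
  out=''
  first=1
  for part in "$@"; do
    if [ "$first" -eq 1 ]; then
      out=$part
      first=0
    else
      out="$out $marker $part"
    fi
  done
  printf '%s\n' "$out"
}

# Build a short test snippet before sending a long response.
text=$(join_with_pause '[short pause]' 'First point.' 'Second point.')
printf '%s\n' "$text"

if command -v sag >/dev/null 2>&1; then
  # Assumption: --lang takes a two-letter code such as "en".
  sag --normalize auto --lang en "$text"
fi
```

Trying a two-sentence snippet like this first makes it easier to hear how the chosen voice handles the pause markers before generating a long passage.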
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-steipete-sag": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: external-api, file-write
Related Skills
swiftui-liquid-glass
Implement, review, or improve SwiftUI features using the iOS 26+ Liquid Glass API. Use when asked to adopt Liquid Glass in new SwiftUI UI, refactor an existing feature to Liquid Glass, or review Liquid Glass usage for correctness, performance, and design alignment.
qmd
Local search/indexing CLI (BM25 + vectors + rerank) with MCP mode.
songsee
Generate spectrograms and feature-panel visualizations from audio with the songsee CLI.
summarize
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
bird
X/Twitter CLI for reading, searching, and posting via cookies or Sweetistics.