Official Verified media Safety 4/5

openai-tts

Text-to-speech via OpenAI Audio Speech API.

Why use this skill?

Convert text to high-quality human speech with the OpenAI TTS skill for OpenClaw. Supports multiple voices, file formats, and adjustable speeds.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/pors/openai-tts

What This Skill Does

The OpenAI TTS skill lets the OpenClaw AI agent convert written text into natural-sounding speech. It sends the text to OpenAI's /v1/audio/speech API and returns synthesized audio in mp3, opus, aac, flac, wav, or pcm format. You can switch between the faster 'tts-1' model and the higher-fidelity 'tts-1-hd' model, and choose among multiple voice profiles, ranging from the neutral 'alloy' to the authoritative 'onyx'. Typical inputs are text logs and AI responses that you want to hear rather than read.
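
For orientation, here is a minimal Python sketch of the kind of request described above. It is not the skill's own implementation (the skill ships a speak.sh script); the function names build_speech_request and synthesize are hypothetical, and the defaults shown are assumptions chosen for illustration.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, api_key, voice="alloy", model="tts-1",
                         response_format="mp3", speed=1.0):
    """Build (but do not send) a POST request for the speech endpoint."""
    payload = {
        "model": model,                      # 'tts-1' or 'tts-1-hd'
        "input": text,                       # the text to be spoken
        "voice": voice,                      # e.g. 'alloy', 'onyx', 'nova'
        "response_format": response_format,  # mp3, opus, aac, flac, wav, pcm
        "speed": speed,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text, out_path, **options):
    """Send the request and write the returned audio bytes to out_path.

    Hypothetical helper: needs network access and OPENAI_API_KEY to be set.
    """
    request = build_speech_request(text, os.environ["OPENAI_API_KEY"], **options)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

Calling synthesize("Backup complete.", "alert.mp3", voice="nova") would write the audio to alert.mp3, assuming the key is valid and the network is reachable.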

Installation

To install this skill, run the following command in your terminal within the OpenClaw environment: clawhub install openclaw/skills/skills/pors/openai-tts. After installation, configure your API credentials in one of two ways: export the OPENAI_API_KEY environment variable in your shell session, or set the key permanently in your ~/.clawdbot/clawdbot.json configuration file under the skills entry for openai-tts. Once the key is set, invoke the skill through the speak.sh script located in your base directory.
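
The two credential sources can be checked in order, as in the sketch below. The JSON layout is an assumption: this page only says the key lives "under the skills entry for openai-tts", so the skills / entries / openai-tts / apiKey path and the resolve_api_key helper are hypothetical and should be checked against your actual configuration file.

```python
import json
import os
from pathlib import Path


def resolve_api_key(env=None, config_path="~/.clawdbot/clawdbot.json"):
    """Return the API key from the environment, else from the config file."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY")
    if key:
        return key  # the environment variable takes precedence

    path = Path(config_path).expanduser()
    if not path.is_file():
        return None

    config = json.loads(path.read_text(encoding="utf-8"))
    # ASSUMPTION: the key is stored at skills -> entries -> openai-tts -> apiKey.
    entry = config.get("skills", {}).get("entries", {}).get("openai-tts", {})
    return entry.get("apiKey")
```

Preferring the environment variable lets you override a stored key for a single shell session without editing the file.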

Use Cases

This skill suits accessibility, content creation, and real-time feedback loops. Developers can add spoken responses to command-line tools to enable hands-free operation. Content creators can generate voice-overs for short videos or prototypes by piping text through the script. It also works well for notification systems, where a spoken alert gives more context than a simple beep. Educational applications can use the different voice styles to simulate character dialogue or narrate long-form text summaries.

Example Prompts

  1. "Speak the following text aloud using the nova voice: 'Your system backup has completed successfully.'"
  2. "Generate an mp3 file named 'morning_briefing.mp3' containing the text of my latest daily summary with the speed set to 1.1."
  3. "Narrate this weather report using the British-accented fable voice and save it as an opus file."

Tips & Limitations

Keep cost in mind when using the OpenAI TTS skill. The 'tts-1-hd' model offers higher quality but costs twice as much as 'tts-1', which is optimized for speed and cost-efficiency. The --speed flag accepts values between 0.25 and 4.0; experiment to find a cadence that suits your audio. The skill needs network access to reach OpenAI's servers, so make sure your firewall or proxy allows these requests. Store your API key securely rather than hardcoding it into scripts.
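
The speed range and the 2x cost ratio above can be checked before a request is sent. The sketch below is illustrative only: check_speed and relative_cost are hypothetical helpers, and the cost is expressed in relative units (one unit per character on 'tts-1'), on the assumption that billing scales with the number of input characters.

```python
MIN_SPEED, MAX_SPEED = 0.25, 4.0

# Relative price per character, taken from the 2x ratio stated above.
RELATIVE_COST = {"tts-1": 1.0, "tts-1-hd": 2.0}


def check_speed(speed):
    """Return speed as a float, or raise if it is outside 0.25 to 4.0."""
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(
            f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {speed}"
        )
    return float(speed)


def relative_cost(text, model):
    """Cost in units of 'one tts-1 character' (an assumed billing basis)."""
    return len(text) * RELATIVE_COST[model]
```

Multiply the relative cost by the current per-character price of 'tts-1' to get an estimate in your currency.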

Metadata

Author: @pors
Stars: 1217
Views: 1
Updated: 2026-02-20

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-pors-openai-tts": {
      "enabled": true,
      "auto_update": true
    }
  }
}
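
If your clawhub.json already lists other plugins, the entry has to be merged in rather than pasted over the file. A minimal sketch of that merge follows; enable_plugin is a hypothetical helper, and it assumes the file's top-level "plugins" object has the shape shown above.

```python
def enable_plugin(config, name="official-pors-openai-tts"):
    """Return a copy of config with the plugin entry added or replaced."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))  # keep existing plugin entries
    plugins[name] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged
```

Load the file with json.load, pass the resulting dict through enable_plugin, and write it back with json.dump to apply the change.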

Tags (AI)

#tts #openai #speech-synthesis #audio #voice
Safety Score: 4/5

Flags: network-access, file-write, external-api