openai-whisper-api
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Why use this skill?
Transcribe audio files to text with OpenAI Whisper through the OpenClaw skill. It is quick to set up and well suited to meetings and voice notes.
Install via CLI (Recommended)
clawhub install openclaw/openclaw/skills/openai-whisper-api
What This Skill Does
The openai-whisper-api skill integrates OpenAI's Whisper speech-to-text engine into the OpenClaw environment. Using the /v1/audio/transcriptions endpoint, it converts speech from common audio formats (such as .m4a, .ogg, and .mp3) into readable text. It is a lightweight wrapper around the Whisper model, so you can transcribe audio without writing manual API calls or maintaining your own scripts. For short voice notes and longer recorded meetings alike, the skill writes the result to your local file system as either plain text or structured JSON.
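To make the wrapper idea concrete, here is a minimal sketch of the kind of request such a skill would send to /v1/audio/transcriptions. It is not the skill's actual implementation. The function name build_transcription_request is hypothetical, and the sketch only builds the multipart request with the Python standard library without sending it. The form fields used (file, model, prompt, response_format) are the ones documented for OpenAI's transcription endpoint.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

API_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_transcription_request(audio_path, api_key, model="whisper-1",
                                prompt=None, response_format="text"):
    """Build (but do not send) a multipart POST for the transcription endpoint.

    Hypothetical helper for illustration; the skill's own code may differ.
    """
    path = Path(audio_path)
    boundary = uuid.uuid4().hex

    # Plain text form fields; the prompt is optional.
    fields = {"model": model, "response_format": response_format}
    if prompt:
        fields["prompt"] = prompt

    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )

    # The audio file itself goes in the "file" field as raw bytes.
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + path.read_bytes() + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )


# Sending the request (needs network access and a valid key):
#   with urllib.request.urlopen(req) as resp:
#       transcript = resp.read().decode("utf-8")
```

With response_format set to "text" the endpoint returns the transcript as plain text, and with "json" it returns a JSON object, which lines up with the two output modes described above.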
Installation
To integrate this skill into your environment, execute the following command in your terminal:
clawhub install openclaw/openclaw/skills/openai-whisper-api
Once installed, ensure your credentials are set up. You can either export the OPENAI_API_KEY environment variable directly in your shell session or add it to your configuration file located at ~/.openclaw/openclaw.json under the designated skill section. Proper authentication is required to access the OpenAI endpoints successfully.
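The lookup order described above can be sketched as follows. This is an illustration under stated assumptions: the source does not show the layout of ~/.openclaw/openclaw.json, so the nested keys used here (skills, then openai-whisper-api, then apiKey) are a guess, as is the assumption that the file is strict JSON. The function name resolve_api_key is hypothetical. Check your own configuration file for the real structure.

```python
import json
import os
from pathlib import Path

DEFAULT_CONFIG = Path.home() / ".openclaw" / "openclaw.json"


def resolve_api_key(env=None, config_path=DEFAULT_CONFIG):
    """Return the API key from the environment, else from the config file, else None."""
    env = os.environ if env is None else env

    # 1. The environment variable takes precedence.
    key = env.get("OPENAI_API_KEY")
    if key:
        return key

    # 2. Fall back to the config file, if it exists and parses as JSON.
    try:
        config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    if not isinstance(config, dict):
        return None

    # ASSUMED layout: {"skills": {"openai-whisper-api": {"apiKey": "..."}}}
    skills = config.get("skills")
    entry = skills.get("openai-whisper-api") if isinstance(skills, dict) else None
    if isinstance(entry, dict):
        return entry.get("apiKey") or None
    return None
```

Returning None rather than raising lets the caller decide how to report missing credentials, for example with a message that names both supported locations.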
Use Cases
This skill is ideal for professionals and developers who need to document audio content quickly. Use cases include transcribing meeting minutes for team records, generating subtitles for video content, converting voice memos into actionable task lists, or performing qualitative research by transcribing recorded interviews. Its support for custom prompts allows users to prime the AI with specific terminology, speaker names, or unique vocabulary, ensuring higher accuracy for technical or specialized domains.
Example Prompts
- "OpenClaw, please transcribe the audio file located at /home/user/downloads/meeting_notes.m4a and save the output as notes.txt."
- "Transcribe audio_interview.ogg using the whisper-1 model and export the result as a structured JSON file."
- "Transcribe /projects/audio/briefing.mp3, but make sure to prime the AI with the list of technical acronyms: API, CLI, IDE, and SCM."
Tips & Limitations
- File Limits: OpenAI enforces a file size limit on direct API uploads (25 MB at the time of writing); compress or split larger audio files before processing.
- Precision: Use the --prompt flag effectively. By providing context or names that appear in the audio, you significantly reduce the error rate for specific nouns.
- Performance: For long recordings, consider splitting audio files into smaller segments to avoid timeouts or interruption.
- Output Format: Rely on the --json flag if you intend to pipe the transcription into further automated workflows or parsing scripts within OpenClaw.
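The file-limit and splitting tips can be combined into a small planning step before any upload. The sketch below is an assumption-laden illustration, not part of the skill. The function name plan_segments and the 10% headroom are hypothetical choices, the 25 MB figure is the upload limit OpenAI documented at the time of writing and should be verified, and the calculation assumes a roughly constant bitrate. It only computes time offsets; the actual cutting would be done with an audio tool of your choice.

```python
import math

# Upload limit documented by OpenAI at the time of writing; verify before relying on it.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def plan_segments(size_bytes, duration_s, limit_bytes=MAX_UPLOAD_BYTES, headroom=0.9):
    """Return (start, end) offsets in seconds so each segment fits under the limit.

    Assumes a roughly constant bitrate, so file size scales with duration.
    """
    if size_bytes <= 0 or duration_s <= 0:
        raise ValueError("size_bytes and duration_s must be positive")

    duration_s = float(duration_s)
    budget = limit_bytes * headroom  # leave room for container overhead
    count = max(1, math.ceil(size_bytes / budget))
    step = duration_s / count

    return [
        (round(i * step, 3), round(min(duration_s, (i + 1) * step), 3))
        for i in range(count)
    ]
```

A file already under the budget yields a single segment covering the whole recording, so the same code path works for short voice notes and long meetings.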
Metadata
Paste this into your clawhub.json to enable this plugin:
{
  "plugins": {
    "official-openclaw-openai-whisper-api": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags: AI
Flags: file-write, file-read, external-api