openai-whisper
Local speech-to-text with the Whisper CLI (no API key).
Why use this skill?
Use the openai-whisper skill to transcribe and translate audio files locally on your machine. Ensure data privacy with no API keys.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/steipete/openai-whisper
What This Skill Does
The openai-whisper skill provides a local interface for speech-to-text transcription using OpenAI's Whisper model. Because it runs the command-line interface (CLI) directly on your machine, it needs no cloud API key, and your audio data stays entirely within your local environment. It suits users who prioritize data sovereignty and want to transcribe large volumes of audio without incurring per-minute API costs.
Installation
To integrate this skill into your OpenClaw environment, execute the following command in your terminal:
clawhub install openclaw/skills/skills/steipete/openai-whisper
Ensure that Python is installed and that ffmpeg is available on your system PATH, as Whisper relies on ffmpeg to decode the various audio container formats. The first time a given model is used, its weights are downloaded automatically to ~/.cache/whisper.
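The prerequisites above can be checked before the first run. The following is a minimal sketch, not part of the skill itself: the helper name missing_tools is hypothetical, and it assumes the Whisper CLI is installed under the executable name whisper.

```python
import shutil
from typing import Callable, Iterable, List, Optional


def missing_tools(
    tools: Iterable[str] = ("ffmpeg", "whisper"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the required executables that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    touching the real environment.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("ffmpeg and whisper are both available.")
```

If either name is reported as missing, install it (or fix PATH) before invoking the skill.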
Use Cases
- Transcription of Private Meetings: Convert sensitive recorded meetings into text files without uploading data to third-party servers.
- Content Creation: Quickly generate transcripts for podcasts, interviews, or lectures to create blog posts or show notes.
- Translation Workflows: Use the built-in translation feature to convert audio in foreign languages into English subtitles or text.
- Archive Management: Automate the indexing of large audio libraries by converting them into searchable text formats.
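For the archive use case, one possible approach is to transcribe only the files that do not yet have a transcript. The sketch below is illustrative: the helper names, the extension list, and the directory layout are assumptions, and it assumes the Whisper CLI names each output after the audio file's stem (for example meeting.mp3 becomes meeting.txt), which may differ between Whisper versions.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List, Set

# Assumed set of audio extensions; extend it to match your library.
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".wav", ".flac", ".ogg"}


def pending_audio(files: Iterable[Path], done_stems: Set[str]) -> List[Path]:
    """Return audio files whose stem has no transcript yet, sorted by path."""
    return sorted(
        path
        for path in files
        if path.suffix.lower() in AUDIO_EXTENSIONS and path.stem not in done_stems
    )


def transcribe_archive(audio_dir: Path, transcript_dir: Path) -> None:
    """Run the Whisper CLI once per audio file that lacks a .txt transcript."""
    transcript_dir.mkdir(parents=True, exist_ok=True)
    done = {path.stem for path in transcript_dir.glob("*.txt")}
    candidates = (path for path in audio_dir.rglob("*") if path.is_file())
    for audio in pending_audio(candidates, done):
        subprocess.run(
            ["whisper", str(audio),
             "--output_format", "txt",
             "--output_dir", str(transcript_dir)],
            check=True,
        )
```

Calling transcribe_archive(Path("recordings"), Path("transcripts")) would process a hypothetical recordings folder; rerunning it skips files already transcribed, so an interrupted batch can be resumed.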
Example Prompts
- "Transcribe the audio file located at /Users/me/recordings/meeting.mp3 using the medium model and save the output to the current folder."
- "Translate the Spanish audio file in /downloads/interview.m4a to English and export the result as an SRT subtitle file."
- "Transcribe my latest lecture recording in /data/lecture.wav using the fast turbo model to get quick results for my notes."
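As a rough guide to what such prompts translate to, the sketch below builds the corresponding Whisper CLI invocations. The function build_whisper_command is a hypothetical helper, not the skill's actual implementation. Choosing the medium model for the translation prompt is an assumption: the prompt names no model, and the Whisper README notes that turbo is not trained for translation.

```python
import shlex
from typing import List, Optional


def build_whisper_command(
    audio_path: str,
    model: str = "turbo",
    task: str = "transcribe",
    language: Optional[str] = None,
    output_format: str = "txt",
    output_dir: str = ".",
) -> List[str]:
    """Build an argument list for the Whisper CLI without running it."""
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")
    cmd = [
        "whisper", audio_path,
        "--model", model,
        "--task", task,
        "--output_format", output_format,
        "--output_dir", output_dir,
    ]
    if language is not None:
        # Source language of the audio, e.g. "es" for Spanish.
        cmd += ["--language", language]
    return cmd


# One command per example prompt.
meeting = build_whisper_command("/Users/me/recordings/meeting.mp3", model="medium")
interview = build_whisper_command(
    "/downloads/interview.m4a",
    model="medium",  # assumption, see the note above
    task="translate",
    language="es",
    output_format="srt",
)
lecture = build_whisper_command("/data/lecture.wav", model="turbo")

if __name__ == "__main__":
    for cmd in (meeting, interview, lecture):
        print(shlex.join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) to execute it; building the list separately keeps paths with spaces safe from shell quoting problems.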
Tips & Limitations
- Model Selection: The turbo model is the default and provides an excellent balance between speed and accuracy. Use base or small if you are running on hardware with limited RAM, or large-v3 for high-fidelity transcription of complex audio.
- Performance: Larger models require significantly more compute power and time. Ensure you have sufficient GPU or CPU resources when selecting larger model weights.
- Dependency: This skill is entirely local. You are responsible for managing your local storage, as audio files and transcriptions can consume significant disk space over time. If you encounter issues, verify that ffmpeg is installed and accessible on your system PATH.
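The model-selection advice can be sketched as a small lookup. The memory figures below are the approximate VRAM requirements listed in the Whisper README at the time of writing, with the "large" figure applied to large-v3; treat them as assumptions rather than guarantees. The helper suggest_model and its preference orders are hypothetical.

```python
from typing import Optional

# Approximate memory needs in GB (assumed from the Whisper README).
APPROX_MEMORY_GB = {
    "tiny": 1,
    "base": 1,
    "small": 2,
    "medium": 5,
    "turbo": 6,
    "large-v3": 10,
}

# Preference orders: default favors turbo, high fidelity favors large-v3.
DEFAULT_ORDER = ["turbo", "small", "base", "tiny"]
HIGH_FIDELITY_ORDER = ["large-v3", "medium", "turbo", "small", "base", "tiny"]


def suggest_model(available_gb: float, high_fidelity: bool = False) -> Optional[str]:
    """Return the first preferred model that fits, or None if none does."""
    order = HIGH_FIDELITY_ORDER if high_fidelity else DEFAULT_ORDER
    for name in order:
        if APPROX_MEMORY_GB[name] <= available_gb:
            return name
    return None


if __name__ == "__main__":
    for gb in (1, 2, 8, 12):
        print(gb, "GB ->", suggest_model(gb), "/", suggest_model(gb, high_fidelity=True))
```

The returned name can be passed to the CLI's --model option. Actual memory use also depends on audio length and hardware, so the thresholds are only a starting point.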
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-steipete-openai-whisper": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags: AI
Flags: file-read, file-write, code-execution
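If clawhub.json already contains other settings, the plugin entry shown above has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge: the helper names are hypothetical, and everything about the file beyond the "plugins" fragment shown above (its location, other keys) is an assumption.

```python
import json
from pathlib import Path
from typing import Any, Dict

PLUGIN_ID = "official-steipete-openai-whisper"


def enable_plugin(
    config: Dict[str, Any],
    plugin_id: str = PLUGIN_ID,
    auto_update: bool = True,
) -> Dict[str, Any]:
    """Return a copy of config with the plugin entry added or replaced."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[plugin_id] = {"enabled": True, "auto_update": auto_update}
    merged["plugins"] = plugins
    return merged


def update_config_file(path: Path) -> None:
    """Merge the plugin entry into the JSON file at path, creating it if absent."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.write_text(json.dumps(enable_plugin(config), indent=2) + "\n", encoding="utf-8")
```

Calling update_config_file(Path("clawhub.json")) would apply the change in the current directory; existing plugin entries and unrelated keys are left as they were.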
Related Skills
swiftui-liquid-glass
Implement, review, or improve SwiftUI features using the iOS 26+ Liquid Glass API. Use when asked to adopt Liquid Glass in new SwiftUI UI, refactor an existing feature to Liquid Glass, or review Liquid Glass usage for correctness, performance, and design alignment.
qmd
Local search/indexing CLI (BM25 + vectors + rerank) with MCP mode.
songsee
Generate spectrograms and feature-panel visualizations from audio with the songsee CLI.
summarize
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
bird
X/Twitter CLI for reading, searching, and posting via cookies or Sweetistics.