Official Verified media Safety 4/5

openai-whisper

Local speech-to-text with the Whisper CLI (no API key).

Why use this skill?

Use the openai-whisper skill to transcribe and translate audio files locally on your machine. No API key is required, and your audio never leaves your computer.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/steipete/openai-whisper

What This Skill Does

The openai-whisper skill provides a local interface for speech-to-text transcription using OpenAI's Whisper model. Because it runs the Whisper command-line interface (CLI) directly on your machine, it needs no cloud API key, and your audio data stays entirely within your local environment. It suits users who prioritize data sovereignty and want to transcribe large volumes of audio without incurring per-minute API costs.

Installation

To integrate this skill into your OpenClaw environment, run the following command in your terminal:

clawhub install openclaw/skills/skills/steipete/openai-whisper

Make sure Python is installed and that ffmpeg is available on your system PATH; Whisper relies on ffmpeg to decode the various audio container formats. On the first run with a given model, Whisper automatically downloads that model's weights to ~/.cache/whisper.
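The prerequisites above can be checked before the first transcription. The following is a minimal sketch, not part of the skill itself: `check_tool` is a hypothetical helper name, and the executable names `python3` and `whisper` are assumptions about how Python and the Whisper CLI are exposed on your system.

```shell
# Report whether a given executable can be found on PATH.
# check_tool is a hypothetical helper, not something the skill provides.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Assumed executable names; adjust if your system uses different ones.
for tool in python3 ffmpeg whisper; do
  check_tool "$tool"
done

# List any model weights already downloaded to the cache directory.
ls -lh "$HOME/.cache/whisper" 2>/dev/null || echo "no cached models yet"
```

A "missing" line for ffmpeg or whisper points to an installation or PATH problem rather than a problem with the audio file.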

Use Cases

Transcription of Private Meetings: Convert sensitive recorded meetings into text files without uploading data to third-party servers.
Content Creation: Quickly generate transcripts for podcasts, interviews, or lectures to create blog posts or show notes.
Translation Workflows: Use Whisper's built-in translation task to convert non-English audio into English subtitles or text.
Archive Management: Automate the indexing of large audio libraries by converting them into searchable text formats.
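For the archive use case, a batch run might look like the sketch below. It assumes the Whisper CLI writes each transcript as the audio file's base name plus `.txt` inside `--output_dir`, and it skips files that already have a transcript. The directory names and the `pending_files` helper are hypothetical, chosen only for illustration.

```shell
# Hypothetical locations; adjust to your own layout.
AUDIO_DIR="${AUDIO_DIR:-./recordings}"
OUT_DIR="${OUT_DIR:-./transcripts}"

# Print audio files in $1 that have no matching .txt transcript in $2.
pending_files() {
  for f in "$1"/*.mp3 "$1"/*.m4a "$1"/*.wav; do
    [ -e "$f" ] || continue              # skip glob patterns that matched nothing
    base="$(basename "${f%.*}")"         # file name without directory or extension
    [ -e "$2/$base.txt" ] || echo "$f"
  done
}

if command -v whisper >/dev/null 2>&1; then
  mkdir -p "$OUT_DIR"
  # Transcribe only the files that still lack a transcript.
  pending_files "$AUDIO_DIR" "$OUT_DIR" | while IFS= read -r f; do
    whisper "$f" --model turbo --output_format txt --output_dir "$OUT_DIR"
  done
else
  echo "whisper not found on PATH; nothing transcribed"
fi
```

Because finished files are skipped, the loop can be re-run after an interruption without repeating work. The resulting text files can then be searched with ordinary tools such as grep.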

Example Prompts

"Transcribe the audio file located at /Users/me/recordings/meeting.mp3 using the medium model and save the output to the current folder."
"Translate the Spanish audio file in /downloads/interview.m4a to English and export the result as an SRT subtitle file."
"Transcribe my latest lecture recording in /data/lecture.wav using the fast turbo model to get quick results for my notes."
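The prompts above plausibly map to Whisper CLI invocations like the ones sketched here. The source does not show the exact commands the skill issues, so this mapping is an assumption. The flags `--model`, `--language`, `--task`, `--output_format`, and `--output_dir` are standard Whisper CLI options; the `run` wrapper and `DRY_RUN` variable are hypothetical and exist only so the commands are printed rather than executed by default. The translation example uses the medium model because the upstream Whisper README notes that turbo is not trained for translation.

```shell
# Print each command instead of running it; set DRY_RUN=0 to execute.
# run and DRY_RUN are hypothetical helpers for this sketch.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Prompt 1: medium model, output saved to the current folder.
run whisper /Users/me/recordings/meeting.mp3 --model medium --output_dir .

# Prompt 2: Spanish audio translated to English, exported as SRT subtitles.
run whisper /downloads/interview.m4a --model medium --language es --task translate --output_format srt

# Prompt 3: quick transcript with the turbo model.
run whisper /data/lecture.wav --model turbo --output_format txt
```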

Tips & Limitations

Model Selection: The turbo model is the default and provides an excellent balance between speed and accuracy. Use base or small if you are running on hardware with limited RAM, or large-v3 for high-fidelity transcription of complex audio.
Performance: Larger models will require significantly more compute power and time. Ensure you have sufficient GPU or CPU resources when selecting larger model weights.
Dependency: This skill runs entirely locally, so you are responsible for managing your own storage; audio files, transcripts, and downloaded model weights can consume significant disk space over time. If you encounter issues, verify that ffmpeg is installed and available on your system PATH.
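The model-selection advice above can be turned into a small helper. This is a sketch under stated assumptions: `pick_model` is a hypothetical function, and the memory thresholds are rough illustrative values, not measured requirements for your hardware.

```shell
# Suggest a Whisper model name from a whole-number memory budget in GB.
# Thresholds are illustrative assumptions; tune them for your machine.
pick_model() {
  mem_gb="$1"
  if [ "$mem_gb" -ge 10 ]; then
    echo "large-v3"      # highest fidelity, slowest
  elif [ "$mem_gb" -ge 6 ]; then
    echo "turbo"         # the default: balance of speed and accuracy
  elif [ "$mem_gb" -ge 2 ]; then
    echo "small"
  else
    echo "base"          # limited-RAM fallback
  fi
}

echo "Suggested model for an 8 GB budget: $(pick_model 8)"
```

The returned name can be passed to the CLI through `--model`. If a run turns out too slow or exhausts memory, stepping down one tier is the simplest adjustment.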

Metadata

Author: @steipete

Stars: 982

Updated: 2026-02-14

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-steipete-openai-whisper": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags(AI)

#transcription #speech-to-text #audio-processing #privacy #local-ai

Safety Score: 4/5

Flags: file-read, file-write, code-execution
