Official Verified media Safety 4/5

openai-whisper

Local speech-to-text with the Whisper CLI (no API key).

Why use this skill?

Use the openai-whisper skill to transcribe and translate audio files locally on your machine. No API key is needed, and your audio never leaves your computer.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/steipete/openai-whisper

What This Skill Does

The openai-whisper skill runs OpenAI's Whisper model for speech-to-text through the Whisper command-line interface (CLI) on your own machine. Because nothing is sent to a cloud API, no API key is required and your audio stays in your local environment. It suits users who prioritize data sovereignty or who want to transcribe large volumes of audio without per-minute API costs.

Installation

To integrate this skill into your OpenClaw environment, run the following command in your terminal:

clawhub install openclaw/skills/skills/steipete/openai-whisper

Whisper requires Python and ffmpeg. Make sure the ffmpeg binary is on your PATH, since Whisper relies on it to decode the various audio container formats. The first time a model is used, its weights are downloaded automatically to ~/.cache/whisper.
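
The prerequisites above can be checked with a short script before the first transcription. This is a minimal sketch, assuming the Whisper CLI is installed under the executable name `whisper` and that model weights are stored as `.pt` files in the cache directory; the helper names `check_prerequisites` and `cached_models` are hypothetical and not part of the skill.

```python
import shutil
from pathlib import Path


def check_prerequisites(tools=("ffmpeg", "whisper")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def cached_models(cache_dir=Path.home() / ".cache" / "whisper"):
    """List weight files already downloaded (assumes .pt files; empty if none)."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []
    return sorted(p.name for p in cache_dir.glob("*.pt"))


if __name__ == "__main__":
    for tool, path in check_prerequisites().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
    print("cached models:", cached_models() or "none yet")
```

A `None` entry for ffmpeg or whisper means the tool has to be installed or added to PATH before the skill can run.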

Use Cases

  • Transcription of Private Meetings: Convert sensitive recorded meetings into text files without uploading data to third-party servers.
  • Content Creation: Quickly generate transcripts for podcasts, interviews, or lectures to create blog posts or show notes.
  • Translation Workflows: Use Whisper's built-in translation task to turn non-English audio into English subtitles or text.
  • Archive Management: Automate the indexing of large audio libraries by converting them into searchable text formats.
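
For the archive use case, a first step is finding which recordings still lack a transcript. The sketch below assumes transcripts are written as `<audio file stem>.txt` into a single output directory, which matches how the Whisper CLI names its output files; the suffix list and the function name `pending_audio_files` are illustrative assumptions.

```python
from pathlib import Path

# Assumed set of audio extensions to index; extend as needed.
AUDIO_SUFFIXES = {".mp3", ".m4a", ".wav", ".flac"}


def pending_audio_files(library_dir, transcript_dir, transcript_suffix=".txt"):
    """Return audio files under library_dir that have no transcript yet."""
    library = Path(library_dir)
    transcripts = Path(transcript_dir)
    pending = []
    for audio in sorted(library.rglob("*")):
        if not audio.is_file() or audio.suffix.lower() not in AUDIO_SUFFIXES:
            continue
        # A transcript is expected at <transcript_dir>/<stem><suffix>.
        if not (transcripts / (audio.stem + transcript_suffix)).exists():
            pending.append(audio)
    return pending
```

Each returned path can then be handed to the Whisper CLI with `--output_dir` pointing at the transcript directory, so that a later run skips files that are already done.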

Example Prompts

  1. "Transcribe the audio file located at /Users/me/recordings/meeting.mp3 using the medium model and save the output to the current folder."
  2. "Translate the Spanish audio file in /downloads/interview.m4a to English and export the result as an SRT subtitle file."
  3. "Transcribe my latest lecture recording in /data/lecture.wav using the fast turbo model to get quick results for my notes."
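
As an illustration of what these prompts could translate to, the sketch below builds Whisper CLI argument lists for the three examples. How the skill actually maps prompts to commands is not documented here, so the helper `build_whisper_command` is hypothetical; the flags it uses (`--model`, `--task`, `--language`, `--output_format`, `--output_dir`) are standard Whisper CLI options. The choice of the medium model for the translation example is an assumption.

```python
import shlex

VALID_TASKS = ("transcribe", "translate")


def build_whisper_command(audio_path, model=None, task="transcribe",
                          output_format=None, output_dir=None, language=None):
    """Build an argv list for the Whisper CLI; options left as None are omitted."""
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")
    cmd = ["whisper", str(audio_path)]
    if model:
        cmd += ["--model", model]
    if language:
        cmd += ["--language", language]
    if task != "transcribe":  # transcribe is the CLI default
        cmd += ["--task", task]
    if output_format:
        cmd += ["--output_format", output_format]
    if output_dir:
        cmd += ["--output_dir", str(output_dir)]
    return cmd


examples = [
    # Prompt 1: medium model, output to the current folder.
    build_whisper_command("/Users/me/recordings/meeting.mp3",
                          model="medium", output_dir="."),
    # Prompt 2: Spanish audio translated to English, exported as SRT.
    build_whisper_command("/downloads/interview.m4a", model="medium",
                          task="translate", language="es", output_format="srt"),
    # Prompt 3: turbo model for quick results.
    build_whisper_command("/data/lecture.wav", model="turbo"),
]

for cmd in examples:
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps paths with spaces intact if it is later passed to `subprocess.run`.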

Tips & Limitations

  • Model Selection: The turbo model is the default and offers a good balance between speed and accuracy. Use base or small on hardware with limited memory, or large-v3 for high-fidelity transcription of complex audio. The turbo model is not trained for translation, so pick a multilingual model such as medium or large for translation tasks.
  • Performance: Larger models need significantly more compute, memory, and time. Make sure you have enough GPU or CPU resources before selecting a larger model.
  • Dependency: This skill runs entirely locally, so you manage your own storage; model weights, audio files, and transcripts can consume significant disk space over time. If transcription fails, first verify that ffmpeg is installed and on your PATH.
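
The model-selection tip can be sketched as a small lookup. The memory figures below are the approximate VRAM requirements listed in the Whisper README and should be read as rough guidance, not guarantees; the preference order and the function name `pick_model` are assumptions made for illustration.

```python
# Approximate VRAM needs in GB, per the Whisper README (rough guidance only).
APPROX_VRAM_GB = {
    "tiny": 1, "base": 1, "small": 2, "medium": 5, "turbo": 6, "large-v3": 10,
}

# Assumed preference order: best fit first. turbo is left out of the
# translation list because it is not trained for translation.
TRANSCRIBE_ORDER = ["turbo", "medium", "small", "base", "tiny"]
TRANSLATE_ORDER = ["large-v3", "medium", "small", "base", "tiny"]


def pick_model(available_gb, need_translation=False):
    """Return the first preferred model that fits in memory, or None."""
    order = TRANSLATE_ORDER if need_translation else TRANSCRIBE_ORDER
    for name in order:
        if APPROX_VRAM_GB[name] <= available_gb:
            return name
    return None
```

For example, `pick_model(8)` selects turbo, while `pick_model(8, need_translation=True)` falls back to medium because large-v3 does not fit in the assumed 8 GB.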

Metadata

Author: @steipete
Stars: 982
Views: 1
Updated: 2026-02-14

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-steipete-openai-whisper": {
      "enabled": true,
      "auto_update": true
    }
  }
}
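
If clawhub.json already lists other plugins, pasting the fragment by hand risks overwriting them. The sketch below merges the entry into an existing configuration instead. It assumes clawhub.json is plain JSON with a top-level "plugins" object, as in the fragment above; the helper names and the default file path are hypothetical.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-steipete-openai-whisper"


def enable_plugin(config, plugin_id=PLUGIN_ID, auto_update=True):
    """Return a copy of config with the plugin enabled; other plugins are kept."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[plugin_id] = {"enabled": True, "auto_update": auto_update}
    merged["plugins"] = plugins
    return merged


def update_config_file(path="clawhub.json"):
    """Read the config (or start empty), enable the plugin, and write it back."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(enable_plugin(config), indent=2) + "\n")
```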

Tags (AI)

#transcription #speech-to-text #audio-processing #privacy #local-ai

Safety Score: 4/5

Flags: file-read, file-write, code-execution