Official | Verified | media | Safety 5/5

openai-whisper

Local speech-to-text with the Whisper CLI (no API key).


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/czubi1928/openai-whisper-1-0-0

What This Skill Does

The openai-whisper skill provides a local interface for speech-to-text transcription using OpenAI's Whisper models. It drives the Whisper CLI to convert audio files into text transcripts without an API key or a cloud service. Because processing happens on your machine, your audio stays private, and you can batch-process files while keeping full control over the output format and the model used.

Installation

To add this skill to your environment, use the OpenClaw CLI manager. Run the following command in your terminal:

clawhub install openclaw/skills/skills/czubi1928/openai-whisper-1-0-0

Make sure Whisper's system dependencies, such as FFmpeg, are installed so that common audio formats like MP3, M4A, and WAV can be decoded.
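
Before installing, it can help to confirm that the required command-line tools are on your PATH. The sketch below is a minimal check in Python. It assumes the Whisper CLI is exposed as an executable named whisper and FFmpeg as ffmpeg; the helper name missing_tools is hypothetical and not part of the skill.

```python
import shutil


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust if your installation differs.
    missing = missing_tools(["ffmpeg", "whisper"])
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("ffmpeg and whisper are both available.")
```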

Use Cases

This skill suits professionals, researchers, and developers who work with audio data. Use it to transcribe long-form meetings, interview recordings, or lecture audio into searchable text documents. It can also generate subtitles or captions for video content through the SRT output format. In addition, it supports translation tasks on recorded audio (Whisper's translate task produces English text), which helps content creators make their media accessible to wider audiences. Processing is file-based rather than real-time.

Example Prompts

  1. "Transcribe the meeting audio at ./recordings/project_sync.mp3 and save the output as a text file in the current directory using the medium model."
  2. "Please transcribe the audio file located at /home/user/docs/interview.m4a and provide the output in SRT format."
  3. "Run a translation task on my audio file at ./foreign_lecture.wav and save the output to the root directory."
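
How the skill turns a prompt into a command is not documented here. As a rough sketch, the three prompts above could correspond to Whisper CLI invocations like the ones built below. The flags --model, --output_format, --output_dir, and --task are options of the Whisper CLI; the helper build_whisper_args and the mapping from prompt to flags are assumptions made for illustration.

```python
def build_whisper_args(audio_path, model=None, output_format=None,
                       output_dir=None, task=None):
    """Build an argument list for the Whisper CLI (hypothetical helper)."""
    args = ["whisper", audio_path]
    if model is not None:
        args += ["--model", model]
    if output_format is not None:
        args += ["--output_format", output_format]
    if output_dir is not None:
        args += ["--output_dir", output_dir]
    if task is not None:
        args += ["--task", task]
    return args


# Prompt 1: medium model, plain-text output in the current directory.
prompt_1 = build_whisper_args("./recordings/project_sync.mp3",
                              model="medium", output_format="txt",
                              output_dir=".")

# Prompt 2: SRT subtitles, other options left at the CLI defaults.
prompt_2 = build_whisper_args("/home/user/docs/interview.m4a",
                              output_format="srt")

# Prompt 3: translation task. The destination is left unset here because
# "root directory" is ambiguous; pass output_dir once you know the target.
prompt_3 = build_whisper_args("./foreign_lecture.wav", task="translate")

if __name__ == "__main__":
    for args in (prompt_1, prompt_2, prompt_3):
        print(" ".join(args))
```

To run one of these, pass the list to subprocess.run(args, check=True). Using an argument list rather than a single string avoids shell quoting problems with paths that contain spaces.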

Tips & Limitations

  • Performance: Transcription accuracy depends on the selected model size. The turbo and small models are fast and suit quick notes, while the large models are recommended for nuanced or multilingual audio.
  • Resource Usage: Larger models require significantly more RAM and GPU memory. Monitor your system resources when processing high-fidelity files.
  • Initial Setup: The first run downloads the selected model to ~/.cache/whisper. Make sure you have enough disk space. An internet connection is needed only for this initial download.
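
To see whether a model has already been downloaded and how much disk space is left, you could inspect the cache directory directly. This is a sketch only: the ~/.cache/whisper path comes from the tip above, while the .pt file extension and the helper names cached_models and free_gib are assumptions.

```python
import shutil
from pathlib import Path


def cached_models(cache_dir):
    """List model files (assumed to end in .pt) found in the cache directory."""
    cache = Path(cache_dir).expanduser()
    if not cache.is_dir():
        return []
    return sorted(p.name for p in cache.glob("*.pt"))


def free_gib(path):
    """Free disk space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    print("Cached models:", cached_models("~/.cache/whisper") or "none yet")
    print(f"Free space in home: {free_gib(Path.home()):.1f} GiB")
```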

Metadata

Author: @czubi1928
Stars: 3409
Views: 2
Updated: 2026-03-25

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-czubi1928-openai-whisper-1-0-0": {
      "enabled": true,
      "auto_update": true
    }
  }
}
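
If your clawhub.json already contains other entries, pasting by hand risks overwriting them. The sketch below merges the entry into an existing configuration in memory. It assumes only the structure shown in the snippet above; the helper enable_plugin and the placeholder entry some-other-plugin are hypothetical, and nothing here is part of the clawhub tooling.

```python
import copy
import json


def enable_plugin(config, plugin_id, auto_update=True):
    """Return a copy of `config` with `plugin_id` enabled under "plugins"."""
    updated = copy.deepcopy(config)
    plugins = updated.setdefault("plugins", {})
    plugins[plugin_id] = {"enabled": True, "auto_update": auto_update}
    return updated


if __name__ == "__main__":
    # "some-other-plugin" is a made-up entry standing in for existing config.
    existing = {"plugins": {"some-other-plugin": {"enabled": False}}}
    merged = enable_plugin(existing, "official-czubi1928-openai-whisper-1-0-0")
    print(json.dumps(merged, indent=2))
```

Working on a deep copy leaves the original dictionary untouched, so the result can be reviewed before anything is written back to disk.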

Tags (AI)

#transcription #speech-to-text #audio-processing #local-ai #productivity
Safety Score: 5/5

Flags: file-read, file-write