local-whisper
Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with multiple model sizes.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/araa47/local-whisper
What This Skill Does
The local-whisper skill provides an offline speech-to-text (STT) interface for the OpenClaw AI agent, built on OpenAI's Whisper models. Cloud transcription services add network latency and send potentially sensitive audio to external servers; this skill runs entirely on your machine. Once the model weights are downloaded, it works without an internet connection. It takes audio files such as .wav, .mp3, or .m4a and converts them into text transcripts. Model sizes range from the lightweight 'tiny' model, suited to fast transcription on low-resource machines, to 'large-v3', the most accurate and most demanding option. This makes the skill a good fit for users who prioritize data sovereignty and local computation.
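For orientation, here is a minimal sketch of local transcription using the openai-whisper Python package that the skill installs. It is not the skill's own entry point, which this page does not document; `transcribe_file`, `is_supported`, and the extension set are hypothetical helpers written for illustration.

```python
from pathlib import Path

# Assumption: only the extensions named in the description above.
# The description says "like", so the real list may be longer.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is one of those listed above."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def transcribe_file(path: str, model_name: str = "tiny") -> str:
    """Transcribe one audio file locally and return the transcript text."""
    if not is_supported(path):
        raise ValueError(f"unsupported audio format: {path}")

    # Imported lazily because pulling in whisper (and torch) is slow.
    import whisper

    # The first call for a given model name downloads its weights;
    # later calls load them from the local cache and need no network.
    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return result["text"].strip()
```

Calling `transcribe_file("meeting_notes.wav", model_name="base")` would return the transcript as a single string, assuming the file exists and ffmpeg is available for audio decoding.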
Installation
To install the skill, run the following command in your terminal: clawhub install openclaw/skills/skills/araa47/local-whisper. The skill manages its own Python environment with uv, which keeps its dependencies isolated. Setup creates a .venv directory inside the skill folder and installs the required libraries, including click, openai-whisper, and torch. For a fresh install or a dependency update, navigate to ~/.clawdbot/skills/local-whisper and run the uv pip install command given in the skill's technical documentation, which links the CPU-optimized PyTorch wheel to your Python 3.12 environment.
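After setup, one way to confirm that the environment contains the libraries named above is a small check run with the skill's .venv interpreter. This is a sketch, not part of the skill; `missing_modules` is a hypothetical helper, and the import names assume that openai-whisper imports as `whisper`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names for the packages listed in the installation notes.
    required = ["click", "whisper", "torch"]
    missing = missing_modules(required)
    if missing:
        print(f"missing: {', '.join(missing)}")
    else:
        print("all dependencies present")
```

The check only looks up module specs without importing the modules, so it stays fast even though torch is large.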
Use Cases
- Journaling: Automatically transcribe voice-recorded thoughts into text files for your personal database.
- Meeting Summarization: Process long audio recordings of meetings or interviews to get a searchable text history.
- Accessibility: Convert voice inputs into text to facilitate better interaction with command-line tools and scripts.
- Archiving: Digitize old voice memos or analog recordings into structured text logs with precise timestamps.
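The archiving use case above mentions timestamped text logs. The sketch below shows one way to build such a log from the `segments` list that openai-whisper's `transcribe` returns, where each segment carries `start`, `end`, and `text`. The log layout and the helper names are assumptions for illustration, not the skill's output format.

```python
def format_timestamp(seconds: float) -> str:
    """Format a duration in seconds as HH:MM:SS, dropping the fraction."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_log(segments) -> str:
    """Render Whisper-style segments as one timestamped line per segment."""
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return "\n".join(lines)
```

Passing `result["segments"]` from a transcription to `segments_to_log` yields text that can be written straight to a log file.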
Example Prompts
- "Transcribe the file meeting_notes.wav using the turbo model for the best balance of speed and accuracy."
- "Convert the audio recording from my interview into a JSON file, including timestamps for every word detected."
- "Process the audio file 'daily_log.mp3' and save the output text into my current project directory."
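The second prompt above asks for JSON with per-word timestamps. In the openai-whisper library, calling `model.transcribe(path, word_timestamps=True)` adds a `words` list to each segment. The sketch below flattens such a result into a JSON string; the output schema and the name `words_to_json` are hypothetical, and the skill may structure its own JSON differently.

```python
import json


def words_to_json(result: dict) -> str:
    """Flatten a Whisper result with word timestamps into a JSON document."""
    words = [
        {"word": w["word"].strip(), "start": w["start"], "end": w["end"]}
        for seg in result.get("segments", [])
        # Segments lack a "words" key unless word_timestamps=True was used.
        for w in seg.get("words", [])
    ]
    payload = {"text": result.get("text", "").strip(), "words": words}
    return json.dumps(payload, indent=2)
```

If the transcription ran without word timestamps, the `words` array is simply empty rather than an error.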
Tips & Limitations
For best performance, make sure your CPU has headroom to spare. The 'tiny' and 'base' models run comfortably on modest hardware, while 'large-v3' needs substantial RAM and CPU time. Because the tool runs locally, transcription speed depends entirely on your hardware rather than on cloud servers. If transcription is slow, switch to the 'small' or 'base' model. Accuracy also varies: technical jargon and heavily accented speech are more likely to be mistranscribed. Review the transcript before relying on it when your use case involves critical or sensitive information.
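The advice to drop to a smaller model when transcription is slow can be sketched as a simple rule. The sketch assumes that "slow" means processing takes longer than the audio itself (a real-time factor above 1.0), which is an illustrative threshold and not one taken from the skill. The ladder lists only the model names mentioned on this page, and `suggest_model` is a hypothetical helper.

```python
# Model names from this page, ordered smallest to largest.
MODEL_LADDER = ["tiny", "base", "small", "large-v3"]


def realtime_factor(elapsed_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio length (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return elapsed_seconds / audio_seconds


def suggest_model(current: str, elapsed_seconds: float,
                  audio_seconds: float, max_rtf: float = 1.0) -> str:
    """Suggest the next smaller model if the last run was too slow."""
    index = MODEL_LADDER.index(current)
    too_slow = realtime_factor(elapsed_seconds, audio_seconds) > max_rtf
    if too_slow and index > 0:
        return MODEL_LADDER[index - 1]
    # Fast enough already, or no smaller model is left to try.
    return current
```

For example, a 60-second clip that took 120 seconds on 'large-v3' would produce a suggestion of 'small', while the same timing on 'tiny' leaves the model unchanged because nothing smaller exists.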
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-araa47-local-whisper": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: file-read, file-write, code-execution
Related Skills
ez-unifi
Use when asked to manage UniFi network - list/restart/upgrade devices, block/unblock clients, manage WiFi networks, control PoE ports, manage traffic rules, create guest vouchers, or any UniFi controller task. Works with UDM Pro/SE, Dream Machine, Cloud Key Gen2+, or self-hosted controllers.
ez-google
Use when asked to send email, check inbox, read emails, check calendar, schedule meetings, create events, search Google Drive, create Google Docs, read or write spreadsheets, find contacts, or any task involving Gmail, Google Calendar, Drive, Docs, Sheets, Slides, or Contacts. Agent-friendly with hosted OAuth - no API keys needed.
gemini-stt
Transcribe audio files using Google's Gemini API or Vertex AI
local-stt
Local STT with selectable backends - Parakeet (best accuracy) or Whisper (fastest, multilingual).
md-to-pdf
Convert markdown files to clean, formatted PDFs using reportlab