local-vosk
Local speech-to-text using Vosk. Lightweight, fast, fully offline. Perfect for transcribing Telegram voice messages, audio files, or any speech-to-text task without cloud APIs.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/sfkiwi/local-vosk
Local Vosk STT
Lightweight local speech-to-text using Vosk. Fully offline after model download.
Use Cases
- Telegram voice messages — transcribe .ogg voice notes automatically
- Audio files — any format ffmpeg supports
- Offline transcription — no API keys, no cloud, no costs
Quick Start
# Transcribe Telegram voice message
./skills/local-vosk/scripts/transcribe voice_message.ogg
# Transcribe any audio
./skills/local-vosk/scripts/transcribe audio.mp3
# With language (default: en-us)
./skills/local-vosk/scripts/transcribe audio.wav --lang en-us
Supported Formats
Any format ffmpeg can decode: ogg (Telegram), mp3, wav, m4a, webm, flac, etc.
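Vosk recognizers consume mono 16-bit PCM, so a compressed input has to be decoded first. The sketch below shows one way that step could look, assuming ffmpeg is on PATH and a 16 kHz target rate. The function names are hypothetical, and the skill's own transcribe script may do this differently.

```python
import subprocess
from typing import List


def ffmpeg_pcm_command(src: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that decodes `src` to raw mono 16-bit PCM on stdout."""
    return [
        "ffmpeg", "-nostdin", "-loglevel", "error",
        "-i", src,                # input file in any format ffmpeg can decode
        "-ar", str(sample_rate),  # resample to the target rate
        "-ac", "1",               # downmix to one channel
        "-f", "s16le",            # raw signed 16-bit little-endian samples
        "-",                      # write to stdout instead of a file
    ]


def decode_to_pcm(src: str, sample_rate: int = 16000) -> bytes:
    """Run ffmpeg and return the decoded PCM bytes (requires ffmpeg on PATH)."""
    result = subprocess.run(
        ffmpeg_pcm_command(src, sample_rate),
        check=True,
        stdout=subprocess.PIPE,
    )
    return result.stdout
```

Because the output is always the same raw PCM layout, the input container (ogg, mp3, m4a, and so on) does not matter to the recognizer.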
Models
Default model: vosk-model-small-en-us-0.15 (~40MB)
Other models available at https://alphacephei.com/vosk/models
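How the skill maps a language code to a model directory is not documented here. As an illustration only, the sketch below assumes models are unpacked under ~/vosk-models and follow the vosk-model[-size]-LANG-VERSION naming used on the models page. find_model_dir is a hypothetical helper, not part of the skill.

```python
from pathlib import Path
from typing import Optional


def find_model_dir(lang: str = "en-us", root: Optional[Path] = None) -> Optional[Path]:
    """Return the first unpacked model directory for `lang`, or None if none exists."""
    # Assumption: models live in ~/vosk-models unless another root is given.
    root = root if root is not None else Path.home() / "vosk-models"
    if not root.is_dir():
        return None
    # Match names such as vosk-model-small-en-us-0.15; skip leftover .zip files.
    matches = sorted(
        p for p in root.glob(f"vosk-model*-{lang}-*") if p.is_dir()
    )
    return matches[0] if matches else None
```

If several models for the same language are installed, this picks the alphabetically first one. A real implementation would probably let the caller choose.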
Setup (if not installed)
pip3 install vosk --user --break-system-packages
# Download model
mkdir -p ~/vosk-models && cd ~/vosk-models
wget https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip vosk-model-small-en-us-0.15.zip
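Once the package and model are in place, recognition can be driven from the Vosk Python bindings. The following is a minimal sketch, assuming the input is already mono 16-bit PCM at 16 kHz. transcribe_pcm and text_from_result are illustrative names, and this is not necessarily how the skill's transcribe script is written.

```python
import json


def text_from_result(result_json: str) -> str:
    """Extract the recognized text from a Vosk JSON result string."""
    return json.loads(result_json).get("text", "")


def transcribe_pcm(pcm: bytes, model_path: str,
                   sample_rate: int = 16000, chunk_bytes: int = 8000) -> str:
    """Feed raw PCM to a Vosk recognizer in chunks and join the recognized text."""
    # Imported lazily so the JSON helper above is usable without vosk installed.
    from vosk import KaldiRecognizer, Model

    recognizer = KaldiRecognizer(Model(model_path), sample_rate)
    parts = []
    for start in range(0, len(pcm), chunk_bytes):
        # AcceptWaveform returns true when an utterance boundary is reached.
        if recognizer.AcceptWaveform(pcm[start:start + chunk_bytes]):
            parts.append(text_from_result(recognizer.Result()))
    # Flush whatever audio remains after the last boundary.
    parts.append(text_from_result(recognizer.FinalResult()))
    return " ".join(p for p in parts if p)
```

Here model_path would point at the unpacked directory, for example ~/vosk-models/vosk-model-small-en-us-0.15 after expanding the home directory.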
Notes
- Quality is good for conversational speech
- For higher accuracy, use larger models or faster-whisper
- Processes audio at roughly 10x real time on typical hardware (about 6 seconds for a 1-minute clip)
- Telegram voice messages are .ogg format — works out of the box
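To check the speed figure above on a given machine, the real-time factor can be computed from the audio duration and the measured wall-clock time. This is a small sketch with hypothetical helper names, and it assumes 16 kHz mono 16-bit PCM when deriving duration from a byte count.

```python
def pcm_duration_seconds(num_bytes: int, sample_rate: int = 16000,
                         bytes_per_sample: int = 2) -> float:
    """Duration of mono PCM audio, given its size in bytes."""
    return num_bytes / (sample_rate * bytes_per_sample)


def realtime_factor(audio_seconds: float, elapsed_seconds: float) -> float:
    """How many seconds of audio are processed per second of wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return audio_seconds / elapsed_seconds


# A 60-second clip transcribed in 6 seconds runs at 10x real time.
print(realtime_factor(60.0, 6.0))
```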
Add to Configuration
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-sfkiwi-local-vosk": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Safety Note
ClawKit audits metadata but not runtime behavior. Use with caution.