transcribe

Transcribe audio files to text using local Whisper (Docker). Use when receiving voice messages, audio files (.mp3, .m4a, .ogg, .wav, .webm), or when asked to transcribe audio content.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/javicasper/transcribe

Transcribe

Local audio transcription using faster-whisper in Docker.

Installation

cd /path/to/skills/transcribe/scripts
chmod +x install.sh
./install.sh

This builds the Docker image whisper:local and installs the transcribe CLI.

Usage

transcribe /path/to/audio.mp3 [language]

Default language: es (Spanish)
Use auto for auto-detection
Outputs plain text to stdout

Examples

transcribe /tmp/voice.ogg          # Spanish (default)
transcribe /tmp/meeting.mp3 en     # English
transcribe /tmp/audio.m4a auto     # Auto-detect

Supported Formats

mp3, m4a, ogg, wav, webm, flac, aac
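A caller may want to check the extension before invoking the CLI. The following is a minimal sketch, not part of the skill itself: `is_supported_audio` is a hypothetical helper name, and it only compares the file extension against the list above without inspecting the file contents.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the skill): succeeds when the file
# extension is one of the formats listed in this README.
is_supported_audio() {
  local path="$1"
  local name="${path##*/}"   # strip any directory part
  local ext

  # A name without a dot has no extension to check.
  case "$name" in
    *.*) ;;
    *) return 1 ;;
  esac

  # Lowercase the extension so "VOICE.OGG" matches too.
  ext="$(printf '%s' "${name##*.}" | tr '[:upper:]' '[:lower:]')"

  case "$ext" in
    mp3|m4a|ogg|wav|webm|flac|aac) return 0 ;;
    *) return 1 ;;
  esac
}
```

It could be used as a guard, for instance `is_supported_audio "$file" && transcribe "$file"`.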

When Receiving Voice Messages

1. Save the audio attachment to a temp file
2. Run transcribe <path>
3. Include the transcription in your response
4. Clean up the temp file
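The steps above can be sketched as a small shell function. This is an illustration under assumptions rather than code from the skill: `transcribe_attachment` and the `TRANSCRIBE_BIN` override are hypothetical names, and the sketch assumes the `transcribe` CLI installed by install.sh is on PATH and behaves as described under Usage.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the skill): copy an attachment to a
# temp location, transcribe it, and clean up afterwards.
# TRANSCRIBE_BIN is an assumed override hook; it defaults to `transcribe`.
transcribe_attachment() {
  local src="$1"
  local lang="${2:-es}"                       # README default language
  local bin="${TRANSCRIBE_BIN:-transcribe}"
  local tmpdir tmpfile status

  # Step 1: save the attachment to a temp file, keeping its original name
  # (and therefore its extension).
  tmpdir="$(mktemp -d "${TMPDIR:-/tmp}/transcribe.XXXXXX")" || return 1
  tmpfile="$tmpdir/${src##*/}"
  if ! cp -- "$src" "$tmpfile"; then
    rm -rf -- "$tmpdir"
    return 1
  fi

  # Step 2: run the CLI. The text goes to stdout, so the caller can capture
  # it and include it in the response (step 3).
  status=0
  "$bin" "$tmpfile" "$lang" || status=$?

  # Step 4: clean up whether or not transcription succeeded.
  rm -rf -- "$tmpdir"
  return "$status"
}
```

A caller would capture the output with something like `text="$(transcribe_attachment /path/to/attachment.ogg auto)"`. Copying into a fresh temp directory keeps cleanup to a single `rm -rf` and leaves the original attachment untouched.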

Files

scripts/transcribe - CLI wrapper (bash)
scripts/install.sh - Installation script (includes Dockerfile inline)

Notes

Default model: small (fast). For higher accuracy at lower speed, edit install.sh to use large-v3.
Fully local, no API key needed

Metadata

Author: @javicasper

Stars: 1947

Updated: 2026-03-04

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-javicasper-transcribe": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.