What This Skill Does

The Audio Transcribe skill uses OpenAI's Whisper model, through the faster-whisper implementation, to convert spoken audio files into readable text. Because it runs entirely on your local machine, your voice recordings stay private, and you need no cloud API keys or recurring subscriptions. The skill lets the OpenClaw AI agent interpret, summarize, or act on voice commands and recorded memos.
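To make the transcription step concrete, here is a minimal sketch of how a local file might be passed through faster-whisper. It is not the skill's actual source: the function names `transcribe_file` and `join_segments` are hypothetical, and `device="cpu"` with `compute_type="int8"` is an assumption for a machine without a GPU.

```python
def join_segments(segments):
    """Concatenate the text of each segment into one transcript string."""
    parts = (seg.text.strip() for seg in segments)
    return " ".join(part for part in parts if part)


def transcribe_file(path, model_size="small"):
    """Transcribe a local audio file and return (language, transcript)."""
    # Imported lazily so join_segments works without the library installed.
    from faster_whisper import WhisperModel

    # Assumption: CPU inference with int8 quantization.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus detection info;
    # the audio is only processed as the generator is consumed.
    segments, info = model.transcribe(path)
    return info.language, join_segments(segments)
```

Calling `transcribe_file("voice_memo_001.ogg")` would return the detected language code and the full transcript as a single string.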

Installation

To integrate this skill, make sure the OpenClaw environment is initialized. First, install the required Python library by running 'pip install faster-whisper' in your terminal. Next, run the clawhub installation command: 'clawhub install openclaw/skills/skills/aktheknight/audio-transcribe'. The model weights are downloaded automatically the first time the skill runs. Check that your system meets the RAM requirements listed in the model table; larger models such as 'large-v3' need considerably more memory.
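Before running the skill for the first time, it can help to confirm that the Python dependency is actually importable. The check below is a small sketch using only the standard library; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # The pip package 'faster-whisper' provides the module 'faster_whisper'.
    if is_installed("faster_whisper"):
        print("faster-whisper is available")
    else:
        print("faster-whisper is missing; run: pip install faster-whisper")
```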

Use Cases

This skill suits anyone who relies on voice-to-text for productivity. You can use it to transcribe meetings, dictate research notes while away from a keyboard, or convert long audio lectures into searchable text documents. It is also useful for producing transcripts for content creators and for turning unstructured voice messages from clients into actionable task lists in your AI agent's memory.
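A transcript is easier to search and archive once it is written to a text file. The sketch below shows one way timestamped segments could be saved as a Markdown document. The function names and the choice of a bulleted layout are assumptions for illustration, not the skill's own output format.

```python
from pathlib import Path


def format_timestamp(seconds):
    """Render a time offset in seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def to_markdown(title, segments):
    """Build a Markdown document from (start_seconds, text) pairs."""
    lines = [f"# {title}", ""]
    for start, text in segments:
        lines.append(f"- **{format_timestamp(start)}** {text.strip()}")
    return "\n".join(lines) + "\n"


def save_markdown(path, title, segments):
    """Write the transcript to disk, creating the parent folder if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(to_markdown(title, segments), encoding="utf-8")
    return target
```

For example, `save_markdown("notes/meeting.md", "Meeting", [(0.0, "Welcome everyone.")])` would write a one-item transcript under a `notes` folder.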

Example Prompts

"Transcribe the meeting recording located at /root/downloads/meeting.ogg and summarize the key action items."
"Listen to my latest voice note in the inbox and turn it into a draft email response."
"Transcribe the audio file voice_memo_001.ogg and save the output as a Markdown file in my notes folder."

Tips & Limitations

For most general tasks, the 'small' model offers the best balance of speed and accuracy. If you are transcribing high-fidelity audio with multiple speakers, consider switching to 'large-v3' for better accuracy at the cost of slower processing. Transcription speed depends heavily on your hardware; if processing lags, check your memory usage (RAM, or VRAM when running on a GPU). The skill works only on pre-recorded audio files stored on your local disk; it cannot transcribe live, real-time streams.
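The model-selection advice above can be expressed as a small helper. This is a sketch under stated assumptions: the function name `pick_settings` is hypothetical, and pairing `"cuda"` with `"float16"` and `"cpu"` with `"int8"` is a common faster-whisper configuration rather than something this skill is documented to do.

```python
def pick_settings(multi_speaker=False, has_gpu=False):
    """Choose a model size and compute settings for a transcription job."""
    # 'small' is the general-purpose default; 'large-v3' trades speed for
    # accuracy on harder audio such as multi-speaker recordings.
    model_size = "large-v3" if multi_speaker else "small"
    if has_gpu:
        # Assumption: half precision on an NVIDIA GPU.
        device, compute_type = "cuda", "float16"
    else:
        # Assumption: int8 quantization to reduce memory use on CPU.
        device, compute_type = "cpu", "int8"
    return {
        "model_size": model_size,
        "device": device,
        "compute_type": compute_type,
    }
```

The returned values map onto the `WhisperModel(model_size, device=..., compute_type=...)` constructor arguments in faster-whisper.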
