audio-transcriber

Transforms audio recordings into professional Markdown documentation with LLM-generated summaries.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/bingze00000/audio-transcriber-pro

Purpose

This skill automates audio-to-text transcription and writes the result as professional Markdown. It extracts technical metadata (speakers, timestamps, language, file size, duration) and generates structured meeting minutes and executive summaries. It uses Faster-Whisper when available and falls back to Whisper, and it requires no configuration: there are no hardcoded paths or API keys, so it works the same way across projects.

Inspired by tools like Plaud, this skill transforms raw audio recordings into actionable documentation, making it ideal for meetings, interviews, lectures, and content analysis.
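The file-level metadata mentioned above (file size, duration) can be gathered before transcription starts. The following is a minimal sketch, assuming ffprobe (shipped with ffmpeg) is installed. The function names format_duration and audio_metadata are hypothetical and are not taken from the skill itself.

```shell
#!/usr/bin/env bash

# Hypothetical helper: format a duration in seconds as HH:MM:SS.
# Fractional seconds (as ffprobe reports them) are truncated.
format_duration() {
    local secs="${1%.*}"
    # Treat empty or non-numeric input (for example "N/A") as zero.
    case "$secs" in
        ''|*[!0-9]*) secs=0 ;;
    esac
    printf '%02d:%02d:%02d\n' $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}

# Hypothetical helper: print file size and duration for one audio file.
audio_metadata() {
    local file="$1" size duration
    [ -f "$file" ] || { echo "File not found: $file" >&2; return 1; }

    # wc -c pads its output with spaces on some platforms, so strip them.
    size="$(wc -c < "$file" | tr -d ' ')"
    echo "File size: ${size} bytes"

    if command -v ffprobe >/dev/null 2>&1; then
        # Ask ffprobe for the container duration only, as a bare number.
        duration="$(ffprobe -v error -show_entries format=duration \
            -of default=noprint_wrappers=1:nokey=1 "$file" || true)"
        echo "Duration: $(format_duration "$duration")"
    else
        echo "Duration: unknown (ffprobe not found)"
    fi
}
```

For example, `format_duration 3725.48` prints `01:02:05`. Speaker and language metadata come from the transcription engine rather than from the file and are not covered by this sketch.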

When to Use

Invoke this skill when:

User needs to transcribe audio/video files to text
User wants meeting minutes automatically generated from recordings
User requires speaker identification (diarization) in conversations
User needs subtitles/captions (SRT, VTT formats)
User wants executive summaries of long audio content
User asks variations of "transcribe this audio", "convert audio to text", "generate meeting notes from recording"
User has audio files in common formats (MP3, WAV, M4A, OGG, FLAC, WEBM)
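The format list above can be turned into a quick pre-check before any transcription work begins. This is a minimal sketch that only inspects the file extension; the helper name is_supported_audio is hypothetical, and the skill may well detect formats differently.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed (exit status 0) when the file name ends in
# one of the extensions listed above, compared case-insensitively.
is_supported_audio() {
    local name ext
    name="$(basename "$1")"

    # A name without a dot has no extension to check.
    case "$name" in
        *.*) ;;
        *) return 1 ;;
    esac

    # Lowercase with tr so this also works on older bash versions.
    ext="$(printf '%s' "${name##*.}" | tr '[:upper:]' '[:lower:]')"
    case "$ext" in
        mp3|wav|m4a|ogg|flac|webm) return 0 ;;
        *) return 1 ;;
    esac
}
```

A caller could use it as `is_supported_audio "$FILE" || echo "Unsupported format: $FILE"`. Checking the extension is a cheap heuristic; it does not confirm that the file contents are valid audio.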

Workflow

Step 0: Discovery (Auto-detect Transcription Tools)

Objective: Identify available transcription engines without user configuration.

Actions:

Run detection commands to find installed tools:

# Check for Faster-Whisper (preferred - 4-5x faster)
if python3 -c "import faster_whisper" 2>/dev/null; then
    TRANSCRIBER="faster-whisper"
    echo "✅ Faster-Whisper detected (optimized)"
# Fallback to original Whisper
elif python3 -c "import whisper" 2>/dev/null; then
    TRANSCRIBER="whisper"
    echo "✅ OpenAI Whisper detected"
else
    TRANSCRIBER="none"
    echo "⚠️  No transcription tool found"
fi

# Check for ffmpeg (audio format conversion)
if command -v ffmpeg &>/dev/null; then
    echo "✅ ffmpeg available (format conversion enabled)"
else
    echo "ℹ️  ffmpeg not found (limited format support)"
fi
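The ffmpeg check above enables format conversion. As a minimal sketch of what such a conversion might look like, the following normalizes an input file to 16 kHz mono WAV, the sample rate and channel layout Whisper resamples to internally. The names wav_path_for and convert_to_wav are hypothetical, and the skill's actual conversion settings are not shown in this excerpt.

```shell
#!/usr/bin/env bash

# Hypothetical helper: derive the output WAV path for an input file.
wav_path_for() {
    local input="$1" out_dir="$2" base
    base="$(basename "$input")"
    # Strip the last extension (if any) and append .wav.
    printf '%s/%s.wav\n' "$out_dir" "${base%.*}"
}

# Hypothetical helper: convert any ffmpeg-readable input to
# 16 kHz, mono, 16-bit PCM WAV and print the output path on success.
convert_to_wav() {
    local input="$1" out_dir="$2" output
    output="$(wav_path_for "$input" "$out_dir")"

    if ! command -v ffmpeg >/dev/null 2>&1; then
        echo "ffmpeg not found; skipping conversion" >&2
        return 1
    fi

    mkdir -p "$out_dir"
    # -nostdin keeps ffmpeg from reading the terminal, -y overwrites output,
    # -ar sets the sample rate, -ac sets the channel count.
    ffmpeg -nostdin -loglevel error -y -i "$input" \
        -ar 16000 -ac 1 -c:a pcm_s16le "$output" && printf '%s\n' "$output"
}
```

For example, `wav_path_for "/a/b/talk.m4a" "/tmp/out"` prints `/tmp/out/talk.wav`. Keeping the path logic separate from the ffmpeg call makes it easy to check where output will land without running a conversion.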

If no transcriber is found:

Offer automatic installation using the provided script:

echo "⚠️  No transcription tool found"
echo ""
echo "🔧 Auto-install dependencies? (Recommended)"
read -r -p "Run installation script? [Y/n]: " AUTO_INSTALL

if [[ ! "$AUTO_INSTALL" =~ ^[Nn] ]]; then
    # Get skill directory (works for both repo and symlinked installations)
    SKILL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
    
    # Run installation script
    if [[ -f "$SKILL_DIR/scripts/install-requirements.sh" ]]; then
        bash "$SKILL_DIR/scripts/install-requirements.sh"
    else
        echo "❌ Installation script not found"
        echo ""
        echo "📦 Manual installation:"
        echo "  pip install faster-whisper  # Recommended"
        echo "  pip install openai-whisper  # Alternative"
        echo "  brew install ffmpeg         # Optional (macOS)"
        exit 1
    fi
    
    # Verify installation succeeded
    if python3 -c "import faster_wh...

Read Full Documentation on GitHub

Metadata

Author: @bingze00000

Stars: 4473

Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-bingze00000-audio-transcriber-pro": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Related Skills

youtube-summarizer

Automatically fetch YouTube video transcripts, generate structured summaries, and send full transcripts to messaging platforms. Detects YouTube URLs and provides metadata, key insights, and downloadable transcripts.

abe238 4473

youtube-transcribe

Automatically transcribes YouTube videos and generates timestamped transcripts.

bodysuperman 4190

voice-note-to-midi

Convert voice notes, humming, and melodic audio recordings to quantized MIDI files using ML-based pitch detection and intelligent post-processing

danbennettuk 3376

elevenlabs-stt

Speech-to-text using ElevenLabs Scribe V2. Use this skill when the user wants speech recognition, audio transcription, or speech-to-text, or mentions elevenlabs or scribe.

hexiaochun 2387

spaces-listener

Record, transcribe, and summarize X/Twitter Spaces — live or replays. Auto-downloads audio via yt-dlp, transcribes with Whisper, and generates AI summaries.

jamesalmeida 2032