Official Verified media Safety 4/5

TubeScribe

YouTube video summarizer with speaker detection, formatted documents, and audio output. Works out of the box with macOS built-in TTS. Optional recommended tools (pandoc, ffmpeg, mlx-audio) enhance quality. Requires internet for YouTube access. No paid APIs or subscriptions. Use when user sends a YouTube URL or asks to summarize/transcribe a YouTube video.

Why use this skill?

Transform any YouTube video into a polished document with speaker detection, timestamps, and audio summaries. Free, local, and private processing.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/matusvojtek/tubescribe

What This Skill Does

TubeScribe turns a YouTube video into a structured, searchable document. Using YouTube's metadata and captions, it handles transcription, speaker diarization, and summary generation on your local machine. The output is a text document in DOCX, HTML, or Markdown, paired with an audio summary for portable listening. It is designed for long-form content such as interviews, educational lectures, and news, so you can search and digest a video without watching it from start to finish.

Installation

Ensure your system has Python 3 installed. Run the setup script to verify all local dependencies:

python skills/tubescribe/scripts/setup.py

This script checks for the tools the skill uses, including pandoc for document formatting, ffmpeg for media handling, and the Kokoro TTS engine for generating audio summaries. You can also install the skill package directly using the OpenClaw hub command:

clawhub install openclaw/skills/skills/matusvojtek/tubescribe
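
The kind of dependency check described above can be sketched in a few lines of Python. This is only an illustration, not the contents of the actual setup.py: the tool list and the function names check_tools and format_report are assumptions made for this example, and the real script may check more tools or report differently.

```python
import shutil

# Hypothetical tool list, based on the command-line tools the listing names;
# the real setup.py may check more tools or use a different method.
OPTIONAL_TOOLS = ("pandoc", "ffmpeg")


def check_tools(tools=OPTIONAL_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in tools}


def format_report(status):
    """Render one 'name: found at <path>' or 'name: missing' line per tool."""
    lines = []
    for name, path in status.items():
        state = f"found at {path}" if path else "missing"
        lines.append(f"{name}: {state}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(check_tools()))
```

Running the file prints one line per tool, which makes it easy to see what is missing before the first summary is requested.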

Use Cases

  • Academic Research: Quickly summarize long university lectures and extract key timestamps for specific topics.
  • Content Curation: Generate concise written reports from hours-long podcast interviews for later review.
  • Accessibility: Create audio summaries for users who prefer listening to content rather than watching visual media.
  • Corporate Meetings: Process recorded YouTube meetings or webinars into formatted documents with identified speakers and key quotes.

Example Prompts

  1. "Summarize this video and give me the key points: [YouTube URL]"
  2. "Create a transcript of this interview, label the speakers, and generate an audio summary for my commute: [YouTube URL]"
  3. "Can you watch this lecture for me, write a document with clickable timestamps, and extract the main quotes? [YouTube URL]"
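
The "clickable timestamps" requested in the third prompt can be pictured as links that open the video at a given offset. The sketch below assumes the standard YouTube watch URL with a t parameter in seconds; the function names and the video ID used in the usage note are hypothetical, and TubeScribe's own link format is not documented here.

```python
from urllib.parse import urlencode


def timestamp_link(video_id, seconds):
    """Build a YouTube watch URL that starts playback at the given offset."""
    query = urlencode({"v": video_id, "t": f"{int(seconds)}s"})
    return f"https://www.youtube.com/watch?{query}"


def markdown_timestamp(video_id, seconds):
    """Render an offset as a Markdown link labeled H:MM:SS or M:SS."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        label = f"{hours}:{minutes:02d}:{secs:02d}"
    else:
        label = f"{minutes}:{secs:02d}"
    return f"[{label}]({timestamp_link(video_id, total)})"
```

For a placeholder video ID such as "abc123", markdown_timestamp("abc123", 65) yields a link labeled 1:05 that points at the 65-second mark.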

Tips & Limitations

  • Non-Blocking Workflow: TubeScribe runs as a sub-agent. Feel free to continue chatting with OpenClaw while the transcription processes in the background.
  • Privacy First: Because all processing happens locally, your data is never sent to external servers for transcription or summarization.
  • Limitations: The skill requires internet access to fetch video metadata and captions. Ensure you are connected to the network when requesting a new summary. Results depend on the quality of the video's original captions.

Metadata

Stars: 1401
Views: 1
Updated: 2026-02-24

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-matusvojtek-tubescribe": {
      "enabled": true,
      "auto_update": true
    }
  }
}
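
If clawhub.json already lists other plugins, pasting the snippet over the file would drop them. A minimal sketch of merging the entry instead is shown below. It assumes only the structure visible in the snippet above (a top-level "plugins" object keyed by plugin ID); the helper names enable_plugin and update_file are hypothetical and not part of any ClawHub tooling.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-matusvojtek-tubescribe"  # key taken from the snippet above


def enable_plugin(config, plugin_id=PLUGIN_ID, auto_update=True):
    """Return a copy of config with the plugin enabled, keeping other entries."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[plugin_id] = {"enabled": True, "auto_update": auto_update}
    merged["plugins"] = plugins
    return merged


def update_file(path):
    """Read the config at path (if present), merge the entry, and write it back."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(enable_plugin(config), indent=2) + "\n")
```

Calling update_file("clawhub.json") from the directory that holds the file adds or refreshes the TubeScribe entry and leaves every other key untouched.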

Tags (AI)

#transcription #youtube #summarizer #productivity #media
Safety Score: 4/5

Flags: network-access, file-write, file-read, code-execution