Official Verified media Safety 4/5

youtube-transcriber

One-command YouTube video transcription. Uses native subtitles when they exist; otherwise downloads the audio and transcribes it with the OpenAI Whisper API, so it works even when YouTube subtitles are disabled. Use when asked to "transcribe this video", "get transcript", "what does this video say", or when YouTube captions are unavailable.

Why use this skill?

Convert YouTube videos to text with a single command. The skill falls back to OpenAI Whisper, so you still get an accurate transcript when captions are disabled.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/edisonchenai/youtube-transcriber

What This Skill Does

The YouTube Transcriber skill converts a YouTube video into text with a single command. Whether you need to extract information from a lecture, summarize a long-form video, or keep a reference copy of spoken dialogue, it handles the process end to end. It first checks for native YouTube subtitles, which saves both time and API cost. If they are missing or disabled, it falls back to downloading the audio and sending it to the OpenAI Whisper API for transcription. This gives you transcripts in 99+ languages, regardless of whether the video has captions enabled.
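The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: transcribe, fetch_subtitles, and whisper_transcribe are hypothetical names, and the two callables stand in for the real subtitle lookup and the Whisper API request.

```python
from typing import Callable, Optional, Tuple


def transcribe(
    url: str,
    fetch_subtitles: Callable[[str], Optional[str]],
    whisper_transcribe: Callable[[str], str],
    force_whisper: bool = False,
) -> Tuple[str, str]:
    """Return (source, text), preferring free native subtitles.

    All names here are hypothetical; the callables are placeholders
    for the subtitle lookup and the Whisper-based transcription.
    """
    if not force_whisper:
        text = fetch_subtitles(url)
        if text:  # None or "" means captions are missing or disabled
            return "subtitles", text
    # Fallback (or forced) path: transcribe the downloaded audio.
    return "whisper", whisper_transcribe(url)
```

Passing the two steps in as callables keeps the decision logic testable without network access; setting force_whisper=True mirrors the --force-whisper flag by skipping the subtitle check.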

Installation

Before installing, make sure the system-level dependencies are present. The script relies on yt-dlp for audio extraction and ffmpeg for audio conversion. Both can be installed via Homebrew (brew install yt-dlp ffmpeg) or your system package manager; yt-dlp is also available through pip. You must also export OPENAI_API_KEY as an environment variable, either in your terminal session or in your shell configuration file. Once the prerequisites are met, install the skill via the OpenClaw hub: clawhub install openclaw/skills/skills/edisonchenai/youtube-transcriber.
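A quick way to confirm the prerequisites before installing is a small check like the one below. It is only a sketch: missing_prerequisites is a hypothetical helper, not part of the skill, and it assumes both tools are expected on PATH under the names yt-dlp and ffmpeg.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def missing_prerequisites(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """List the prerequisites that are not available (hypothetical helper)."""
    env = os.environ if env is None else env
    # shutil.which returns None when an executable is not found on PATH.
    missing = [tool for tool in ("yt-dlp", "ffmpeg") if which(tool) is None]
    # An unset or empty key is treated as missing.
    if not env.get("OPENAI_API_KEY"):
        missing.append("OPENAI_API_KEY")
    return missing
```

Calling missing_prerequisites() with no arguments inspects the current environment; an empty list means everything the listing asks for is in place.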

Use Cases

  • Academic Research: Extracting transcripts from educational videos or webinars for study notes.
  • Content Creation: Repurposing video content into blog posts or newsletters.
  • Accessibility: Creating text transcripts for hearing-impaired users or for SEO indexing purposes.
  • Efficiency: Quickly searching through lengthy video content by keywords without watching the entire video.

Example Prompts

  1. "Transcribe this video for me: https://www.youtube.com/watch?v=dQw4w9WgXcQ"
  2. "Can you give me a full transcript of the video I just sent, and please save it to my notes folder?"
  3. "What does the speaker say in this clip? Please provide a high-accuracy transcript using the Whisper API."

Tips & Limitations

  • Cost Efficiency: Prefer native subtitles when they are available, since they are free. Use the --force-whisper flag only when you need higher accuracy or the native captions are insufficient.
  • API Limits: Whisper API usage costs roughly $0.006 per minute of audio. Long videos take longer to process because the audio is compressed to fit the API's 25MB upload limit.
  • Maintenance: If you receive a 403 error, update yt-dlp (pip install -U yt-dlp, or brew upgrade yt-dlp if you installed it via Homebrew), as YouTube frequently changes its anti-scraping measures.
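To make the cost and size figures above concrete, here is a rough back-of-the-envelope calculation. It is a sketch under stated assumptions: the function names are hypothetical, the $0.006 rate is the approximate figure quoted above, and 25MB is taken as 25 × 1024 × 1024 bytes, which may not match how the API counts the limit.

```python
WHISPER_USD_PER_MINUTE = 0.006        # approximate rate quoted in the listing
MAX_UPLOAD_BYTES = 25 * 1024 * 1024   # assumption: 25MB counted as MiB


def estimate_cost_usd(duration_seconds: float) -> float:
    """Approximate Whisper cost for audio of the given length."""
    return duration_seconds / 60 * WHISPER_USD_PER_MINUTE


def max_bitrate_kbps(duration_seconds: float) -> float:
    """Highest average audio bitrate that keeps the file under the limit."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return MAX_UPLOAD_BYTES * 8 / duration_seconds / 1000
```

For a one-hour video, estimate_cost_usd(3600) comes to about $0.36, and max_bitrate_kbps(3600) is a little over 58 kbps, which is why long recordings have to be compressed before upload.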

Metadata

Stars: 2387
Views: 0
Updated: 2026-03-09

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-edisonchenai-youtube-transcriber": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags (AI)

#transcription #youtube #whisper #media-processing #productivity

Safety Score: 4/5

Flags: network-access, file-write, file-read, external-api

Related Skills

reddit-assistant

Reddit content creation assistant for indie developers and product builders. Creates authentic posts, researches communities, tracks real performance data via Reddit API. Triggers on: "write reddit post", "draft reddit", "post to reddit", "reddit content", "find subreddits for", "which subreddits", "check reddit performance", "reddit analytics", "reddit results", "log reddit post", "reddit post ideas", "reddit strategy"

edisonchenai 2387

protea: Self-evolving life agent

Self-evolving artificial life agent. Three-ring architecture: Ring 0 (Sentinel) supervises, Ring 1 (Intelligence) drives LLM-powered evolution, Ring 2 (Evolvable Code) is the living program that self-restructures, self-reproduces, and self-evolves. Supports Anthropic, OpenAI, DeepSeek, and Qwen as LLM providers. Includes fitness scoring, gene pool inheritance, tiered memory, skill crystallization, Telegram bot, and web dashboard.

edisonchenai 2387

Edison Autopilot Post X

Skill by edisonchenai

edisonchenai 2387

edison-youtube-full

Complete YouTube toolkit for agents: search videos, fetch metadata, browse channels and playlists, and pull transcripts. Use when you need comprehensive YouTube Data API access (search, channels, playlists) plus transcript extraction in a single workflow.

edisonchenai 2387

edison-agent-reach

Use the internet: search, read, and interact with 13+ platforms including Twitter/X, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu (小红书), Douyin (抖音), WeChat Articles (微信公众号), LinkedIn, Boss直聘, RSS, Exa web search, and any web page. Use when: (1) user asks to search or read any of these platforms, (2) user shares a URL from any supported platform, (3) user asks to search the web, find information online, or research a topic, (4) user asks to post, comment, or interact on supported platforms, (5) user asks to configure or set up a platform channel. Triggers: "搜推特", "搜小红书", "看视频", "搜一下", "上网搜", "帮我查", "全网搜索", "search twitter", "read tweet", "youtube transcript", "search reddit", "read this link", "看这个链接", "B站", "bilibili", "抖音视频", "微信文章", "公众号", "LinkedIn", "GitHub issue", "RSS", "search online", "web search", "find information", "research", "帮我配", "configure twitter", "configure proxy", "帮我安装".

edisonchenai 2387