video-transcript-downloader
Download videos, audio, subtitles, and clean paragraph-style transcripts from YouTube and any other yt-dlp supported site. Use when asked to “download this video”, “save this clip”, “rip audio”, “get subtitles”, “get transcript”, or to troubleshoot yt-dlp/ffmpeg and formats/playlists.
Why use this skill?
Efficiently download videos, audio, and clean paragraph transcripts from YouTube and other sites. Simplify your media ingestion with this OpenClaw AI skill.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/steipete/video-transcript-downloader
What This Skill Does
The video-transcript-downloader is a command-line tool that brings web video content into your local environment. It handles the details of working with YouTube and the other sites supported by yt-dlp, and it provides two primary functions: clean, readable text extraction and high-quality media retrieval. For transcripts, it prefers youtube-transcript-plus for YouTube links and falls back to raw subtitle extraction via yt-dlp for other sources. It then cleans the result into a single, cohesive paragraph, a format well suited to Large Language Models or quick reading.
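The backend choice described above can be pictured as a simple check on the URL host. This is only a minimal sketch: the function name pick_transcript_backend and the list of host patterns are hypothetical, and the skill's real selection logic may differ (for example, it may also fall back when the first method fails).

```shell
# Hypothetical helper: report which transcript backend a URL would be routed to.
# The host patterns are an assumption, not taken from the skill's source.
pick_transcript_backend() {
  case "$1" in
    *://youtube.com/*|*://www.youtube.com/*|*://m.youtube.com/*|*://youtu.be/*)
      echo "youtube-transcript-plus" ;;
    *)
      # Any other site: raw subtitle extraction via yt-dlp
      echo "yt-dlp" ;;
  esac
}

# Usage: pick_transcript_backend "https://youtube.com/watch?v=xyz"
```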
Beyond text, this skill acts as a robust media downloader. It allows you to grab full videos, isolated audio tracks, or standalone subtitle files. Because it interfaces directly with yt-dlp and ffmpeg, it gives you granular control over the output, enabling you to list specific format IDs, remux videos into MP4 containers, or pass arbitrary flags for specialized downloads. It is an essential toolkit for researchers, content creators, and developers who need to ingest video information without manual friction.
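For orientation, here is a sketch of the underlying yt-dlp invocations that correspond to these operations. It calls yt-dlp directly rather than the skill's own wrapper, whose command names are not shown on this page. The function names, the DRY_RUN switch, and the choice of mp3 as the audio format are assumptions made for illustration; the yt-dlp flags themselves are standard.

```shell
# With DRY_RUN=1 the command is printed instead of executed (illustrative switch).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# List the format IDs available for a URL.
list_formats() { run yt-dlp -F "$1"; }

# Download a video into directory $2 and remux it into an MP4 container.
download_mp4() { run yt-dlp --remux-video mp4 -P "$2" "$1"; }

# Extract the audio track only, at the best available quality.
extract_audio() { run yt-dlp -x --audio-format mp3 --audio-quality 0 "$1"; }

# Fetch subtitle files (including auto-generated ones) without the video.
fetch_subs() { run yt-dlp --skip-download --write-subs --write-auto-subs --sub-langs en "$1"; }

# Usage: DRY_RUN=1 list_formats "https://youtube.com/watch?v=xyz"
```

Remuxing only changes the container, so it is fast and lossless, but it fails when the target container cannot hold the source codecs.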
Installation
To integrate this skill into your environment, navigate to the source directory and prepare the dependencies:
cd ~/Projects/agent-scripts/skills/video-transcript-downloader && npm ci
Ensure that the underlying system dependencies are present on your machine. You will need yt-dlp and ffmpeg installed. You can install them via Homebrew using brew install yt-dlp ffmpeg and verify their presence by checking their versions via yt-dlp --version and ffmpeg -version.
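A quick way to confirm both tools are on your PATH before running the skill is sketched below. The helper name missing_tools is hypothetical; it relies only on the standard command -v builtin.

```shell
# Print the name of every listed tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage: missing_tools yt-dlp ffmpeg
# No output means both dependencies are installed.
```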
Use Cases
This skill is perfect for automated research workflows. Use it when you need to:
- Generate concise summaries from long-form YouTube lectures or technical tutorials.
- Archive media content to your local filesystem for offline access or backup.
- Extract clean audio from video files for transcription services or editing workflows.
- Troubleshoot video format issues by listing the available formats and remuxing into a different container (remuxing repackages the existing streams without re-encoding them).
Example Prompts
- "Download the video at this URL https://youtube.com/watch?v=xyz and save it to my Downloads folder."
- "Get the transcript for this video link as a clean paragraph, but make sure to include the timestamps."
- "Extract the audio only from this clip and save it to my Desktop, keeping the file format as high quality as possible."
Tips & Limitations
By default, the skill strips bracketed cues like [Music] or [Applause] to maintain a clean reading experience. If you are conducting accessibility research, use the --keep-brackets flag. For transcript generation, timestamps are disabled by default; only request them when strictly necessary to keep your context window clean. Remember that complex format operations should be passed after the -- separator to ensure they are correctly parsed as yt-dlp arguments rather than skill-specific flags.
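To make the default cleanup concrete, the sketch below strips bracketed cues and joins subtitle lines into one paragraph. It illustrates the behavior described above and is not the skill's actual code. The function name clean_transcript and the input shape (plain text on stdin, one cue per line, no timestamps) are assumptions.

```shell
# Read subtitle text on stdin, drop [bracketed] cues, and emit one paragraph.
clean_transcript() {
  sed -E 's/\[[^]]*\]//g' |       # remove cues such as [Music] or [Applause]
    tr '\n' ' ' |                 # join all lines into a single line
    tr -s ' ' |                   # collapse runs of spaces
    sed -E 's/^ +//; s/ +$//'     # trim leading and trailing spaces
}

# Usage: printf 'hello\n[Music]\nworld\n' | clean_transcript
```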
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-steipete-video-transcript-downloader": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: network-access, file-write, file-read, code-execution
Related Skills
swiftui-liquid-glass
Implement, review, or improve SwiftUI features using the iOS 26+ Liquid Glass API. Use when asked to adopt Liquid Glass in new SwiftUI UI, refactor an existing feature to Liquid Glass, or review Liquid Glass usage for correctness, performance, and design alignment.
qmd
Local search/indexing CLI (BM25 + vectors + rerank) with MCP mode.
songsee
Generate spectrograms and feature-panel visualizations from audio with the songsee CLI.
summarize
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
bird
X/Twitter CLI for reading, searching, and posting via cookies or Sweetistics.