video-analyzer
Download, transcribe, and analyze videos from YouTube, X/Twitter, and TikTok with local Whisper processing. Perfect for extracting TL;DRs, timestamps, and actionable insights.
Why use this skill?
Analyze, transcribe, and summarize YouTube, Twitter, and TikTok videos locally with OpenClaw. Extract key moments and actionable insights using Whisper-powered AI.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/minilozio/video-analyzer-skill
What This Skill Does
The Video Analyzer turns raw video content into actionable insights. It combines yt-dlp for downloading media and extracting metadata with whisper-cpp for fast local speech-to-text, so OpenClaw can process content from YouTube, X/Twitter, and TikTok without cloud-based subscription services. The skill downloads, transcribes, and structures the content, turning hours of video into concise, readable summaries.
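The download-then-transcribe flow described above can be sketched as two command lines assembled in Python. This is a minimal sketch, not the skill's actual implementation: the helper names, file layout, and model path are hypothetical, and it assumes `yt-dlp`, `ffmpeg`, and a whisper.cpp build that provides the `whisper-cli` binary are available on PATH.

```python
import subprocess
from pathlib import Path


def build_download_cmd(url: str, out_stem: Path) -> list[str]:
    """Build a yt-dlp command that extracts audio as 16 kHz mono WAV.

    whisper.cpp expects 16 kHz WAV input, so ffmpeg is asked to resample.
    """
    return [
        "yt-dlp",
        "-x",                                    # extract audio only
        "--audio-format", "wav",                 # convert the audio to WAV
        "--postprocessor-args", "ffmpeg:-ar 16000 -ac 1",
        "-o", f"{out_stem}.%(ext)s",             # output filename template
        url,
    ]


def build_transcribe_cmd(wav_path: Path, model_path: Path) -> list[str]:
    """Build a whisper-cli command that writes a timestamped .srt file."""
    return [
        "whisper-cli",
        "-m", str(model_path),   # path to a ggml Whisper model (assumed)
        "-f", str(wav_path),     # input audio file
        "-l", "auto",            # auto-detect the spoken language
        "-osrt",                 # emit subtitles with timestamps
    ]


def run_pipeline(url: str, workdir: Path, model_path: Path) -> Path:
    """Download audio, transcribe it, and return the expected .srt path."""
    out_stem = workdir / "audio"
    subprocess.run(build_download_cmd(url, out_stem), check=True)
    wav_path = out_stem.with_suffix(".wav")
    subprocess.run(build_transcribe_cmd(wav_path, model_path), check=True)
    # By default whisper-cli names the subtitle file after the input
    # file, appending ".srt".
    return Path(f"{wav_path}.srt")
```

Keeping the command builders separate from `run_pipeline` makes it possible to inspect or log the exact commands before anything is downloaded.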
Installation
To add this capability to your agent, run the following command in your terminal:
clawhub install openclaw/skills/skills/minilozio/video-analyzer-skill
Make sure uv is available in your local environment; the skill's Python dependencies are installed through it, as defined in the skill's environment configuration.
Use Cases
- Research & Learning: Instantly summarize long-form educational videos or lectures to identify core concepts without sitting through the entire runtime.
- Content Creation: Extract specific quotes, timestamps, or key talking points from social media clips to repurpose them for blog posts or articles.
- Media Archiving: Automatically download and back up important video or audio content to your desktop for offline access.
- Language Learning: Transcribe foreign-language content into text to aid comprehension and vocabulary building.
Example Prompts
- "Summarize this YouTube link for me: [URL]. I need the key points and actionable takeaways in a markdown format."
- "I'm looking for the specific part in this video where the speaker discusses the new API release. Can you find the timestamp for me?"
- "Download the audio from this TikTok link to my desktop so I can listen to it later."
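A timestamp lookup like the one in the second prompt can be approximated with a keyword search over transcript segments. This is a minimal sketch under stated assumptions: the `Segment` type, both function names, and the sample transcript are hypothetical, and the agent itself may locate passages by meaning rather than by exact substring match.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the beginning of the video
    text: str     # transcribed text for this segment


def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def find_mentions(segments: list[Segment], query: str) -> list[tuple[str, str]]:
    """Return (timestamp, text) pairs for segments containing the query."""
    needle = query.casefold()
    return [
        (format_timestamp(seg.start), seg.text)
        for seg in segments
        if needle in seg.text.casefold()
    ]


# Illustrative transcript; these segments are invented for the example.
segments = [
    Segment(0.0, "Welcome to the show."),
    Segment(754.2, "Now let's talk about the new API release."),
    Segment(1310.0, "Thanks for watching."),
]
print(find_mentions(segments, "api release"))
# [('00:12:34', "Now let's talk about the new API release.")]
```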
Tips & Limitations
- Quality Control: Use the --quality max flag for critical accuracy, such as transcribing technical tutorials or legal discussions where every word matters.
- Multilingualism: The skill is natively multilingual. If you are a non-English speaker, your agent will detect the transcript's content and translate the summary into your preferred language.
- Performance: Local Whisper processing is hardware-dependent. Long videos will require more time and system resources. If your machine is low on RAM, stick to the normal quality setting for everyday tasks.
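The quality guidance above can be expressed as a small decision helper. This is only a sketch: the function name and parameters are hypothetical, and the RAM threshold is left to the caller because the skill does not document memory requirements for either setting.

```python
def choose_quality(critical: bool, free_ram_gb: float, min_ram_gb_for_max: float) -> str:
    """Pick a value for the --quality flag.

    Returns "max" only when accuracy is critical and the machine has at
    least the caller-supplied amount of free RAM; otherwise "normal".
    """
    if critical and free_ram_gb >= min_ram_gb_for_max:
        return "max"
    return "normal"
```

For example, `choose_quality(critical=True, free_ram_gb=4.0, min_ram_gb_for_max=8.0)` returns `"normal"`, because the caller's own threshold for the heavier setting is not met.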
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-minilozio-video-analyzer-skill": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-write, file-read, network-access
Related Skills
x-research
X/Twitter research skill powered by TwitterAPI.io. Agentic search, profile analysis, thread reading, watchlists, and sourced briefings. Use when asked to search X/Twitter, check what people are saying about a topic, monitor accounts, or research crypto/tech narratives on X.
agent-arena
Participate in Agent Arena chat rooms with your real personality (SOUL.md + MEMORY.md). Auto-polls for turns and responds as your true self.
tweet-composer
Score and optimize tweets based on X's real open-source ranking algorithm. Analyzes draft tweets against the actual ranking code — not generic tips. Use when: composing tweets, optimizing drafts for reach, planning threads, analyzing why a tweet performed well/poorly, or asking for posting strategy advice.
nano-banana-prompting-skill
Transform natural language image requests into optimized structured prompts for Gemini image generation. Automatically detects style and builds the perfect prompt — cinematic, illustration, anime, 3D, watercolor, product, and more.