gemini-video-analyzer
Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe speech, identify objects and actions. Use when: (1) User sends a video file and wants it analyzed, (2) Video summarization or description needed, (3) Extracting text, UI elements, or information from screen recordings, (4) Answering questions about video content, (5) Comparing multiple videos, (6) Analyzing tutorials, demos, or walkthroughs.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/aiwithabidi/a6-gemini-video-analyzer
What This Skill Does
The gemini-video-analyzer is an OpenClaw AI agent skill that uses Google's Gemini multimodal models to analyze video directly. Instead of relying on frame extraction or other pre-processing, the agent passes the video file to the model natively. The video is analyzed at 1 frame per second, which preserves temporal context so the model can follow motion, transitions, and audio-visual cues together. For screen recordings, instructional tutorials, or long-form meetings, the skill turns raw video into structured, actionable output without manual transcription or visual review.
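As a rough illustration of what 1 frame per second means for a given clip, the arithmetic can be sketched as follows. The function name sampled_timestamps is hypothetical, and the assumption that sampling starts at 0 seconds and is evenly spaced is ours; the skill and the Gemini API may sample differently.

```python
import math


def sampled_timestamps(duration_s: float, fps: float = 1.0) -> list[float]:
    """Return the timestamps (in seconds) of evenly spaced sample frames.

    Assumption: sampling starts at t=0 and is evenly spaced. This is an
    illustration of the arithmetic, not the model's actual sampler.
    """
    if duration_s < 0 or fps <= 0:
        raise ValueError("duration_s must be >= 0 and fps must be > 0")
    count = math.ceil(duration_s * fps)
    return [i / fps for i in range(count)]


# A 90-second screen recording sampled at the default rate.
stamps = sampled_timestamps(90)
print(len(stamps), "frames, first at", stamps[0], "s, last at", stamps[-1], "s")
```

Under these assumptions an hour-long tutorial yields 3,600 sampled frames, which is useful when estimating how much visual detail (for example, briefly shown error messages) a single pass can capture.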
Installation
To integrate this skill into your environment, use the OpenClaw command-line interface. Ensure you have your GOOGLE_AI_API_KEY ready, as it is required for authentication with Google's Files API. Run the following command in your terminal:
clawhub install openclaw/skills/skills/aiwithabidi/a6-gemini-video-analyzer
Once installed, add your GOOGLE_AI_API_KEY to your .env file so the skill's scripts can authenticate with the Gemini API.
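A minimal sketch of how a script might pick up the key, assuming a simple KEY=value .env format and that a real environment variable takes precedence over the file. The helper names load_env_file and get_api_key are hypothetical and are not taken from the skill's own scripts.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(env_path: str = ".env") -> str:
    """Return GOOGLE_AI_API_KEY from the environment, else from the .env file."""
    key = os.environ.get("GOOGLE_AI_API_KEY") or load_env_file(env_path).get(
        "GOOGLE_AI_API_KEY"
    )
    if not key:
        raise RuntimeError("GOOGLE_AI_API_KEY is not set in the environment or .env")
    return key
```

Failing early with a clear error when the key is missing is usually easier to debug than an authentication failure from the API later on.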
Use Cases
The skill fits a range of professional and personal workflows. Use it for:
- Technical Documentation: Automatically extract text and UI elements from software walkthroughs to create written guides.
- Quality Assurance: Submit screen recordings of bugs to let the AI describe exactly what went wrong and identify failure points.
- Content Summarization: Process long meeting recordings to generate concise minutes, key discussion points, and action items.
- Media Comparison: Analyze multiple video files simultaneously to highlight differences, trends, or stylistic variations.
Example Prompts
- "Analyze this screen recording and extract all the error messages shown in the terminal window."
- "Summarize the key takeaways from this hour-long tutorial video and list the steps in bullet points."
- "Compare these two videos and tell me which one displays the more efficient workflow for setting up a database."
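To make the prompts above more concrete, here is a sketch of how a prompt and one or more already-uploaded videos could be combined into a single request body for the public Gemini REST generateContent endpoint. This is an assumption about the underlying call, not the skill's actual implementation: build_request is a hypothetical helper, the file URIs are placeholders, and nothing is sent over the network.

```python
import json

API_BASE = "https://generativelanguage.googleapis.com/v1beta"


def build_request(
    prompt: str,
    file_uris: list[str],
    model: str = "gemini-2.5-flash",
    mime_type: str = "video/mp4",
) -> tuple[str, dict]:
    """Return (url, json_body) for a generateContent call.

    Each uploaded video becomes one file_data part; the text prompt goes last.
    """
    if not file_uris:
        raise ValueError("at least one uploaded video URI is required")
    parts = [
        {"file_data": {"mime_type": mime_type, "file_uri": uri}} for uri in file_uris
    ]
    parts.append({"text": prompt})
    url = f"{API_BASE}/models/{model}:generateContent"
    return url, {"contents": [{"parts": parts}]}


# Placeholder URIs: real values are returned by the Files API after upload.
url, body = build_request(
    "Compare these two videos and tell me which one displays the more "
    "efficient workflow for setting up a database.",
    [f"{API_BASE}/files/EXAMPLE_A", f"{API_BASE}/files/EXAMPLE_B"],
)
print(url)
print(json.dumps(body, indent=2))
```

Passing several file parts in one request is what would let the model compare videos side by side rather than analyzing each in isolation.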
Tips & Limitations
- File Limits: Ensure your video files are under 2GB. Supported formats include MP4, MOV, WebM, and more.
- Storage: Files uploaded via the Gemini Files API are temporary and will be automatically purged after 48 hours for your privacy.
- Performance: Use gemini-2.5-flash for most tasks to benefit from speed and cost-efficiency. If you require deep reasoning for complex visual tasks, use the --model gemini-2.5-pro flag.
- Audio: The model excels at understanding audio context, so ensure your source videos have clear audio tracks for the best results.
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-aiwithabidi-a6-gemini-video-analyzer": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-read, external-api
Related Skills
freshsales
Freshsales CRM integration — manage contacts, leads, deals, accounts, tasks, and sales sequences via the Freshsales API. Track deal pipelines, automate lead assignments, log activities, and generate sales reports. Built for AI agents — Python stdlib only, no dependencies. Use for sales CRM, contact management, deal tracking, pipeline reporting, and sales automation.
agent-memory
Full AI agent memory stack — Mem0 unified memory engine with vector search (Qdrant) and knowledge graph (Neo4j), plus SQLite for structured data. Complete setup script and tools. Give your OpenClaw agent a real brain with semantic recall, entity relationships, and structured storage.
neon
Neon serverless Postgres — manage projects, branches, databases, roles, endpoints, and compute via the Neon API. Create database branches for development, manage connection endpoints, scale compute, and monitor usage. Built for AI agents — Python stdlib only, zero dependencies. Use for serverless Postgres, database branching, database management, development workflows, and cloud database automation.
onepassword
1Password Connect — vaults, items, secrets management for server-side applications.