Official Verified media Safety 4/5

gemini-video-analyzer

Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe speech, identify objects and actions. Use when: (1) User sends a video file and wants it analyzed, (2) Video summarization or description needed, (3) Extracting text, UI elements, or information from screen recordings, (4) Answering questions about video content, (5) Comparing multiple videos, (6) Analyzing tutorials, demos, or walkthroughs.

What This Skill Does

The gemini-video-analyzer is an OpenClaw AI agent skill that uses Google's Gemini multimodal models to analyze video directly. Rather than requiring you to extract frames or otherwise pre-process the file, the skill lets the agent process video files natively. The video is analyzed at 1 frame per second, which preserves temporal context, so the model can follow motion, transitions, and audio-visual cues together. For screen recordings, instructional tutorials, or long-form meetings, the skill turns raw video into structured, actionable output without manual transcription or visual review.
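To give a feel for what 1 frame per second means in practice, the sketch below estimates how many frames are sampled for a video of a given length. The function name `sampled_frames` is hypothetical and is not part of the skill; it only illustrates the arithmetic.

```python
def sampled_frames(duration_seconds: float, fps: float = 1.0) -> int:
    """Approximate number of frames sampled from a video of the given length.

    Assumes a fixed sampling rate (1 frame per second by default, as stated
    in the skill description). Hypothetical helper, not part of the skill.
    """
    if duration_seconds < 0 or fps <= 0:
        raise ValueError("duration must be >= 0 and fps must be > 0")
    return int(duration_seconds * fps)


# An hour-long tutorial yields roughly 3600 sampled frames.
print(sampled_frames(60 * 60))
```

One consequence of this rate is that an on-screen event lasting well under a second may fall between two samples, which is worth keeping in mind for fast-moving screen recordings.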

Installation

To integrate this skill into your environment, use the OpenClaw command-line interface. Ensure you have your GOOGLE_AI_API_KEY ready, as it is required for authentication with Google's Files API. Run the following command in your terminal:

clawhub install openclaw/skills/skills/aiwithabidi/a6-gemini-video-analyzer

Once installed, add your API key to your .env file so the skill's scripts can authenticate with the Gemini API.
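How the skill itself reads the .env file is not shown on this page. As a rough illustration only, the sketch below shows one way a script could pick up GOOGLE_AI_API_KEY from simple KEY=VALUE lines, with a real environment variable taking precedence. The helpers `parse_env` and `get_api_key` are hypothetical names, and the file format is assumed to be plain KEY=VALUE.

```python
import os
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(env_path: str = ".env") -> str:
    """Return GOOGLE_AI_API_KEY from the environment, else from the .env file."""
    key = os.environ.get("GOOGLE_AI_API_KEY")
    path = Path(env_path)
    if not key and path.is_file():
        key = parse_env(path.read_text(encoding="utf-8")).get("GOOGLE_AI_API_KEY")
    if not key:
        raise RuntimeError("GOOGLE_AI_API_KEY is not set")
    return key
```

Failing early with a clear error when the key is missing tends to be easier to debug than an authentication failure partway through an upload.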

Use Cases

The skill covers a range of professional and personal tasks. Use it for:

  • Technical Documentation: Automatically extract text and UI elements from software walkthroughs to create written guides.
  • Quality Assurance: Submit screen recordings of bugs to let the AI describe exactly what went wrong and identify failure points.
  • Content Summarization: Process long meeting recordings to generate concise minutes, key discussion points, and action items.
  • Media Comparison: Analyze multiple video files simultaneously to highlight differences, trends, or stylistic variations.

Example Prompts

  1. "Analyze this screen recording and extract all the error messages shown in the terminal window."
  2. "Summarize the key takeaways from this hour-long tutorial video and list the steps in bullet points."
  3. "Compare these two videos and tell me which one displays the more efficient workflow for setting up a database."

Tips & Limitations

  • File Limits: Keep each video file under 2 GB. Supported formats include MP4, MOV, and WebM, among others.
  • Storage: Files uploaded via the Gemini Files API are temporary and are deleted automatically after 48 hours.
  • Performance: Use gemini-2.5-flash for most tasks to benefit from speed and cost-efficiency. If you require deep reasoning for complex visual tasks, use the --model gemini-2.5-pro flag.
  • Audio: The model excels at understanding audio context, so ensure your source videos have clear audio tracks for the best results.
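The file limits above can be checked before uploading. The following is a minimal sketch, assuming the 2 GB limit means 2 × 10^9 bytes (the page does not say whether decimal or binary units are meant) and treating only the three formats named above as known. `check_video` is a hypothetical helper, not part of the skill.

```python
from pathlib import Path

# Assumption: "2 GB" read conservatively as 2 * 10^9 bytes.
MAX_BYTES = 2 * 1000**3
# Only the formats named on this page; the real list is longer ("and more").
NAMED_EXTENSIONS = {".mp4", ".mov", ".webm"}


def check_video(name: str, size_bytes: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    if size_bytes >= MAX_BYTES:
        problems.append(f"{name}: {size_bytes} bytes is not under the 2 GB limit")
    ext = Path(name).suffix.lower()
    if ext not in NAMED_EXTENSIONS:
        problems.append(f"{name}: extension {ext!r} is not MP4, MOV, or WebM; it may still be supported")
    return problems


def check_video_file(path: str) -> list[str]:
    """Convenience wrapper that reads the size from disk."""
    p = Path(path)
    return check_video(p.name, p.stat().st_size)
```

Separating the pure check from the file-system lookup keeps the rule easy to test and lets the unknown-extension case be treated as a warning rather than a hard failure.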

Metadata

Stars: 4473
Views: 8
Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-aiwithabidi-a6-gemini-video-analyzer": {
      "enabled": true,
      "auto_update": true
    }
  }
}
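If clawhub.json already lists other plugins, pasting the snippet by hand risks overwriting them. The sketch below shows one way to merge the entry into an existing configuration while leaving other entries intact. `enable_plugin` and the `some-other-plugin` entry are hypothetical and used only for illustration; the plugin ID and its two settings are taken from the snippet above.

```python
import json

PLUGIN_ID = "official-aiwithabidi-a6-gemini-video-analyzer"


def enable_plugin(config: dict) -> dict:
    """Return a copy of config with the video analyzer plugin enabled."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))  # copy so the input is not mutated
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


# Hypothetical existing configuration with one unrelated plugin.
existing = {"plugins": {"some-other-plugin": {"enabled": True}}}
print(json.dumps(enable_plugin(existing), indent=2))
```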

Tags (AI)

#video-analysis #ai-vision #multimodal #automation #gemini
Safety Score: 4/5

Flags: file-read, external-api

Related Skills

freshsales

Freshsales CRM integration — manage contacts, leads, deals, accounts, tasks, and sales sequences via the Freshsales API. Track deal pipelines, automate lead assignments, log activities, and generate sales reports. Built for AI agents — Python stdlib only, no dependencies. Use for sales CRM, contact management, deal tracking, pipeline reporting, and sales automation.

aiwithabidi 4473

agent-memory

Full AI agent memory stack — Mem0 unified memory engine with vector search (Qdrant) and knowledge graph (Neo4j), plus SQLite for structured data. Complete setup script and tools. Give your OpenClaw agent a real brain with semantic recall, entity relationships, and structured storage.

aiwithabidi 4473

neon

Neon serverless Postgres — manage projects, branches, databases, roles, endpoints, and compute via the Neon API. Create database branches for development, manage connection endpoints, scale compute, and monitor usage. Built for AI agents — Python stdlib only, zero dependencies. Use for serverless Postgres, database branching, database management, development workflows, and cloud database automation.

aiwithabidi 4473

onepassword

1Password Connect — vaults, items, secrets management for server-side applications.

aiwithabidi 4473