Official Verified media Safety 4/5

gemini-yt-video-transcript

Create a verbatim transcript for a YouTube URL using Google Gemini (speaker labels, paragraph breaks; no time codes). Use when the user asks to transcribe a YouTube video or wants a clean transcript (no timestamps).

Why use this skill?

Use the gemini-yt-video-transcript skill to create clean, verbatim transcripts of YouTube videos. It suits researchers, students, and content creators who want readable text without timestamps.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/odrobnik/gemini-yt-video-transcript

What This Skill Does

The gemini-yt-video-transcript skill is a utility for OpenClaw users who need a verbatim text version of a YouTube video. Given a video URL, it uses Google Gemini to transcribe the spoken content into a clean, human-readable transcript. Instead of output cluttered with timestamps and metadata, the skill labels each speaker and separates the text into paragraphs, so the final document is easy to review, analyze, or archive for future research.
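
The listing does not show how the skill calls Gemini, so the following is only a minimal sketch of what such a request might look like, assuming the google-genai Python SDK and an API key available in the environment. The prompt wording, the function names build_prompt and transcribe, and the default model name are all assumptions made for illustration; they are not taken from the skill's own youtube_transcript.py.

```python
def build_prompt() -> str:
    """Assemble transcription instructions (hypothetical wording, not the skill's)."""
    return (
        "Transcribe this video verbatim. "
        "Label each speaker at the start of their turn. "
        "Separate turns and topics with paragraph breaks. "
        "Do not include time codes."
    )


def transcribe(url: str, model: str = "gemini-2.5-flash") -> str:
    """Ask Gemini for a transcript of the video at `url`.

    The model name is an assumption; substitute whichever Gemini model
    your account supports.
    """
    # Imported lazily so that build_prompt() works without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client()  # picks up the API key from the environment
    response = client.models.generate_content(
        model=model,
        contents=types.Content(
            parts=[
                # Pass the YouTube URL as file data so Gemini processes the video.
                types.Part(file_data=types.FileData(file_uri=url)),
                types.Part(text=build_prompt()),
            ]
        ),
    )
    return response.text
```

Keeping the prompt in its own function makes the formatting rules (speaker labels, paragraph breaks, no time codes) easy to adjust without touching the request code.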

Installation

To integrate this skill into your environment, use the OpenClaw command-line interface. Run the following command in your terminal:

clawhub install openclaw/skills/skills/odrobnik/gemini-yt-video-transcript

Make sure a Python environment with the required dependencies is set up in your workspace so that the bundled youtube_transcript.py script can run locally.

Use Cases

This skill is aimed at researchers, journalists, and students who need to capture knowledge from video lectures, technical tutorials, or long-form interviews without the distraction of time codes. It can turn hours of video into searchable text. Content creators can also use it to draft blog posts or summaries from their own uploaded videos without paying for manual transcription.

Example Prompts

  1. "Please transcribe the YouTube video at https://www.youtube.com/watch?v=dQw4w9WgXcQ and save the file to my workspace."
  2. "I need a clean, verbatim transcript of this lecture: https://www.youtube.com/watch?v=example-id. Remove all timestamps and just give me the speaker dialogue."
  3. "Run the gemini-yt-video-transcript skill on this URL and make sure the output is written to the out/ folder."
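
The prompts above pass a YouTube URL and mention an out/ folder. As a rough sketch of how a URL could be checked and mapped to an output file before any transcription runs, the helpers below use only the Python standard library. The names extract_video_id and transcript_path, and the "<video id>.txt" naming scheme, are hypothetical; the skill may name its output files differently.

```python
from pathlib import Path
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> str:
    """Return the video ID from a watch-style or youtu.be URL (hypothetical helper)."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        video_id = parsed.path.lstrip("/")
    elif host in {"youtube.com", "www.youtube.com", "m.youtube.com"}:
        # Watch links carry the ID in the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        raise ValueError(f"not a YouTube URL: {url}")
    if not video_id:
        raise ValueError(f"no video ID found in: {url}")
    return video_id


def transcript_path(url: str, out_dir: str = "out") -> Path:
    """Build an output path inside out_dir; the file naming is an assumption."""
    return Path(out_dir) / f"{extract_video_id(url)}.txt"
```

For example, transcript_path("https://www.youtube.com/watch?v=dQw4w9WgXcQ") points at a file named dQw4w9WgXcQ.txt inside out/.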

Tips & Limitations

For best results, choose videos with clear, audible speech. Google Gemini is generally accurate, but heavy background music or poor recording quality can reduce the fidelity of the transcript. The tool is designed for spoken-word content; it will not capture visual elements or on-screen text. For very long videos (over an hour), check the out/ folder periodically, since processing time varies with video length and complexity. For panel discussions and other multi-speaker videos, always verify the speaker labels to confirm that each statement is attributed correctly.
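
One way to support the manual review suggested above is a small check that lists the speaker labels found in a transcript and flags any time codes that slipped through. This is a sketch under stated assumptions: it assumes each speaker turn starts a line with "Name:" and that time codes look like 1:23 or 01:02:03, optionally in brackets. The function name review_transcript is hypothetical, and the actual output format of the skill may differ.

```python
import re

# Matches time codes such as 1:23, 01:23, or 1:02:03, optionally in brackets.
TIMECODE = re.compile(r"[\[(]?\b\d{1,2}:\d{2}(?::\d{2})?\b[\])]?")

# Assumes a speaker turn begins a line with a capitalized name followed by ": ".
SPEAKER = re.compile(r"^([A-Z][\w .'-]{0,40}):\s", re.MULTILINE)


def review_transcript(text: str) -> dict:
    """Collect stray time codes and the distinct speaker labels for manual review."""
    return {
        "timecodes": TIMECODE.findall(text),
        "speakers": sorted(set(SPEAKER.findall(text))),
    }


sample = (
    "Host: Welcome back.\n\n"
    "Guest: Thanks for having me.\n\n"
    "[00:42] Host: Next question."
)
report = review_transcript(sample)
```

A spoken mention of a clock time (for example "at 12:30") would also be flagged, so treat the result as a list of lines to look at, not as a list of errors.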

Metadata

Author: @odrobnik
Stars: 1287
Views: 1
Updated: 2026-02-22

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-odrobnik-gemini-yt-video-transcript": {
      "enabled": true,
      "auto_update": true
    }
  }
}
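
If clawhub.json already lists other plugins, pasting the snippet by hand risks overwriting them. The sketch below merges the entry shown above into an existing configuration using only the standard library. It assumes the file has the top-level "plugins" object shown in the snippet; the helper names enable_plugin and update_config_file are hypothetical and are not part of any clawhub tooling.

```python
import json
from pathlib import Path

# Plugin key and settings copied from the snippet above.
PLUGIN_KEY = "official-odrobnik-gemini-yt-video-transcript"
PLUGIN_SETTINGS = {"enabled": True, "auto_update": True}


def enable_plugin(config: dict) -> dict:
    """Return a copy of config with the plugin entry added or replaced."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[PLUGIN_KEY] = dict(PLUGIN_SETTINGS)
    merged["plugins"] = plugins
    return merged


def update_config_file(path: str = "clawhub.json") -> None:
    """Read the config file if it exists, merge the plugin entry, and write it back."""
    config_path = Path(path)
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    config_path.write_text(
        json.dumps(enable_plugin(config), indent=2) + "\n", encoding="utf-8"
    )
```

Because enable_plugin copies the dictionaries it changes, entries for other plugins are carried over untouched and the caller's original config object is not modified.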

Tags (AI)

#youtube #transcription #gemini #ai #video-to-text
Safety Score: 4/5

Flags: network-access, file-write, external-api, code-execution