gemini-yt-video-transcript
Create a verbatim transcript for a YouTube URL using Google Gemini (speaker labels, paragraph breaks; no time codes). Use when the user asks to transcribe a YouTube video or wants a clean transcript (no timestamps).
Why use this skill?
Use the gemini-yt-video-transcript skill to create clean, verbatim transcripts from any YouTube video. Perfect for researchers, students, and content creators.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/odrobnik/gemini-yt-video-transcript
What This Skill Does
The gemini-yt-video-transcript skill is a utility for OpenClaw users who need a verbatim text version of a YouTube video. Using Google Gemini, the skill processes the audio of the video at the provided URL and turns it into a clean, human-readable transcript. Unlike tools whose output is cluttered with timestamps and metadata, this skill prioritizes readability: it labels each speaker and uses paragraph breaks, so the final document is easy to review, analyze, or archive for future research.
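As an illustration of the input side of this workflow, the following is a minimal sketch, not the skill's actual youtube_transcript.py. It validates a YouTube URL and assembles a prompt that states the transcript requirements described above. The function names `extract_video_id` and `build_prompt`, the accepted host names, and the prompt wording are all assumptions made for this example.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical helpers for illustration; the real script may work differently.
YOUTUBE_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")


def extract_video_id(url: str) -> str:
    """Return the video ID from a youtube.com/watch or youtu.be URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in YOUTUBE_HOSTS and parsed.path == "/watch":
        # Standard links carry the ID in the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        video_id = ""
    if not video_id:
        raise ValueError(f"Not a recognised YouTube video URL: {url!r}")
    return video_id


def build_prompt(url: str) -> str:
    """Build an (assumed) instruction text asking for a clean transcript."""
    video_id = extract_video_id(url)  # fails early on a bad URL
    return (
        f"Transcribe the YouTube video with ID {video_id} verbatim. "
        "Label each speaker, separate turns with paragraph breaks, "
        "and do not include time codes."
    )
```

Validating the URL before any model call keeps a malformed link from costing an API request.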
Installation
To integrate this skill into your environment, use the OpenClaw command-line interface. Run the following command in your terminal:
clawhub install openclaw/skills/skills/odrobnik/gemini-yt-video-transcript
The skill runs the bundled youtube_transcript.py script locally, so make sure Python and the script's dependencies are set up in your workspace.
Use Cases
This skill is ideal for researchers, journalists, and students who need to capture knowledge from video lectures, technical tutorials, or long-form interviews without the distraction of time codes. It is well suited to turning hours of video footage into searchable text. Content creators can also use it to quickly draft blog posts or summaries from their own uploaded videos without relying on manual transcription services.
Example Prompts
- "Please transcribe the YouTube video at https://www.youtube.com/watch?v=dQw4w9WgXcQ and save the file to my workspace."
- "I need a clean, verbatim transcript of this lecture: https://www.youtube.com/watch?v=example-id. Remove all timestamps and just give me the speaker dialogue."
- "Run the gemini-yt-video-transcript skill on this URL and make sure the output is written to the out/ folder."
Tips & Limitations
To get the best results, choose videos with clear, audible speech. Google Gemini is generally accurate, but heavy background music or poor recording quality can reduce the fidelity of the transcript. This tool is designed for spoken-word content; it will not capture visual elements or on-screen text. For very long videos (over an hour), processing time varies with video length and complexity, so consider checking the out/ folder periodically. For complex panel discussions, always verify the speaker labels to ensure accurate attribution.
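The review step suggested here can be partly automated. The sketch below scans a finished transcript for leftover time codes and counts how often each speaker label appears, which helps spot misattributed or misspelled labels. It assumes labels take the form "Name: text" at the start of a paragraph; the skill's real label format is not documented here, so adjust the pattern as needed. The function names are hypothetical.

```python
import re
from collections import Counter

# Matches mm:ss or hh:mm:ss style time codes. This can also flag
# ordinary clock times mentioned in speech, so treat hits as hints.
TIMESTAMP = re.compile(r"\b\d{1,2}:\d{2}(?::\d{2})?\b")

# Assumed label format: a capitalised name followed by a colon at line start.
SPEAKER = re.compile(r"^([A-Z][\w .'-]{0,40}):\s", re.MULTILINE)


def find_timestamps(text: str) -> list[str]:
    """Return every substring that looks like a time code."""
    return TIMESTAMP.findall(text)


def speaker_counts(text: str) -> Counter:
    """Count how many turns each speaker label introduces."""
    return Counter(match.group(1) for match in SPEAKER.finditer(text))


if __name__ == "__main__":
    sample = (
        "Host: Welcome back.\n\n"
        "Guest: Thanks, at 12:30 we started.\n\n"
        "Host: Great."
    )
    print(find_timestamps(sample))
    print(speaker_counts(sample))
```

A label that appears only once in a long panel discussion is often a sign of a misattributed turn and is worth checking against the video.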
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-odrobnik-gemini-yt-video-transcript": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: network-access, file-write, external-api, code-execution
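If clawhub.json already contains other plugins, pasting the fragment by hand risks overwriting them. The following is a minimal sketch of merging the entry programmatically. It assumes clawhub.json is a plain JSON file with a top-level "plugins" object, as the fragment in this section suggests; the helper name `enable_plugin` is hypothetical and not part of any ClawHub tooling.

```python
import json
from pathlib import Path

PLUGIN_KEY = "official-odrobnik-gemini-yt-video-transcript"


def enable_plugin(config_path: Path) -> dict:
    """Add or update the plugin entry, keeping any other settings intact."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}  # start from an empty config if the file is missing
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_KEY] = {"enabled": True, "auto_update": True}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `enable_plugin(Path("clawhub.json"))` from the directory that holds the file would update it in place; back up the file first if it is hand-maintained.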
Related Skills
elevenlabs
Text-to-speech, sound effects, music generation, voice management, and quota checks via the ElevenLabs API. Use when generating audio with ElevenLabs or managing voices.
tesla-fleet-api
Use when integrating with Tesla's official Fleet API to read vehicle/energy device data or issue remote commands (e.g. start HVAC preconditioning, wake vehicle, charge controls). Covers onboarding (developer app registration, regions/base URLs), OAuth token flows (third-party + partner tokens, refresh rotation), required domain/public-key hosting, and using Tesla's official vehicle-command/tesla-http-proxy for signed vehicle commands.
unifi
Monitor UniFi network infrastructure via the UniFi Site Manager API. Use to list hosts/sites/devices/APs and get high-level client/device counts.
codexmonitor
List/inspect/watch local OpenAI Codex sessions (CLI + VS Code) using the CodexMonitor Homebrew formula. Reads sessions from ~/.codex/sessions by default (or via CODEX_SESSIONS_DIR / CODEX_HOME overrides). Requires the cocoanetics/tap Homebrew tap.
printer
Print images and PDFs to any CUPS printer. PPD-aware: reads paper sizes, margins, resolution, and duplex at runtime. Use when the user wants to print files (images like PNG/JPG or PDFs) or query printer capabilities.