Official Verified media Safety 4/5

youtube-transcript-pipeline-lite

Run a lightweight YouTube transcript workflow: transcription, attribution cleanup, translation, and packaging with minimal tooling. Use it for repeatable transcript handoff tasks when you need a concise, auditable process rather than custom automation.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/bluebirdback/youtube-transcript-pipeline-lite

What This Skill Does

The youtube-transcript-pipeline-lite skill is a streamlined, high-fidelity workflow for extracting, cleaning, and packaging YouTube video transcripts. Unlike complex automated video-to-text pipelines, which can be prone to hallucination or timestamp drift, it takes a conservative, manual-first approach to speaker attribution and translation. It covers four steps: transcribing with word-level precision, applying lightweight speaker corrections, translating line by line, and organizing the results into a standard project structure. It is built for tasks where auditability matters more than raw speed, so that every timestamp and attribution change is accounted for.

Installation

To integrate this skill into your OpenClaw environment, run the following command in your terminal:

clawhub install openclaw/skills/skills/bluebirdback/youtube-transcript-pipeline-lite

Ensure your current working directory has appropriate write permissions, as the skill generates folders for transcripts and artifacts when it runs.
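The write-permission requirement can be checked before running the skill. The following is a minimal sketch in Python, not part of the skill itself; the function name `can_write_here` is hypothetical, and it only probes whether a file can be created in the given directory.

```python
import tempfile
from pathlib import Path


def can_write_here(directory: str = ".") -> bool:
    """Return True if a file can actually be created inside `directory`."""
    path = Path(directory)
    if not path.is_dir():
        return False
    try:
        # Creating (and automatically deleting) a temporary file is a direct
        # probe of write access to the directory.
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("writable" if can_write_here() else "not writable")
```

Running the script from the intended working directory prints whether that directory accepts new files.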

Use Cases

  • Journalistic Interviews: Perfect for cleaning up multi-speaker interviews where speaker tags occasionally misidentify the interviewer as the subject.
  • Educational Content Localization: Translate technical video lectures into multiple languages while strictly maintaining the original timestamp/speaker cadence for pedagogical consistency.
  • Content Archiving: Standardize inconsistent transcript formats from different sources into a clean, auditable, and packaged set of assets ready for distribution or research use.

Example Prompts

  1. "Generate a transcript for https://youtube.com/watch?v=example, clean up the speaker labels for the interviewer, and package it."
  2. "Take the transcript in artifacts/raw_transcript.txt and provide a Spanish translation while keeping the [HH:MM:SS] Speaker format identical."
  3. "Package the current transcript and the translated version into a folder structure, ensuring a MANIFEST.txt file is generated for tracking."
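The second prompt asks for a translation that keeps the [HH:MM:SS] Speaker prefix identical. A check along the following lines could confirm that after the fact. It is a sketch under assumptions: the exact line shape (a colon after the speaker name) is not specified by the skill, and the names `parse_line` and `prefix_mismatches` are hypothetical.

```python
import re

# Assumed line shape: "[HH:MM:SS] Speaker: text". The colon after the speaker
# name is an assumption; adjust the pattern to the real transcript format.
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s+([^:]+):\s*(.*)$")


def parse_line(line: str):
    """Split one transcript line into (timestamp, speaker, text), or None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    timestamp, speaker, text = match.groups()
    return timestamp, speaker.strip(), text


def prefix_mismatches(source_lines, translated_lines):
    """Return 1-based line numbers whose timestamp or speaker differ."""
    if len(source_lines) != len(translated_lines):
        # A length difference means the line-by-line mapping is already broken.
        raise ValueError("source and translation have different line counts")
    mismatches = []
    pairs = zip(source_lines, translated_lines)
    for number, (src, dst) in enumerate(pairs, start=1):
        if not src.strip() and not dst.strip():
            continue  # blank in both files: nothing to compare
        src_parsed, dst_parsed = parse_line(src), parse_line(dst)
        if src_parsed is None or dst_parsed is None:
            mismatches.append(number)  # one side does not fit the format
        elif src_parsed[:2] != dst_parsed[:2]:
            mismatches.append(number)  # timestamp or speaker changed
    return mismatches
```

An empty result means every translated line carries the same timestamp and speaker as its source line; any returned line number is worth reviewing by hand before packaging.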

Tips & Limitations

  • Conservative Cleanup: The skill is designed to avoid aggressive relabeling. Only fix clear, obvious misattributions to maintain the integrity of the original source.
  • Timestamp Integrity: Avoid altering timestamps manually unless alignment strictly requires it, because undocumented changes break the audit chain.
  • Scalability: This is a "lite" tool. For very large, multi-hour video datasets, consider splitting the work into segments to limit memory overhead.
  • Documentation: Always review the generated MANIFEST.txt to ensure your handoff package is complete before transferring data to final stakeholders.
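The first two tips (fix only clear misattributions, leave timestamps alone) can be illustrated with a small relabeling helper that records every change. This is a sketch, not the skill's own implementation: the "[HH:MM:SS] Speaker: text" line shape and the name `apply_speaker_fixes` are assumptions made for illustration.

```python
import re

# Assumed line shape: "[HH:MM:SS] Speaker: text" (the colon is an assumption).
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s+([^:]+):\s*(.*)$")


def apply_speaker_fixes(lines, fixes):
    """Relabel only the timestamps listed in `fixes`; leave all else untouched.

    `fixes` maps a timestamp string to the corrected speaker name.
    Returns (new_lines, audit_log). Timestamps are never modified.
    """
    new_lines, audit_log = [], []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            new_lines.append(line)  # unparseable lines pass through unchanged
            continue
        timestamp, speaker, text = match.groups()
        speaker = speaker.strip()
        corrected = fixes.get(timestamp)
        if corrected is None or corrected == speaker:
            new_lines.append(line)  # no fix requested for this line
            continue
        new_lines.append(f"[{timestamp}] {corrected}: {text}")
        audit_log.append(f"{timestamp}: {speaker} -> {corrected}")
    return new_lines, audit_log
```

Because fixes are keyed by timestamp, two lines sharing the same timestamp would both be relabeled; in that case a key that includes the line number would be safer. The audit log can be stored next to the transcript so each attribution change stays traceable.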

Metadata

Stars: 4473
Views: 0
Updated: 2026-05-01
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-bluebirdback-youtube-transcript-pipeline-lite": {
      "enabled": true,
      "auto_update": true
    }
  }
}
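If clawhub.json already contains other plugins, pasting the snippet by hand risks overwriting them. The sketch below merges the entry instead. It assumes only the structure shown above (a top-level "plugins" object keyed by plugin name); the function name `enable_plugin` is hypothetical and not part of any ClawHub tooling.

```python
import json
from pathlib import Path

PLUGIN_KEY = "official-bluebirdback-youtube-transcript-pipeline-lite"


def enable_plugin(config_path: str) -> dict:
    """Add the plugin entry to clawhub.json, preserving existing settings."""
    path = Path(config_path)
    if path.exists():
        config = json.loads(path.read_text(encoding="utf-8"))
    else:
        config = {}
    plugins = config.setdefault("plugins", {})
    # setdefault keeps any entry that has already been customised.
    plugins.setdefault(PLUGIN_KEY, {"enabled": True, "auto_update": True})
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `enable_plugin("clawhub.json")` leaves existing plugin entries untouched and adds this one only if it is missing.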

Tags (AI)

#transcription #youtube #translation #packaging #automation
Safety Score: 4/5

Flags: file-write, file-read, external-api

Related Skills

claude-to-free

Migrate OpenClaw from Claude subscription OAuth to a free or cheap model provider (OpenRouter, Gemini, Ollama). Use when the user says Claude stopped working, gets an auth error, mentions the Anthropic April 2026 subscription ban, or asks to switch models without paying Anthropic more.

bluebirdback 4473

clawhub-publish-doctor

Diagnose and mitigate ClawHub/ClawDHUB publish failures (auth, browser-login, missing dependencies, pending security-scan visibility errors, and wrong profile/skill URLs). Use when publishing skills to ClawHub fails, inspect reports temporary errors, or you need a safer publish+verify workflow with retries.

bluebirdback 4473

exec-clawhub-publish-doctor

Diagnose and mitigate exec-related tooling failures around ClawHub publishing and GitHub CLI queries (auth, browser-login, missing dependencies, pending security-scan visibility errors, wrong profile/skill URLs, and gh JSON-field mismatch errors like Unknown JSON field). Use when publishing skills to ClawHub fails, inspect reports temporary errors, or GitHub CLI search commands fail due to field schema differences.

bluebirdback 4473

claw-history

Provide a chronological history of all actions the agent has taken from the beginning (birth) until now. Use when the user asks for full lifetime timeline/accountability, "from birth until now," "everything you've done so far," "full action log," or equivalent chronological-history requests.

bluebirdback 4473

youtube-transcript-pipeline

Generate, clean, correct, translate, and package YouTube interview transcripts with speaker-attributed timestamps into reusable deliverables. Use for workflows involving Deepgram transcription, diarization correction, bilingual output, and structured folder packaging for handoff.

bluebirdback 1776