youtube-transcript-pipeline-lite
Run a lightweight YouTube transcript workflow: transcription, attribution cleanup, translation, and packaging with minimal tooling. Use it for repeatable transcript handoff tasks when you need a concise, auditable process rather than custom automation.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/bluebirdback/youtube-transcript-pipeline-lite
What This Skill Does
The youtube-transcript-pipeline-lite skill is designed as a streamlined, high-fidelity workflow for extracting, cleaning, and packaging YouTube video transcripts. Unlike complex, automated video-to-text pipelines that can be prone to hallucination or timestamp drift, this skill enforces a conservative, manual-first approach to speaker attribution and translation. It focuses on the essential steps: transcribing with word-level precision, applying lightweight speaker corrections, performing accurate line-by-line translations, and organizing the resulting data into a standard project structure. It is built for tasks where auditability is more critical than raw speed, ensuring that every timestamp and attribution change is accounted for.
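The conservative, audited approach to speaker correction described above can be illustrated with a minimal Python sketch. It is not the skill's actual implementation. It assumes transcript lines shaped like "[HH:MM:SS] Speaker: text" (the colon separator is an assumption), and the Correction type and apply_corrections function are hypothetical names introduced only for this illustration. Each fix is listed explicitly by line number, is applied only when the current label matches what the editor expected, and is recorded in an audit log either way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """One explicit, human-reviewed speaker fix (hypothetical structure)."""
    line_number: int   # 1-based line number in the transcript
    old_speaker: str   # label expected to be on that line right now
    new_speaker: str   # label to put in its place
    reason: str        # short justification kept for the audit trail


def apply_corrections(lines, corrections):
    """Apply explicit speaker fixes and return (new_lines, audit_log).

    Assumes each line looks like "[HH:MM:SS] Speaker: text".
    Timestamps and spoken text are never modified.
    """
    new_lines = list(lines)
    audit_log = []
    for c in corrections:
        index = c.line_number - 1
        if not 0 <= index < len(new_lines):
            audit_log.append(f"SKIPPED line {c.line_number}: out of range")
            continue
        timestamp, sep, rest = new_lines[index].partition("] ")
        speaker, colon, text = rest.partition(":")
        # Refuse to relabel unless the line carries the expected label.
        if not sep or not colon or speaker.strip() != c.old_speaker:
            audit_log.append(
                f"SKIPPED line {c.line_number}: expected speaker {c.old_speaker!r}"
            )
            continue
        new_lines[index] = f"{timestamp}] {c.new_speaker}:{text}"
        audit_log.append(
            f"CHANGED line {c.line_number}: "
            f"{c.old_speaker} -> {c.new_speaker} ({c.reason})"
        )
    return new_lines, audit_log
```

Listing corrections explicitly, rather than relabeling by pattern, keeps the change set small and reviewable; the audit log can be stored alongside the transcript in the handoff package.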
Installation
To integrate this skill into your OpenClaw environment, execute the following command in your terminal:
clawhub install openclaw/skills/skills/bluebirdback/youtube-transcript-pipeline-lite
Ensure your current working directory has appropriate write permissions, as the skill will generate folders for transcripts and artifacts upon execution.
Use Cases
- Journalistic Interviews: Perfect for cleaning up multi-speaker interviews where speaker tags occasionally misidentify the interviewer as the subject.
- Educational Content Localization: Translate technical video lectures into multiple languages while strictly maintaining the original timestamp/speaker cadence for pedagogical consistency.
- Content Archiving: Standardize inconsistent transcript formats from different sources into a clean, auditable, and packaged set of assets ready for distribution or research use.
Example Prompts
- "Generate a transcript for https://youtube.com/watch?v=example, clean up the speaker labels for the interviewer, and package it."
- "Take the transcript in artifacts/raw_transcript.txt and provide a Spanish translation while keeping the [HH:MM:SS] Speaker format identical."
- "Package the current transcript and the translated version into a folder structure, ensuring a MANIFEST.txt file is generated for tracking."
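When a translation must keep the [HH:MM:SS] Speaker format identical, a quick line-by-line check can catch drift before packaging. The following is a minimal Python sketch, not part of the skill itself. It assumes each line has the shape "[HH:MM:SS] Speaker: text" (the colon after the speaker label is an assumption), and parse_line and find_prefix_mismatches are hypothetical helper names.

```python
import re
from itertools import zip_longest

# Assumed line shape: "[HH:MM:SS] Speaker: text".
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s+([^:]+):\s*(.*)$")


def parse_line(line):
    """Return (timestamp, speaker, text), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    return match.groups() if match else None


def find_prefix_mismatches(original_lines, translated_lines):
    """Return 1-based line numbers where timestamp or speaker differ.

    Lines that are blank in both files are ignored. A line that is missing
    or unparseable on either side is reported as a mismatch.
    """
    mismatches = []
    pairs = zip_longest(original_lines, translated_lines, fillvalue="")
    for number, (orig, trans) in enumerate(pairs, start=1):
        if not orig.strip() and not trans.strip():
            continue
        parsed_orig = parse_line(orig)
        parsed_trans = parse_line(trans)
        if (
            parsed_orig is None
            or parsed_trans is None
            or parsed_orig[:2] != parsed_trans[:2]
        ):
            mismatches.append(number)
    return mismatches


original = [
    "[00:00:05] Host: Welcome to the show.",
    "[00:00:09] Guest: Thanks for having me.",
]
translated = [
    "[00:00:05] Host: Bienvenidos al programa.",
    "[00:00:10] Guest: Gracias por invitarme.",
]
print(find_prefix_mismatches(original, translated))  # prints [2]
```

Only the timestamp and speaker label are compared, so the translated text is free to differ in length and wording.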
Tips & Limitations
- Conservative Cleanup: The skill is designed to avoid aggressive relabeling. Only fix clear, obvious misattributions to maintain the integrity of the original source.
- Timestamp Integrity: Leave timestamps as generated. Edit one manually only when alignment cannot be achieved otherwise, because any manual change breaks the audit chain.
- Scalability: This is a "lite" tool. For large, multi-hour video datasets, consider splitting the work into segments to keep memory use manageable.
- Documentation: Always review the generated MANIFEST.txt to ensure your handoff package is complete before transferring data to final stakeholders.
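Reviewing a manifest is easier when it can be checked mechanically. The sketch below shows one way a MANIFEST.txt could be generated and verified in Python. The layout used here (SHA-256 digest, size in bytes, and relative path, separated by two spaces) is an assumption for illustration only; the skill's actual manifest format is not documented in this listing, and build_manifest and verify_manifest are hypothetical names.

```python
import hashlib
from pathlib import Path


def build_manifest(package_dir, manifest_name="MANIFEST.txt"):
    """Write one "digest  size  relative/path" line per file in the package."""
    root = Path(package_dir)
    lines = []
    for path in sorted(root.rglob("*")):
        # Skip directories and any file carrying the manifest's own name.
        if not path.is_file() or path.name == manifest_name:
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        rel = path.relative_to(root).as_posix()
        lines.append(f"{digest}  {path.stat().st_size}  {rel}")
    manifest_path = root / manifest_name
    manifest_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest_path


def verify_manifest(package_dir, manifest_name="MANIFEST.txt"):
    """Return relative paths that are missing or whose hash no longer matches."""
    root = Path(package_dir)
    problems = []
    manifest_text = (root / manifest_name).read_text(encoding="utf-8")
    for line in manifest_text.splitlines():
        if not line.strip():
            continue
        digest, _size, rel = line.split("  ", 2)
        target = root / rel
        if (
            not target.is_file()
            or hashlib.sha256(target.read_bytes()).hexdigest() != digest
        ):
            problems.append(rel)
    return problems
```

Running verify_manifest on the package directory just before handoff returns an empty list when every listed file is present and unchanged.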
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-bluebirdback-youtube-transcript-pipeline-lite": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-write, file-read, external-api
Related Skills
claude-to-free
Migrate OpenClaw from Claude subscription OAuth to a free or cheap model provider (OpenRouter, Gemini, Ollama). Use when the user says Claude stopped working, gets an auth error, mentions the Anthropic April 2026 subscription ban, or asks to switch models without paying Anthropic more.
clawhub-publish-doctor
Diagnose and mitigate ClawHub/ClawDHUB publish failures (auth, browser-login, missing dependencies, pending security-scan visibility errors, and wrong profile/skill URLs). Use when publishing skills to ClawHub fails, inspect reports temporary errors, or you need a safer publish+verify workflow with retries.
exec-clawhub-publish-doctor
Diagnose and mitigate exec-related tooling failures around ClawHub publishing and GitHub CLI queries (auth, browser-login, missing dependencies, pending security-scan visibility errors, wrong profile/skill URLs, and gh JSON-field mismatch errors like Unknown JSON field). Use when publishing skills to ClawHub fails, inspect reports temporary errors, or GitHub CLI search commands fail due to field schema differences.
claw-history
Provide a chronological history of all actions the agent has taken from the beginning (birth) until now. Use when the user asks for full lifetime timeline/accountability, "from birth until now," "everything you've done so far," "full action log," or equivalent chronological-history requests.
youtube-transcript-pipeline
Generate, clean, correct, translate, and package YouTube interview transcripts with speaker-attributed timestamps into reusable deliverables. Use for workflows involving Deepgram transcription, diarization correction, bilingual output, and structured folder packaging for handoff.