bilibili-transcript
Transcribe Bilibili videos to text with high accuracy using the Whisper medium model. Use when the user provides a Bilibili video URL (BVxxxxx) and wants to: (1) extract the complete audio content as text, (2) get a detailed summary of the video content, or (3) save the transcript as a formatted TXT file instead of posting long text to Discord. The skill automatically uses CC subtitles when they are available and otherwise falls back to the Whisper medium model with GPU acceleration. Output is saved to the 'Bilibili transcript' folder by default and includes video metadata, a summary section, and the full transcript in Simplified Chinese.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/54lynnn/bilibili-transcript
What This Skill Does
The Bilibili Transcript skill lets OpenClaw agents extract, transcribe, and summarize content from Bilibili videos. It works as a tiered pipeline that prefers the most reliable text source available: it first looks for human-provided closed captions (CC), then for the platform's own AI-generated subtitles across nine supported languages, and only then falls back to a local Whisper medium model running with GPU acceleration. As a result, videos without any existing subtitles can still be converted to text. The output is a structured file containing video metadata (author, title, duration, date), a dedicated summary section, and the complete, cleaned-up transcript in Simplified Chinese.
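The tier ordering described above can be pictured as a simple "first source that returns text wins" loop. The sketch below is only an illustration of that idea, not the skill's actual code: the three fetch functions are hypothetical stand-ins that return canned values, and a real implementation would query Bilibili and run Whisper instead.

```python
from typing import Callable, Optional, Sequence, Tuple

# A source takes a video ID and returns transcript text, or None if unavailable.
Source = Callable[[str], Optional[str]]


def first_available(video_id: str, sources: Sequence[Tuple[str, Source]]) -> Tuple[str, str]:
    """Try each (name, source) pair in order and return the first one that yields text."""
    for name, fetch in sources:
        text = fetch(video_id)
        if text:  # None and empty strings both count as "not available"
            return name, text
    raise LookupError(f"no transcript source succeeded for {video_id}")


# Hypothetical stand-ins for the three tiers (assumptions for illustration only).
def fetch_cc(video_id: str) -> Optional[str]:
    return None  # pretend the uploader provided no CC track


def fetch_ai_subtitles(video_id: str) -> Optional[str]:
    return None  # pretend no platform AI subtitles exist either


def run_whisper(video_id: str) -> Optional[str]:
    return "(transcript produced by local Whisper medium)"


TIERS = [("cc", fetch_cc), ("ai_subtitle", fetch_ai_subtitles), ("whisper", run_whisper)]
```

Calling `first_available("BV1xX4y1L7tV", TIERS)` with these stand-ins falls through to the Whisper tier. Keeping the tiers in an ordered list makes the priority explicit and lets the expensive GPU step run only when the cheaper sources come back empty.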
Installation
To integrate this skill into your OpenClaw environment, use the following command in your terminal:
clawhub install openclaw/skills/skills/54lynnn/bilibili-transcript
Ensure your system meets the requirements, including FFmpeg for audio processing, yt-dlp for media extraction, and the Whisper framework. For the best experience, configure WSL Chromium to allow the script to access your Bilibili session cookies, which enables the extraction of member-only or age-restricted content.
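Before the first run, it can help to confirm that the required command-line tools are actually on the PATH. The following is a minimal sketch using only the Python standard library; the executable names `ffmpeg`, `yt-dlp`, and `whisper` are assumptions about how the tools are installed and may differ on your system.

```python
import shutil
from typing import Iterable, List

# Assumed executable names; adjust them to match the local installation.
REQUIRED_TOOLS = ("ffmpeg", "yt-dlp", "whisper")


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

This only checks that the executables exist. It does not verify versions, GPU support, or browser cookie access.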
Use Cases
- Academic Research: Rapidly extract data from long-form educational Bilibili lectures for review and study.
- Content Creation: Convert video scripts into text documents for article drafting or social media repurposing.
- Language Learning: Use the multi-language AI subtitle detection to study content in Japanese, Korean, English, or other supported languages.
- Offline Archiving: Maintain a text-searchable database of your favorite video content organized in your local file system.
Example Prompts
- "Can you transcribe this Bilibili video: https://www.bilibili.com/video/BV1xX4y1L7tV? I need a detailed summary and the full text saved to my transcript folder."
- "Please get the transcript for this video (BV1mD4y1k7eP) and make sure the output is in Simplified Chinese."
- "Summarize the content of this Bilibili video link and save the full text to a TXT file for my research notes."
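The prompts above supply the video either as a full URL or as a bare BV ID, so a handler needs to normalize both forms. The sketch below shows one way to do that with a regular expression. It assumes a BV ID is "BV" followed by 10 alphanumeric characters, which matches the IDs in the example prompts; the helper names are hypothetical.

```python
import re
from typing import Optional

# Assumption: "BV" followed by exactly 10 alphanumeric characters.
BV_PATTERN = re.compile(r"BV[0-9A-Za-z]{10}")


def extract_bvid(text: str) -> Optional[str]:
    """Return the first BV ID found in a prompt or URL, or None if there is none."""
    match = BV_PATTERN.search(text)
    return match.group(0) if match else None


def canonical_url(bvid: str) -> str:
    """Build a video page URL in the same form as the example prompts."""
    return f"https://www.bilibili.com/video/{bvid}"
```

For instance, `extract_bvid("Please get the transcript for this video (BV1mD4y1k7eP)")` returns `"BV1mD4y1k7eP"`, which `canonical_url` turns back into a full link.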
Tips & Limitations
- Cookie Authentication: Always ensure you are logged into Bilibili within your WSL browser; otherwise, the tool may fail to access member-only videos.
- Performance: While the Whisper medium model is highly accurate, it is resource-intensive. Ensure your machine has adequate VRAM (12GB+ recommended) and enough disk space for audio processing.
- Language: The tool is optimized for Simplified Chinese output. Ensure OpenCC is installed if you need specific character set conversions from original Traditional Chinese sources.
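To make the cookie and Whisper tips above concrete, the sketch below assembles the kind of `yt-dlp` and `whisper` command lines a pipeline like this might run. It is an assumption about how the steps could be wired together, not the skill's actual invocation. The function names, output template, and the `chromium` browser choice are illustrative, and the functions only build argument lists without executing anything.

```python
from pathlib import Path
from typing import List


def ytdlp_audio_command(url: str, out_dir: str, browser: str = "chromium") -> List[str]:
    """Build a yt-dlp call that reuses browser cookies and extracts audio as WAV."""
    return [
        "yt-dlp",
        "--cookies-from-browser", browser,  # read the logged-in session from the browser
        "-x", "--audio-format", "wav",      # extract audio only; conversion requires FFmpeg
        "-o", str(Path(out_dir) / "%(id)s.%(ext)s"),
        url,
    ]


def whisper_command(audio_path: str, out_dir: str) -> List[str]:
    """Build a Whisper CLI call for the medium model on the GPU with TXT output."""
    return [
        "whisper", str(audio_path),
        "--model", "medium",
        "--language", "zh",
        "--device", "cuda",
        "--output_format", "txt",
        "--output_dir", str(out_dir),
    ]
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed. Building the command as a list rather than a shell string avoids quoting problems with the space in a folder name such as 'Bilibili transcript'.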
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-54lynnn-bilibili-transcript": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: file-write, file-read, network-access, code-execution