Official | Verified | media | Safety 4/5

video2txt

Transcribes local video or audio files into an SRT subtitle file and a plain-text TXT file

What This Skill Does

The video2txt skill is a local utility that transcribes video and audio files into text. It uses the faster-whisper library to extract speech from a media file and writes two outputs: an SRT subtitle file with timestamps and a plain TXT text file. It is optimized for Chinese and automatically converts its output to Simplified Chinese. Model size, beam size, and hardware acceleration (CPU or CUDA) are adjustable, so you can trade speed against transcription accuracy.
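
The two output formats can be illustrated with a minimal sketch. It assumes transcribed segments arrive as (start, end, text) tuples with times in seconds; the function names are hypothetical and are not taken from the skill's source.

```python
from typing import Iterable, List, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, text)


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues: List[str] = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(cues)


def to_txt(segments: Iterable[Segment]) -> str:
    """Render segments as plain text, one segment per line."""
    return "\n".join(text.strip() for _, _, text in segments) + "\n"
```

The SRT file keeps the timing needed by video players, while the TXT file drops it so the transcript can be read or searched as ordinary text.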

Installation

To install this skill, run clawhub install openclaw/skills/skills/chentx1243/maple-video2txt in your terminal. The skill requires Python 3.11 or 3.12; install its dependencies with pip install -r requirements.txt. The first run downloads the model to the ./models directory, so it needs an active network connection and sufficient disk space. ffmpeg and ffprobe must also be installed and available on your system PATH.
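
Before the first run, the prerequisites above can be checked with a short script. This is a sketch that uses only the Python standard library; the helper names are hypothetical and the skill itself may perform its own checks differently.

```python
import shutil
import sys
from typing import List


def python_supported(version=sys.version_info) -> bool:
    """True for Python 3.11 or 3.12, the versions this listing names."""
    return tuple(version[:2]) in {(3, 11), (3, 12)}


def missing_tools(tools=("ffmpeg", "ffprobe"), which=shutil.which) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    if not python_supported():
        major, minor = sys.version_info[:2]
        print(f"Python {major}.{minor} is outside the 3.11-3.12 range")
    for tool in missing_tools():
        print(f"{tool} not found on PATH")
```

If the script prints nothing, the interpreter version and both binaries match what the installation notes ask for.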

Use Cases

This skill is aimed at creators, researchers, and office professionals. Typical use cases include:

  • Meeting Transcription: Convert recorded business meetings or video conferences into searchable text documents.
  • Content Creation: Generate accurate SRT files for YouTube or social media videos to improve accessibility and SEO.
  • Educational Research: Extract lecture content from video materials for note-taking and quick review.
  • Archiving: Convert video archives into text logs for easy indexing and retrieval.

Example Prompts

  1. "Convert my meeting recording at D:\recordings\project_kickoff.mp4 into a transcript and subtitle file."
  2. "Use the 'small' whisper model to transcribe D:\videos\interview.mp4 and save the output in the C:\transcripts folder."
  3. "Transcribe my video D:\lecture.mp4 using the base model and make sure it processes on the CPU."

Tips & Limitations

  • Efficiency: Always use background: true when invoking the tool to prevent terminal popups from interrupting your workflow.
  • Monitoring: The script reports progress at every 10% so you can follow long transcriptions.
  • Performance: For faster results on a compatible NVIDIA GPU, set --device cuda and, if your hardware supports it, --compute-type float16.
  • Dependencies: This skill relies on external binaries (ffmpeg and ffprobe). If transcription fails to start, verify that both tools are on your PATH.
  • Language: The skill is optimized for Chinese but supports other languages; results vary with the chosen model size.
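
The device and progress tips can be sketched directly against the faster-whisper API. This is an illustration, not the skill's actual script: the helper names, the "small" default model, and the "int8" compute type on CPU are assumptions, and a 100% mark is printed only if the last segment ends at the file's full duration.

```python
from typing import Dict, Iterable, Iterator


def choose_settings(use_gpu: bool) -> Dict[str, str]:
    """Pick device and compute type; "int8" on CPU is this sketch's assumption."""
    if use_gpu:
        return {"device": "cuda", "compute_type": "float16"}
    return {"device": "cpu", "compute_type": "int8"}


def progress_steps(segment_ends: Iterable[float], duration: float) -> Iterator[int]:
    """Yield 10, 20, ... 100 as the transcribed time crosses each percentage."""
    next_mark = 10
    for end in segment_ends:
        percent = 100.0 * end / duration if duration > 0 else 100.0
        while next_mark <= 100 and percent >= next_mark:
            yield next_mark
            next_mark += 10


def transcribe(path: str, model_size: str = "small", use_gpu: bool = False,
               beam_size: int = 5):
    """Transcribe one file and print progress at each 10% step."""
    # Imported lazily so the helpers above work without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, download_root="./models",
                         **choose_settings(use_gpu))
    segments, info = model.transcribe(path, beam_size=beam_size)
    results = []

    def ends():
        # segments is a lazy generator; decoding happens as it is consumed.
        for segment in segments:
            results.append((segment.start, segment.end, segment.text))
            yield segment.end

    for mark in progress_steps(ends(), info.duration):
        print(f"{mark}% done")
    return results
```

Progress is derived from each segment's end time relative to the total duration, which is one simple way to report steps without knowing the number of segments in advance.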

Metadata

Stars: 3840
Views: 1
Updated: 2026-04-06

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-chentx1243-maple-video2txt": {
      "enabled": true,
      "auto_update": true
    }
  }
}
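
If clawhub.json already lists other plugins, the entry has to be merged rather than pasted over the file. The following sketch does that with the standard json module; it assumes the file sits in the current directory and has the "plugins" layout shown above, and the function names are hypothetical.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-chentx1243-maple-video2txt"


def enable_plugin(config: dict, plugin_id: str = PLUGIN_ID) -> dict:
    """Return a copy of config with the plugin entry added under "plugins"."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[plugin_id] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


def update_file(path: str = "clawhub.json") -> None:
    """Read the config if it exists, add the entry, and write it back."""
    file = Path(path)
    config = json.loads(file.read_text(encoding="utf-8")) if file.exists() else {}
    file.write_text(json.dumps(enable_plugin(config), indent=2) + "\n",
                    encoding="utf-8")
```

Existing plugin entries and any other top-level keys are preserved; only the video2txt entry is added or overwritten.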

Tags (AI)

#transcription #whisper #video-to-text #subtitles #productivity

Safety Score: 4/5

Flags: file-write, file-read, network-access, code-execution