Official Verified utilities Safety 5/5

asr

Transcribe audio files to text using local speech recognition. Triggers on: "转录", "transcribe", "语音转文字", "ASR", "识别音频", "把这段音频转成文字".

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/0xfango/marswave-asr

What This Skill Does

The ASR (Automatic Speech Recognition) skill for OpenClaw transcribes audio files directly on your machine. It uses the coli CLI with local models such as SenseVoice or Whisper, so transcription needs no cloud API and the audio is processed locally. Supported languages include Chinese, English, Japanese, Korean, and Cantonese. An optional 'polish' mode uses AI to clean up the raw transcript by removing filler words, correcting punctuation, and improving readability.

Installation

To install this skill, run the following command in a terminal within the OpenClaw environment: clawhub install openclaw/skills/skills/0xfango/marswave-asr. The skill depends on the coli CLI, which is installed globally with npm install -g @marswave/coli, and on ffmpeg being available on your system PATH for the widest audio format compatibility.
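The two prerequisites named above can be checked before the first transcription. The following is a minimal sketch, assuming only that both tools expose executables named coli and ffmpeg on PATH; the helper name missing_tools is hypothetical and not part of the skill.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executables the listing says must be reachable on PATH.
REQUIRED_TOOLS = ("coli", "ffmpeg")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("coli and ffmpeg are both available.")
```

The lookup function is passed in as a parameter so the check can be exercised without either tool installed.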

Use Cases

This skill suits professionals, students, and content creators who regularly work with audio and need quick, accurate transcripts. Typical inputs are recorded meetings, lectures, voice memos, and interviews. It is particularly useful in multilingual settings, where SenseVoice can handle mixed language input and also recognize emotion in speech. It is not meant for text-to-speech synthesis or for audio post-production such as podcast editing; those tasks require other, specialized skills.

Example Prompts

"Transcribe this audio: ~/Downloads/meeting_recording.mp3"
"Convert this voice file to text and polish it for me: ./audio/interview.wav"
"I want to use the sensevoice model to transcribe this recording to text"

Tips & Limitations

Transcription runs locally, so once the model has been downloaded your audio data does not leave your machine. For the best accuracy, use the SenseVoice model, which is optimized for multi-language support and emotion recognition. The first run downloads the model weights (~60MB); this requires an internet connection and may take a moment. Use absolute audio file paths, or relative paths that resolve correctly from the project directory, to avoid 'file not found' errors.
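The path advice above can be applied mechanically before a file is handed to the skill. This is a minimal sketch using only the Python standard library; resolve_audio_path and its base_dir parameter are hypothetical names for illustration, and the project directory is assumed to be whatever directory you pass in (the current working directory by default).

```python
from pathlib import Path
from typing import Optional, Union

PathLike = Union[str, Path]


def resolve_audio_path(raw: PathLike, base_dir: Optional[PathLike] = None) -> Path:
    """Expand `~`, anchor relative paths at `base_dir`, and verify the file exists."""
    path = Path(raw).expanduser()
    if not path.is_absolute():
        # Relative paths are interpreted against the project directory.
        base = Path(base_dir).expanduser() if base_dir is not None else Path.cwd()
        path = base / path
    path = path.resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Audio file not found: {path}")
    return path
```

For example, resolve_audio_path("./audio/interview.wav", base_dir="/path/to/project") returns an absolute path, or raises FileNotFoundError before any transcription is attempted.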

Metadata

Author: @0xfango

Stars: 4473

Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-0xfango-marswave-asr": {
      "enabled": true,
      "auto_update": true
    }
  }
}
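If clawhub.json already lists other plugins, pasting the snippet over the file would drop them. The sketch below merges the entry instead. It assumes clawhub.json is plain JSON with a top-level "plugins" object, as the snippet suggests; the function name enable_plugin and the sample existing config are hypothetical, and the file's location is not stated in this listing, so the sketch operates on an already-loaded dictionary.

```python
import copy
import json

# Plugin identifier taken from the configuration snippet in this listing.
PLUGIN_ID = "official-0xfango-marswave-asr"


def enable_plugin(config: dict, plugin_id: str = PLUGIN_ID) -> dict:
    """Return a copy of `config` with `plugin_id` enabled; other entries are kept."""
    merged = copy.deepcopy(config)
    plugins = merged.setdefault("plugins", {})
    entry = plugins.setdefault(plugin_id, {})
    entry.update({"enabled": True, "auto_update": True})
    return merged


if __name__ == "__main__":
    # Hypothetical existing configuration with one unrelated plugin.
    existing = {"plugins": {"some-other-plugin": {"enabled": True}}}
    print(json.dumps(enable_plugin(existing), indent=2))
```

Working on a deep copy leaves the original dictionary unchanged, so the result can be inspected before anything is written back to disk.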

Tags(AI)

#asr #transcription #speech-to-text #offline-ai #productivity

Safety Score: 5/5

Flags: file-read, file-write, code-execution

Related Skills

explainer

Create explainer videos with narration and AI-generated visuals. Triggers on: "解说视频", "explainer video", "explain this as a video", "tutorial video", "introduce X (video)", "解释一下XX（视频形式）".

0xfango 4473

listenhub

Explain anything — turn ideas into podcasts, explainer videos, or voice narration. Use when the user wants to "make a podcast", "create an explainer video", "read this aloud", "generate an image", or share knowledge in audio/visual form. Supports: topic descriptions, YouTube links, article URLs, plain text, and image prompts.

0xfango 4473

image-gen

Generate AI images from text prompts. Triggers on: "生成图片", "画一张", "AI图", "generate image", "配图", "create picture", "draw", "visualize", "generate an image".

0xfango 4473

content-parser

Extract and parse content from URLs. Triggers on: user provides a URL to extract content from, another skill needs to parse source material, "parse this URL", "extract content", "解析链接", "提取内容".

0xfango 4473