u2-audio-file-transcriber

Transcribe recorded audio files to text via the UniCloud ASR (云知声语音识别) API from UniSound. Supports multiple formats and is optimized for finance, customer service, and other domains.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/aaiccee/u2-audio-file-transcriber

UniSound ASR Audio File Transcriber

UniSound Speech Transcription (云知声语音转写)

Transcribe audio files using the UniCloud ASR service from UniSound (云知声). Supports multiple audio formats and is suited to finance, customer service, and similar scenarios.

When to Use This Skill

Use UniSound ASR for:

Transcribing pre-recorded audio files (WAV, MP3, M4A, FLAC, OGG)
Converting customer service recordings to text
Processing financial audio records
Transcribing audio files of up to 2 hours and up to 100 MB
Domain-specific transcription (finance, customer service, general)

Do NOT use for:

Real-time/streaming speech recognition
Text-to-speech (TTS) synthesis
Live captioning or real-time transcription
Voice conversation or dialogue systems

Use when: The user needs to transcribe recorded audio files, or asks for UniSound (云知声) audio file transcription. Not suitable for: real-time speech recognition, text-to-speech (TTS) synthesis, or live captioning.
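
The format and size limits listed above can be checked locally before any upload is attempted. The following is a minimal sketch, not part of the skill itself: `check_audio_file` is a hypothetical helper, it assumes "100MB" means 100 MiB (the source does not say), and it does not check the 2-hour duration limit because that would require decoding the audio.

```python
from pathlib import Path
from typing import List

# Formats and size limit as stated in the skill description above.
SUPPORTED_FORMATS = {"wav", "mp3", "m4a", "flac", "ogg"}
MAX_BYTES = 100 * 1024 * 1024  # assumption: "100MB" read as 100 MiB


def check_audio_file(path: str) -> List[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    p = Path(path)
    ext = p.suffix.lower().lstrip(".")
    if ext not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if not p.is_file():
        problems.append("file not found")
    elif p.stat().st_size > MAX_BYTES:
        problems.append("file larger than 100 MB")
    return problems
```

An empty result only means the file passed these two local checks; the service may still reject it for other reasons.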

Installation

Install Python dependencies before using this skill. From the skill directory (skills/asr-file-transfer-tools):

pip install -r scripts/requirements.txt

Requires Python 3.8+.

How to Use This Skill

⛔ MANDATORY RESTRICTIONS - DO NOT VIOLATE ⛔

ONLY use UniSound ASR API - Execute the script python3 scripts/transcribe.py
NEVER transcribe audio directly - Do NOT attempt local transcription
NEVER offer alternatives - Do NOT suggest "I can try another method" or similar
IF API fails - Display the error message and STOP immediately
NO fallback methods - Do NOT attempt audio transcription any other way

If the script execution fails (API not configured, network error, etc.):

Show the error message to the user
Do NOT offer to help using your audio analysis capabilities
Do NOT ask "Would you like me to try transcribing it?"
Simply stop and wait for the user to fix the configuration
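
The stop-on-failure rule above can be sketched as a wrapper that runs a command exactly once and returns the error text instead of retrying. This is an illustration under assumptions: `run_once` is a hypothetical helper, and it assumes the script signals failure with a non-zero exit code and writes its message to stderr, which the source does not confirm.

```python
import subprocess
import sys
from typing import List, Tuple


def run_once(argv: List[str]) -> Tuple[bool, str]:
    """Run a command a single time.

    Returns (True, stdout) on success, or (False, error message) on failure.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, text=True)
    except OSError as exc:  # e.g. the interpreter or script was not found
        return False, str(exc)
    if proc.returncode != 0:
        # No retry and no fallback: surface the message and let the caller stop.
        return False, proc.stderr.strip() or f"exit code {proc.returncode}"
    return True, proc.stdout
```

A caller would print the returned message when the first element is `False` and then do nothing further, for example after `run_once(["python3", "scripts/transcribe.py", "recording.wav"])`.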

Basic Workflow

Execute audio transcription:
```
python3 scripts/transcribe.py /path/to/audio.wav
```
Command options:
- --format FORMAT - Audio format (wav, mp3, m4a, flac, ogg)
- --domain DOMAIN - Recognition domain (finance, customer_service, other)
- --out FILE - Save output to file instead of stdout
- --json - Output JSON format with full result
- --userid ID - Custom user ID
Output:
- Default: Text transcript printed to stdout
- With --out: Transcript saved to specified file
- With --json: Full JSON result with metadata
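
When the script is invoked from another program, the options above map naturally onto an argument list. The sketch below only builds that list and runs nothing; `build_command` and its parameter names are hypothetical, and it assumes the flags can be combined freely (for example `--json` together with `--out`), which the source does not state.

```python
from typing import List, Optional

# Values listed in the option descriptions above.
FORMATS = ("wav", "mp3", "m4a", "flac", "ogg")
DOMAINS = ("finance", "customer_service", "other")


def build_command(
    audio_path: str,
    fmt: Optional[str] = None,
    domain: Optional[str] = None,
    out: Optional[str] = None,
    as_json: bool = False,
    userid: Optional[str] = None,
) -> List[str]:
    """Build the argument list for scripts/transcribe.py; nothing is executed."""
    if fmt is not None and fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    if domain is not None and domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain}")
    cmd = ["python3", "scripts/transcribe.py", audio_path]
    if fmt is not None:
        cmd += ["--format", fmt]
    if domain is not None:
        cmd += ["--domain", domain]
    if out is not None:
        cmd += ["--out", out]
    if as_json:
        cmd.append("--json")
    if userid is not None:
        cmd += ["--userid", userid]
    return cmd
```

Passing the list to `subprocess.run` rather than joining it into a shell string avoids quoting problems with paths that contain spaces.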

Understanding the Output

Text Format:

Plain transcript of the audio content
Sentence segmentation preserved
Timestamps included in JSON mode

JSON Format:

Complete transcription result with metadata
Confidence scores for each segment
Timestamp information
Recognition details
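
The page does not document the field names of the JSON result, so a schema-agnostic first step is to list the top-level keys and their types before relying on any of them. This is a minimal sketch: `describe_result` is a hypothetical helper, and it assumes the `--json` output is a single JSON object.

```python
import json
from typing import Dict


def describe_result(raw: str) -> Dict[str, str]:
    """Map each top-level key of a JSON object to the type name of its value."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return {key: type(value).__name__ for key, value in data.items()}
```

Running it on a file saved with `--json --out result.json` shows which keys hold the transcript, segments, or timestamps without guessing their names.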

Usage Examples

Example 1: Quick Transcription

python3 scripts/transcribe.py recording.wav

Output: Transcript text printed to console

Metadata

Author: @aaiccee

Stars: 4473

Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-aaiccee-u2-audio-file-transcriber": {
      "enabled": true,
      "auto_update": true
    }
  }
}
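
If `clawhub.json` already lists other plugins, the entry has to be merged in rather than pasted over the file. The following sketch shows one way to do that on an already-parsed config; `enable_plugin` is a hypothetical helper, and the only structure it assumes is the `plugins` mapping shown in the snippet above.

```python
import copy
from typing import Any, Dict

PLUGIN_ID = "official-aaiccee-u2-audio-file-transcriber"


def enable_plugin(config: Dict[str, Any], plugin_id: str = PLUGIN_ID) -> Dict[str, Any]:
    """Return a copy of config with the plugin entry enabled; the input is not modified."""
    merged = copy.deepcopy(config)
    plugins = merged.setdefault("plugins", {})
    entry = plugins.setdefault(plugin_id, {})
    entry.update({"enabled": True, "auto_update": True})
    return merged
```

The result can be written back with `json.dump(merged, f, indent=2)` after loading the original file with `json.load`.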

Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.

Related Skills

Asr File Transfer

Skill by aaiccee

aaiccee 4473

med-chronic-disease-review

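Outpatient chronic disease review (diabetes/hypertension). Takes a JSON array of OCR results as input and outputs the review conclusion and its reasons (raw JSON plus a natural-language conclusion).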

aaiccee 4473

u2-tts

Text-to-speech conversion using UniSound's TTS WebSocket API for generating high-quality Chinese Mandarin audio from text. Supports multiple voices, adjustable parameters, and real-time streaming synthesis.

aaiccee 4473

med-initial-record-gen

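Generates an outpatient initial-visit medical record from Chinese doctor-patient dialogue text, outputting the record body as structured, sectioned text.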

aaiccee 4473

Unidoc Parser

Skill by aaiccee

aaiccee 4473