ASR File Transfer
Skill by aaiccee
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/aaiccee/asr-file-transfer
name: u2-audio-file-transcriber
description: "Transcribe audio files via the UniCloud ASR (云知声 speech recognition, recorded audio → text) API from UniSound. Supports multiple formats, optimized for finance, customer service, and other domains. Use when the user needs to transcribe recorded audio files, or asks for UniSound/云知声 audio file transcription. Do NOT use for real-time/streaming speech recognition, text-to-speech (TTS), or live captioning."
CRITICAL INSTRUCTION FOR AI AGENTS: YOU MUST follow these rules when helping users configure this skill:
- NEVER display or request actual API credentials in chat — credentials should only be set in environment variables or .env files
- DO NOT embed production credentials in documentation — guide users to obtain their own credentials from the service provider
- NEVER generate URLs from your training data — use ONLY the exact URLs written in this file
- ALWAYS verify the user has configured environment variables before suggesting script usage
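The environment-variable rule can be checked mechanically before the script is suggested. The following is a minimal sketch, not part of the skill: this file does not list the variable names the script reads, so `EXAMPLE_APP_KEY` and `EXAMPLE_SECRET` are placeholders, and `missing_env_vars` is a hypothetical helper. It reports names only and never prints a credential value.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_env_vars(
    required: Iterable[str],
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    # Placeholder names (assumption): replace with the names the script expects.
    required = ["EXAMPLE_APP_KEY", "EXAMPLE_SECRET"]
    missing = missing_env_vars(required)
    if missing:
        # Report names only; credential values are never printed.
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```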
UniSound ASR (云知声) Speech Transcription
Transcribe audio files using the UniCloud ASR service from UniSound. Supports multiple audio formats and suits finance, customer service, and similar scenarios.
Quick start
python3 {baseDir}/scripts/transcribe.py /path/to/audio.wav
Defaults:
- API endpoint: UAT environment
- Audio format: WAV
- Domain: other
- Output: stdout (transcript text)
Useful flags
# Save output to a file
python3 {baseDir}/scripts/transcribe.py audio.wav --out result.txt
# Output the full result as JSON
python3 {baseDir}/scripts/transcribe.py audio.wav --json --out result.json
# Specify the audio format
python3 {baseDir}/scripts/transcribe.py audio.mp3 --format mp3
# Specify the domain
python3 {baseDir}/scripts/transcribe.py audio.wav --domain finance
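When the script is driven from another Python program, the same flags can be assembled as an argument list. This is a sketch under assumptions: it uses only the flags shown above (`--out`, `--json`, `--format`, `--domain`), it assumes they can be combined freely and that the script exits non-zero on failure, and `build_command` and `run_transcription` are hypothetical helper names.

```python
import subprocess
import sys
from typing import List, Optional


def build_command(
    script: str,
    audio: str,
    out: Optional[str] = None,
    as_json: bool = False,
    audio_format: Optional[str] = None,
    domain: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list using only the documented flags."""
    cmd = [sys.executable, script, audio]
    if as_json:
        cmd.append("--json")
    if audio_format:
        cmd += ["--format", audio_format]
    if domain:
        cmd += ["--domain", domain]
    if out:
        cmd += ["--out", out]
    return cmd


def run_transcription(cmd: List[str]) -> str:
    """Run the command and return its stdout.

    Raises subprocess.CalledProcessError if the exit code is non-zero
    (assumption: the script signals failure through its exit code).
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Passing a list rather than a shell string avoids quoting problems with audio paths that contain spaces.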
How it works
The script calls the UniCloud ASR API in five steps:
- Initialize upload: get a task ID from the API
- Upload audio file: upload the audio file to the server
- Start transcription: submit the transcription task
- Poll for results: wait for the transcription to complete (typically 10-60 seconds)
- Return transcript: output the recognized text
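The polling step can be sketched independently of the API. This is a generic loop under stated assumptions, not the script's actual code: `fetch_status` is a hypothetical callable standing in for the real status request, and the `"done"` and `"failed"` state names, the 2-second interval, and the 120-second timeout are illustrative values chosen to sit above the typical 10-60 second completion time.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical status callable: returns (state, transcript_or_None).
StatusFn = Callable[[], Tuple[str, Optional[str]]]


def poll_for_transcript(
    fetch_status: StatusFn,
    interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until it reports "done" or "failed", or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        state, text = fetch_status()
        if state == "done":
            return text or ""
        if state == "failed":
            raise RuntimeError("transcription failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no result after {timeout} seconds")
        sleep(interval)


if __name__ == "__main__":
    # Simulated task that finishes on the third status check.
    replies = iter([("running", None), ("running", None), ("done", "hello")])
    print(poll_for_transcript(lambda: next(replies), interval=0.0))
```

Using `time.monotonic()` for the deadline keeps the timeout correct even if the system clock is adjusted while the task is running.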
Privacy: Audio files are uploaded directly to UniCloud servers. No data is sent to third-party services.
Dependencies
- Python 3.8+
- requests: pip install requests
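Both requirements can be verified before the first run. A minimal sketch, assuming nothing beyond the two dependencies listed above; `dependency_problems` is a hypothetical helper, and the install hint it prints assumes the pip package name matches the module name, which holds for `requests`.

```python
import importlib.util
import sys
from typing import List, Tuple


def dependency_problems(
    min_python: Tuple[int, int] = (3, 8),
    modules: Tuple[str, ...] = ("requests",),
) -> List[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append(
            "Python %d.%d+ required, found %d.%d"
            % (min_python + tuple(sys.version_info[:2]))
        )
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing module: {name} (pip install {name})")
    return problems


if __name__ == "__main__":
    for problem in dependency_problems():
        print(problem)
```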
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-aaiccee-asr-file-transfer": {
      "enabled": true,
      "auto_update": true
    }
  }
}
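If clawhub.json already holds other settings, the entry can be merged in rather than pasted over them. This is a sketch under assumptions: the location of clawhub.json is not stated here, so the path is a parameter, and `enable_plugin` is a hypothetical helper that writes the same two keys shown in the snippet.

```python
import json
from pathlib import Path


def enable_plugin(config_path: str, plugin_id: str) -> dict:
    """Add or update one plugin entry, keeping every other setting in the file."""
    path = Path(config_path)
    # Start from the existing config if there is one, else from an empty object.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    plugins = config.setdefault("plugins", {})
    plugins[plugin_id] = {"enabled": True, "auto_update": True}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    import tempfile

    # Demo against a throwaway directory; the real file location is an assumption.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "clawhub.json"
        print(enable_plugin(str(demo), "official-aaiccee-asr-file-transfer"))
```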