
alicloud-ai-audio-asr

Transcribe non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`). Use when converting recorded audio files to text, generating transcripts with timestamps, or documenting DashScope/OpenAI-compatible ASR request and response fields.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/cinience/alicloud-ai-audio-asr
What This Skill Does

The alicloud-ai-audio-asr skill connects the OpenClaw AI agent to the Qwen ASR (Automatic Speech Recognition) models in Alibaba Cloud Model Studio and converts recorded speech into punctuated text. It covers two non-realtime workflows: synchronous transcription for short audio clips, and asynchronous jobs for long-form recordings such as interviews, lectures, or lengthy meetings. The skill targets the qwen3-asr-flash, qwen-audio-asr, and qwen3-asr-flash-filetrans models, so transcription runs directly inside your agentic workflow.
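
The asynchronous workflow for long recordings amounts to submitting a job and then polling until it finishes. The sketch below shows only that polling loop. It is not the skill's actual implementation: `fetch_status` is a hypothetical callable standing in for whatever status request the skill makes, and the status strings `SUCCEEDED` and `FAILED` are assumptions.

```python
import time
from typing import Callable, Dict


def wait_for_transcript(
    fetch_status: Callable[[], Dict],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll a (hypothetical) status callable until the job finishes.

    The "status" key and its values are assumptions made for this sketch.
    """
    waited = 0.0
    while True:
        job = fetch_status()
        status = job.get("status")
        if status == "SUCCEEDED":
            return job
        if status == "FAILED":
            raise RuntimeError(f"transcription job failed: {job}")
        if waited >= timeout_s:
            raise TimeoutError(f"job still {status!r} after {waited:.0f}s")
        # Any other status is treated as "still running": wait and retry.
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays.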

Installation

To install the skill, run clawhub install openclaw/skills/skills/cinience/alicloud-ai-audio-asr. Set DASHSCOPE_API_KEY in your environment or add it to your ~/.alibabacloud/credentials file. The skill uses standard Python libraries, so no heavy dependencies are required beyond the core OpenClaw environment. To confirm that the transcription script is present and compiles, run: mkdir -p output/alicloud-ai-audio-asr && python -m py_compile skills/ai/audio/alicloud-ai-audio-asr/scripts/transcribe_audio.py. This step checks syntax only; it does not call the API or validate your key.
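
Because the compile check does not look at credentials, a small pre-flight check can report where a key would come from. This is a minimal sketch: `find_credential_source` is a hypothetical helper, not part of the skill, and it only tests whether the credentials file exists, since the file's internal format is not described here.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_credential_source(
    env: Optional[Mapping[str, str]] = None,
    credentials_file: Optional[Path] = None,
) -> Optional[str]:
    """Return where a DashScope key appears to be configured, or None.

    Hypothetical helper: the file is only checked for existence and is
    never parsed, because its format is an unknown in this sketch.
    """
    env = os.environ if env is None else env
    if env.get("DASHSCOPE_API_KEY"):
        return "environment"
    path = credentials_file or Path.home() / ".alibabacloud" / "credentials"
    if path.is_file():
        return "credentials file"
    return None


if __name__ == "__main__":
    source = find_credential_source()
    print(f"credential source: {source or 'not found'}")
```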

Use Cases

  • Meeting Intelligence: Automatically transcribe hour-long meeting recordings to generate searchable text logs for your team.
  • Content Creation: Convert voice memos or podcast raw files into draft articles or blog posts.
  • Accessibility: Generate transcripts for audio files to assist hearing-impaired users.
  • Data Analysis: Extract keywords and topics from customer service audio records to identify support trends.
  • Historical Archiving: Process large libraries of recorded interviews into standardized text formats.

Example Prompts

  1. "Transcribe the interview audio located at ./audios/interview_01.mp3 and summarize the key technical requirements discussed."
  2. "Please process the meeting recording https://example.com/daily_scrum.wav using the flash transcription model and provide a transcript with sentence-level timestamps."
  3. "Transcribe the attached voice note using the async long-file worker and save the raw API JSON response to the output directory."

Tips & Limitations

  • Choosing the Right Model: Use qwen3-asr-flash for fast, synchronous responses on short snippets. For anything longer than a few minutes, use qwen3-asr-flash-filetrans, which processes the file as an asynchronous job.
  • Language Support: The models handle multiple languages, but when you know the input language, pass it in the language_hints parameter to improve accuracy.
  • Environment Security: Keep your DASHSCOPE_API_KEY private and never hardcode it into scripts shared in public repositories.
  • Output Management: Write results under output/alicloud-ai-audio-asr/ and use a distinct file name per recording, so the workspace stays organized and existing transcripts are not overwritten.
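
The model choice and output handling above can be sketched in a few lines. Both helpers are hypothetical and not part of the skill. The 180-second cutoff is an assumption standing in for "a few minutes", and the `-1`, `-2` suffix scheme is just one way to avoid overwriting an existing transcript.

```python
from pathlib import Path

# Assumed cutoff for "short" audio; the skill does not state an exact limit.
SHORT_CLIP_LIMIT_S = 180


def pick_model(duration_s: float) -> str:
    """Choose the synchronous model for short clips, the async one otherwise."""
    if duration_s <= SHORT_CLIP_LIMIT_S:
        return "qwen3-asr-flash"
    return "qwen3-asr-flash-filetrans"


def unique_output_path(
    audio_name: str,
    out_dir: Path = Path("output/alicloud-ai-audio-asr"),
) -> Path:
    """Return a JSON path in out_dir that does not clobber an existing file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(audio_name).stem
    candidate = out_dir / f"{stem}.json"
    counter = 1
    while candidate.exists():
        # Append a numeric suffix until the name is free.
        candidate = out_dir / f"{stem}-{counter}.json"
        counter += 1
    return candidate
```

For example, `unique_output_path("interview_01.mp3")` yields `output/alicloud-ai-audio-asr/interview_01.json` on the first call, then `interview_01-1.json` once that file exists.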

Metadata

Author: @cinience
Stars: 3562
Views: 0
Updated: 2026-03-29

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-cinience-alicloud-ai-audio-asr": {
      "enabled": true,
      "auto_update": true
    }
  }
}
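
If clawhub.json already lists other plugins, pasting the snippet over the file would drop them. The sketch below merges the entry instead. `enable_plugin` is a hypothetical helper, and it assumes clawhub.json is plain JSON with a top-level "plugins" object, as in the snippet above.

```python
import json
from pathlib import Path

PLUGIN_ID = "official-cinience-alicloud-ai-audio-asr"


def enable_plugin(config_path: Path) -> dict:
    """Add or update the plugin entry while keeping other settings intact."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_ID] = {"enabled": True, "auto_update": True}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    enable_plugin(Path("clawhub.json"))
```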

Tags (AI)

#asr #transcription #qwen #audio-to-text #speech-recognition

Safety Score: 4/5

Flags: network-access, file-write, file-read, external-api