Official Verified communication Safety 4/5

aliyun-asr

A pure Aliyun ASR skill for voice message transcription that supports multiple channels, including Feishu

Why use this skill?

Convert voice messages to text automatically with the Aliyun ASR skill for OpenClaw. The skill is high-performance, secure, and compatible with Feishu, Telegram, and WhatsApp.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/jixsonwang/aliyun-asr

What This Skill Does

The Aliyun ASR skill is a lightweight voice-to-text integration for OpenClaw that bridges raw audio input and AI processing. Unlike multimedia tools that combine audio generation with transcription, it does one thing: convert spoken messages into text. It plugs into OpenClaw's message pipeline, intercepts voice files from platforms such as Feishu, Telegram, and WhatsApp, transcribes them, and passes the resulting text to your connected AI model. Transcription runs on Alibaba Cloud's NLS (Natural Language Service) infrastructure rather than on the local machine, which is intended to keep accuracy high even for noisy recordings.

Installation

Installation is streamlined through the ClawHub system. Execute the following command in your terminal to fetch the latest version:

clawhub install openclaw/skills/skills/jixsonwang/aliyun-asr

Once installed, create the configuration file at /root/.openclaw/aliyun-asr-config.json and add your Alibaba Cloud RAM credentials (AccessKey ID, AccessKey Secret, and AppKey). After saving, restrict the file's permissions with chmod 600 /root/.openclaw/aliyun-asr-config.json so that other users cannot read your cloud credentials. No further manual integration steps are required; OpenClaw automatically detects the skill and begins processing audio inputs.
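
The listing does not show the configuration file's schema, so the following is only a minimal sketch of creating the file with owner-only permissions. The key names access_key_id, access_key_secret, and app_key and the helper write_config are assumptions for illustration; check the skill's own documentation for the real field names.

```python
import json
import os
import stat


def write_config(path, access_key_id, access_key_secret, app_key):
    """Write credentials to `path`, readable and writable by the owner only."""
    # NOTE: these key names are hypothetical; the real schema is not shown in the listing.
    config = {
        "access_key_id": access_key_id,
        "access_key_secret": access_key_secret,
        "app_key": app_key,
    }
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    # Create the file with mode 0600 from the start so it is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
    # Same effect as `chmod 600`; also tightens a file that already existed.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Calling write_config("/root/.openclaw/aliyun-asr-config.json", ...) with your own credentials would produce the file described above. Opening the file with mode 0600 at creation time avoids the short window in which a freshly written file could be readable by other users before chmod runs.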

Use Cases

This skill is perfect for users who prefer dictation over typing in fast-paced professional or personal environments. Common use cases include:

  • Transcribing long, complex verbal instructions sent via Feishu into actionable text for AI analysis.
  • Converting voice notes from mobile devices into documentation for project management tools.
  • Enabling accessibility for users who prefer verbal communication while on the move.
  • Automating the processing of voice messages into searchable text logs within OpenClaw archives.

Example Prompts

  1. [User sends a voice message] -> "Could you please summarize the key action items from the morning stand-up meeting mentioned in this audio?"
  2. [User sends a voice message] -> "Draft a professional email reply to this client request based on the tone of this voice note."
  3. [User sends a voice message] -> "Translate the spoken content in this message to English and provide a bulleted list of the technical requirements discussed."

Tips & Limitations

  • Security: Always use a RAM sub-account with limited permissions rather than your root account. The skill is designed not to store data on your local machine, which helps keep your interactions private.
  • Compatibility: This skill supports MP3, WAV, OGG, FLAC, AMR, and OPUS formats. Check that your communication platform delivers voice messages in one of these formats.
  • Limitations: This skill does not perform TTS (Text-to-Speech). It will not generate audio files. If you require an AI that speaks back, you will need to install a separate voice synthesis skill.
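
As a rough illustration of the compatibility note, a file name can be checked against the six listed formats before it is sent for transcription. This sketch only inspects the file extension, not the actual codec, and is_supported_audio is a hypothetical helper, not part of the skill.

```python
from pathlib import Path

# Formats named in the Compatibility tip above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".flac", ".amr", ".opus"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, is_supported_audio("note.OGG") is true, while a file such as "clip.m4a" would need to be converted first.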

Metadata

Stars: 1947
Views: 0
Updated: 2026-03-04

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-jixsonwang-aliyun-asr": {
      "enabled": true,
      "auto_update": true
    }
  }
}
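
If clawhub.json already lists other plugins, pasting the snippet over it would drop them. A minimal sketch of merging the entry instead, assuming the file is plain JSON with a top-level "plugins" object as shown above (enable_plugin is a hypothetical helper, not a ClawHub command):

```python
import copy


def enable_plugin(config, plugin_id, auto_update=True):
    """Return a copy of `config` with `plugin_id` enabled; other plugins are kept."""
    merged = copy.deepcopy(config)
    plugins = merged.setdefault("plugins", {})
    plugins[plugin_id] = {"enabled": True, "auto_update": auto_update}
    return merged
```

Load the existing file with json.load, pass the result through enable_plugin(config, "official-jixsonwang-aliyun-asr"), and write it back with json.dump.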

Tags(AI)

#asr #speech-to-text #transcription #aliyun #voice-processing
Safety Score: 4/5

Flags: network-access, file-read, external-api