Official Verified media Safety 4/5

aimlapi-voice

Transcribe audio files (ogg, mp3, wav, etc.) using AIMLAPI. Use when the user provides audio messages or local audio files. Provides a reliable Python script with retries and polling.


What This Skill Does

The aimlapi-voice skill connects OpenClaw to the AIMLAPI speech-to-text service. Transcription there is asynchronous, so the skill wraps file queuing, MIME-type negotiation, and result polling in a single Python execution flow with retries. Using speech models such as Whisper, it converts audio input from sources like voice memos and interview recordings into machine-readable text transcripts with little configuration.
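As a rough illustration of the MIME-type step mentioned above, the sketch below maps a file extension to a content type before upload. The mapping table and the function name guess_audio_mime are hypothetical; the listing does not show how the skill's own script does this.

```python
import mimetypes
from pathlib import Path

# Illustrative mapping only (an assumption); the skill's real table is not
# shown in the listing.
AUDIO_MIME_TYPES = {
    ".ogg": "audio/ogg",
    ".mp3": "audio/mpeg",
    ".wav": "audio/wav",
}


def guess_audio_mime(path: str) -> str:
    """Return a MIME type for an audio file based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix in AUDIO_MIME_TYPES:
        return AUDIO_MIME_TYPES[suffix]
    # Fall back to the standard library's table, then to a generic type.
    guessed, _ = mimetypes.guess_type(path)
    return guessed or "application/octet-stream"


print(guess_audio_mime("client_interview.ogg"))
```

An explicit table comes first because the standard library's answer for some audio extensions can differ between platforms.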

Installation

To integrate this skill into your local environment, ensure you have the OpenClaw CLI tool installed. Run the following command in your terminal:

clawhub install openclaw/skills/skills/aimlapihello/aiml-voice

After installation, configure your authentication credentials. The skill reads the AIMLAPI_API_KEY environment variable. You can export it in your shell profile (e.g., export AIMLAPI_API_KEY="your_key_here") or supply a key file through the --apikey-file CLI argument when running the transcription task. Python 3 must be installed and available on your PATH.
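The two credential sources described above could be resolved roughly as follows. This is a minimal sketch, not the skill's actual code: the function name resolve_api_key, the plain-text key file format, and the precedence (key file before environment variable) are all assumptions.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    apikey_file: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the API key from a key file if given, else from the environment."""
    env = os.environ if env is None else env
    if apikey_file:
        # Assumption: the file holds the key as plain text on one line.
        key = Path(apikey_file).read_text(encoding="utf-8").strip()
    else:
        key = env.get("AIMLAPI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "No API key found: set AIMLAPI_API_KEY or pass a key file"
        )
    return key
```

Failing early with a clear message avoids sending an unauthenticated request and waiting on a rejection from the API.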

Use Cases

  • Transcription Automation: Convert hours of voice memos or meeting recordings into text for indexing and searchability.
  • Content Creation: Extract raw audio from video projects to generate initial screenplay drafts or accessibility transcripts.
  • Data Ingestion: Process user-submitted audio files from messaging platforms within an automated pipeline.
  • Research Support: Automate the transcribing of field interviews, reducing manual effort during qualitative research phases.

Example Prompts

  1. "Transcribe the voice message I just uploaded to the downloads folder using the medium whisper model."
  2. "Please process the audio recording from the board meeting and save the output to meeting_transcript.txt."
  3. "Convert the audio file 'client_interview.ogg' to text and output the result to my logs."

Tips & Limitations

Before starting a task, verify that your audio files are not corrupted and are in a supported format. The skill defaults to the #g1_whisper-medium model; advanced users can select other available models with the --model flag to trade speed against accuracy. The --max-wait parameter matters for long audio files: the default is 300 seconds, so for a long recording you may need to raise it to keep the skill from timing out before the API returns the result. Finally, keep your API keys secure and rotate them periodically.
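To make the role of --max-wait concrete, here is a minimal sketch of a polling loop with a deadline. It is not the skill's implementation: poll_until_done, the fetch_result callback, and the 2-second interval are hypothetical, and only the 300-second default comes from the text above.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    fetch_result: Callable[[], Optional[str]],
    max_wait: float = 300.0,
    interval: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_result until it returns a transcript or max_wait elapses.

    fetch_result is a stand-in for one status request to the API; it should
    return None while the job is still processing.
    """
    deadline = clock() + max_wait
    while True:
        result = fetch_result()
        if result is not None:
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(f"No result after {max_wait} seconds")
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
```

With a loop of this shape, a recording that takes longer than max_wait to process ends in a timeout even though the job may still finish on the server, which is why long files call for a larger limit.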

Metadata

Stars: 4473
Views: 0
Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-aimlapihello-aiml-voice": {
      "enabled": true,
      "auto_update": true
    }
  }
}
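If clawhub.json already lists other plugins, pasting the snippet wholesale would overwrite them. The sketch below shows one way to merge the entry into an existing configuration held as a dictionary; the helper enable_plugin and the sample "some-other-plugin" entry are hypothetical, and it assumes the file has the same "plugins" layout as the snippet above.

```python
import json
from typing import Any, Dict

PLUGIN_ID = "official-aimlapihello-aiml-voice"


def enable_plugin(config: Dict[str, Any], plugin_id: str = PLUGIN_ID) -> Dict[str, Any]:
    """Return a copy of config with the plugin enabled, keeping other entries."""
    # Round-trip through JSON to deep-copy JSON-compatible data.
    merged = json.loads(json.dumps(config))
    plugins = merged.setdefault("plugins", {})
    entry = plugins.setdefault(plugin_id, {})
    entry.update({"enabled": True, "auto_update": True})
    return merged


# Hypothetical existing configuration with one unrelated plugin.
existing = {"plugins": {"some-other-plugin": {"enabled": False}}}
print(json.dumps(enable_plugin(existing), indent=2))
```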

Tags (AI)

#transcription #audio #speech-to-text #whisper #voice-recognition
Safety Score: 4/5

Flags: file-read, file-write, external-api, code-execution