aimlapi-voice
Transcribe audio files (ogg, mp3, wav, etc.) using AIMLAPI. Use when the user provides audio messages or local audio files. Provides a reliable Python script with retries and polling.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/aimlapihello/aiml-voice
What This Skill Does
The aimlapi-voice skill connects OpenClaw to the AIMLAPI speech-to-text service. It wraps the asynchronous parts of audio transcription (file queuing, MIME-type negotiation, and result polling) in a single Python script with retries. Using Whisper models, it converts audio from sources such as voice memos, interview recordings, and ambient noise clips into machine-readable text transcripts with minimal configuration.
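The MIME-type step mentioned above can be illustrated with a minimal sketch. This is not the skill's actual code: the function name guess_audio_mime and the extension table are hypothetical, chosen only to show one way an upload's content type could be picked from a file name using the Python standard library.

```python
import mimetypes
from pathlib import Path

# Illustrative mapping for the formats named in the skill description.
# This table is an assumption, not the skill's real lookup.
KNOWN_AUDIO_TYPES = {
    ".ogg": "audio/ogg",
    ".mp3": "audio/mpeg",
    ".wav": "audio/wav",
}


def guess_audio_mime(path: str) -> str:
    """Return a content type for an audio file, based on its extension."""
    suffix = Path(path).suffix.lower()
    # Prefer the explicit table so results do not depend on the host's MIME database.
    if suffix in KNOWN_AUDIO_TYPES:
        return KNOWN_AUDIO_TYPES[suffix]
    # Otherwise ask the standard library, accepting only audio types.
    guessed, _encoding = mimetypes.guess_type(path)
    if guessed and guessed.startswith("audio/"):
        return guessed
    # Generic binary fallback when nothing audio-specific is known.
    return "application/octet-stream"
```

For example, guess_audio_mime("client_interview.ogg") returns "audio/ogg", while a non-audio file such as notes.txt falls through to the generic binary type.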
Installation
To integrate this skill into your local environment, ensure you have the OpenClaw CLI tool installed. Run the following command in your terminal:
clawhub install openclaw/skills/skills/aimlapihello/aiml-voice
After installation, configure your authentication credentials. The skill reads the AIMLAPI_API_KEY environment variable. You can export it in your shell profile (e.g., export AIMLAPI_API_KEY="your_key_here") or supply the key through the --apikey-file CLI argument when running the transcription task. Python 3 must be installed and available on your system PATH.
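As a rough sketch of how a script could resolve the key from these two sources, consider the helper below. The function name resolve_api_key is hypothetical, and the precedence (a key file wins over the environment variable) is an assumption; the skill's own script may order them differently.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    apikey_file: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the API key from a key file or from AIMLAPI_API_KEY."""
    env = os.environ if env is None else env
    # Assumed precedence: an explicitly passed key file is checked first.
    if apikey_file:
        key = Path(apikey_file).read_text(encoding="utf-8").strip()
        if key:
            return key
    # Fall back to the environment variable named in the skill description.
    key = env.get("AIMLAPI_API_KEY", "").strip()
    if key:
        return key
    raise RuntimeError(
        "No API key found: set AIMLAPI_API_KEY or pass a key file."
    )
```

Failing early with a clear error keeps a missing credential from surfacing later as an opaque HTTP authentication failure.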
Use Cases
- Transcription Automation: Convert hours of voice memos or meeting recordings into text for indexing and searchability.
- Content Creation: Extract raw audio from video projects to generate initial screenplay drafts or accessibility transcripts.
- Data Ingestion: Process user-submitted audio files from messaging platforms within an automated pipeline.
- Research Support: Automate the transcribing of field interviews, reducing manual effort during qualitative research phases.
Example Prompts
- "Transcribe the voice message I just uploaded to the downloads folder using the medium whisper model."
- "Please process the audio recording from the board meeting and save the output to meeting_transcript.txt."
- "Convert the audio file 'client_interview.ogg' to text and output the result to my logs."
Tips & Limitations
Before starting a task, verify that your audio files are not corrupted and are in a supported format. The skill defaults to the #g1_whisper-medium model; advanced users can select other available models with the --model flag to trade speed against accuracy. The --max-wait parameter matters for long audio files: if the API needs more than the default 300 seconds to return a result, the skill times out, so raise --max-wait for long recordings. Keep your API keys secure and rotate them periodically.
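The role of the wait limit can be sketched as a polling loop with a deadline. This is an illustration under stated assumptions, not the skill's implementation: poll_until_done, fetch_result, and the 2-second interval are hypothetical, and only the 300-second default comes from the description above.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    fetch_result: Callable[[], Optional[str]],
    max_wait: float = 300.0,   # mirrors the documented default wait limit
    interval: float = 2.0,     # assumed pause between polls
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_result until it returns a transcript or max_wait elapses."""
    deadline = clock() + max_wait
    while True:
        # fetch_result stands in for one status request to the API;
        # it returns None while the transcription is still in progress.
        transcript = fetch_result()
        if transcript is not None:
            return transcript
        if clock() >= deadline:
            raise TimeoutError(f"No result after {max_wait} seconds")
        sleep(interval)
```

In this sketch, a long recording that takes more than max_wait seconds to process raises TimeoutError even though the job may still finish on the server, which is why a larger limit helps for long files.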
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-aimlapihello-aiml-voice": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-read, file-write, external-api, code-execution
Related Skills
aimlapi-embeddings
Generate text embeddings via AIMLAPI. Use for semantic search, clustering, or high-dimensional text representations with text-embedding-3-large and other models.
aimlapi-media-gen
Generate images or videos via AIMLAPI from prompts. Use when Codex needs reliable AI/ML API media generation with retries, explicit User-Agent headers, and async video polling.
aimlapi-safety
Content moderation and safety checks. Instantly classify text or images as safe or unsafe using AI guardrails.
aimlapi-llm-reasoning
Run AIMLAPI LLM and reasoning workflows through chat completions with retries, structured outputs, and explicit User-Agent headers. Use when Codex needs scripted prompting/reasoning calls against AIMLAPI models.
aimlapi-music
Generate high-quality music/songs via AIMLAPI. Supports Suno, Udio, Minimax, and ElevenLabs music models. Use when the user asks for music, songs, or soundtracks with specific lyrics or styles.