speech-to-text
Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capabilities: transcription, translation, multi-language, timestamps. Use for: meeting transcription, subtitles, podcast transcripts, voice notes. Triggers: speech to text, transcription, whisper, audio to text, transcribe audio, voice to text, stt, automatic transcription, subtitles generation, transcribe meeting, audio transcription, whisper ai
Why use this skill?
Transcribe audio to text with OpenClaw using Whisper models. High accuracy, 99+ languages, timestamping, and translation support via inference.sh integration.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/okaris/speech-to-text
What This Skill Does
The speech-to-text skill for OpenClaw transcribes audio with the Fast Whisper Large V3 and Whisper V3 Large models through the inference.sh CLI. It supports 99+ languages and offers timestamp generation, language detection, and automatic translation to English. It handles podcasts, voice notes, research interviews, and meeting recordings, and returns structured output that is easy to search, index, or archive. Processing runs on the inference.sh infrastructure, so audio is uploaded and transcribed remotely from within your workflow.
Installation
To integrate this skill into your environment, use the OpenClaw management CLI. Run the following command in your terminal:
clawhub install openclaw/skills/skills/okaris/speech-to-text
Before using the skill, configure the inference.sh CLI by running 'infsh login'. The skill needs this login to authenticate against the inference engine when it processes your audio files.
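Before installing, it can help to confirm that both command-line tools are reachable. The following is a minimal sketch in Python, assuming the executables are named clawhub and infsh as in the commands in this section. The helper name missing_tools is hypothetical and is not part of either CLI.

```python
import shutil


def missing_tools(names):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Executable names are taken from the commands in this section.
    missing = missing_tools(["clawhub", "infsh"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Both CLIs found; run 'infsh login' if you have not authenticated yet.")
```

The check only looks for the executables; it does not verify that a login session is still valid.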
Use Cases
- Corporate Meetings: Automatically generate searchable records of board meetings or team syncs.
- Content Creation: Effortlessly transcribe podcast episodes for show notes or blog posts.
- Accessibility: Provide accurate subtitles for video content to make audio media inclusive.
- Voice Productivity: Quickly convert dictated voice notes into organized, actionable text documentation.
- Multilingual Research: Translate and transcribe interviews conducted in foreign languages to streamline analysis.
Example Prompts
- "Transcribe this meeting recording located at https://audio-source.mp3 and provide me with a full transcript including timestamps."
- "Can you take this French interview from https://interview.mp3 and translate it into English for my report?"
- "Generate a subtitle file for this video https://video.mp4 using the Whisper V3 Large model for maximum accuracy."
Tips & Limitations
- Model Selection: Choose 'Fast Whisper' for high-speed batch processing, or 'Whisper V3 Large' if your priority is the lowest possible word error rate.
- Audio Quality: Ensure source audio is clear and free of extreme background noise; while Whisper is robust, high-quality input significantly improves punctuation and technical term accuracy.
- Network Dependence: Because this skill utilizes the inference.sh API, you will need a stable internet connection to upload your audio files and retrieve the generated text results.
- Formatting: The output is returned as structured JSON, making it ideal for piping into other OpenClaw workflows or automated database entries.
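Because the output is structured JSON, timestamped results can be converted into a subtitle file with a few lines of code. The sketch below is in Python and assumes the JSON contains a "segments" list whose items carry "start" and "end" times in seconds plus a "text" field. That shape mirrors the open-source Whisper output, but the actual inference.sh schema is not documented here, so the key names and the sample data are assumptions to adjust.

```python
import json


def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(result):
    """Build SRT text from a dict with an assumed 'segments' list."""
    blocks = []
    for index, seg in enumerate(result.get("segments", []), start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Blank line between cues, as the SRT format expects.
    return "\n".join(blocks)


# Hypothetical sample output; the real schema may differ.
sample = json.loads("""
{"segments": [
  {"start": 0.0, "end": 2.5, "text": " Welcome to the weekly sync."},
  {"start": 2.5, "end": 5.04, "text": " First item: the release schedule."}
]}
""")

print(segments_to_srt(sample))
```

Writing the returned string to a file with an .srt extension gives a subtitle file that most video players accept.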
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-okaris-speech-to-text": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: network-access, external-api
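If clawhub.json already lists other plugins, pasting the snippet by hand risks overwriting them. The following Python sketch merges the plugin entry into an existing configuration instead. The plugin key and its two settings are copied from the clawhub.json snippet in this section; the function names and the assumption that the file is plain JSON with a top-level "plugins" object are illustrative only.

```python
import json
from pathlib import Path

# Key copied from the clawhub.json snippet in this section.
PLUGIN_KEY = "official-okaris-speech-to-text"


def enable_plugin(config):
    """Return a copy of `config` with the speech-to-text plugin enabled."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[PLUGIN_KEY] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


def update_config_file(path):
    """Read, merge, and rewrite the JSON file at `path` (created if absent)."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(enable_plugin(config), indent=2) + "\n")


if __name__ == "__main__":
    # In-memory demonstration; "some-other-plugin" is a made-up entry.
    existing = {"plugins": {"some-other-plugin": {"enabled": True}}}
    print(json.dumps(enable_plugin(existing), indent=2))
```

Calling update_config_file("clawhub.json") would apply the same merge to a file in the current directory, assuming that is where your configuration lives.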