Official Verified media Safety 5/5

parakeet-stt

Local speech-to-text with NVIDIA Parakeet TDT 0.6B v3 (ONNX on CPU). 30x faster than Whisper, 25 languages, auto-detection, OpenAI-compatible API. Use when transcribing audio files, converting speech to text, or processing voice recordings locally without cloud APIs.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/carlulsoe/parakeet-stt
What This Skill Does

The parakeet-stt skill integrates the NVIDIA Parakeet TDT 0.6B v3 model into your local OpenClaw environment. Unlike cloud-based transcription services, which require sending sensitive audio to external servers, this skill runs entirely on your CPU using ONNX Runtime. It exposes an OpenAI-compatible API, so it can act as a drop-in replacement in existing workflows. It also auto-detects the spoken language across 25 supported languages.

Installation

Install the skill with the ClawHub command-line interface by running the following command in your terminal: clawhub install openclaw/skills/skills/carlulsoe/parakeet-stt

The backend service runs separately from the skill. Docker is the recommended deployment method: make sure Docker is running, clone the source repository, then run docker compose up -d parakeet-cpu. For a native Python setup instead, change into the cloned directory, run pip install -r requirements.txt, and start the server with Uvicorn. In either case, set the PARAKEET_URL environment variable to point at the service (the default port is 5000).
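Once the service is started, it can help to confirm that PARAKEET_URL points at something that is actually listening. The following is a minimal sketch using only the Python standard library. The helper names and the localhost fallback are assumptions made for illustration, not part of the skill; only the default port of 5000 comes from the text above.

```python
import os
import socket
from urllib.parse import urlparse

# Port 5000 is the documented default; the localhost host is an assumption.
DEFAULT_URL = "http://localhost:5000"


def parakeet_endpoint(env=None):
    """Return (host, port) parsed from PARAKEET_URL, or from the default URL.

    Expects a full URL including the scheme, e.g. http://localhost:5000.
    """
    env = os.environ if env is None else env
    parsed = urlparse(env.get("PARAKEET_URL", DEFAULT_URL))
    host = parsed.hostname or "localhost"
    # Without an explicit port, fall back to the scheme's standard port.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return host, port


def service_is_reachable(env=None, timeout=2.0):
    """Return True if a TCP connection to the configured host and port succeeds."""
    host, port = parakeet_endpoint(env)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host, port = parakeet_endpoint()
    state = "reachable" if service_is_reachable() else "not reachable"
    print(f"Parakeet service at {host}:{port} is {state}")
```

This only checks that the port accepts connections. It does not verify that the process behind it is the Parakeet server.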

Use Cases

  • Journalism and Research: Transcribe long-form interviews or lectures in seconds rather than minutes.
  • Content Creation: Generate subtitles (SRT/VTT) for video projects directly from audio files.
  • Privacy-First Workflows: Process sensitive voice notes, meetings, or legal recordings without the data ever leaving your local machine.
  • Automation: Build autonomous agents that react to voice commands or perform real-time analysis on local audio streams.

Example Prompts

  1. "Transcribe the meeting file at /data/recordings/meeting_01.mp3 and save the output as an SRT file for my video project."
  2. "Can you process the audio file in the current folder using the parakeet-stt model and provide me with the plain text transcription?"
  3. "Convert this voice note to text using the local Parakeet engine and give me a detailed summary with timestamps."

Tips & Limitations

  • CPU Optimization: Because this runs on your local CPU, ensure your machine has sufficient resources, especially when processing long audio files.
  • Language Support: Auto-detection covers 25 languages, but accuracy still depends on clear audio.
  • API Compatibility: The service mimics the OpenAI API structure. If you are integrating this into custom scripts, you can use the official openai Python SDK by simply pointing the base_url to your local PARAKEET_URL.
  • Web UI: A browser-based drag-and-drop interface is available at the service's local port, which is convenient for quick, ad-hoc transcriptions.
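The API Compatibility tip can be sketched as follows, assuming the openai Python package is installed. Several details here are assumptions rather than documented behavior of this skill: that the service serves its routes under a /v1 prefix, that it ignores the API key, and the model identifier, which is a hypothetical placeholder.

```python
import os


def api_base_url(env=None):
    """Build the SDK base URL from PARAKEET_URL.

    The /v1 suffix is an assumption based on the usual OpenAI route layout.
    """
    env = os.environ if env is None else env
    root = env.get("PARAKEET_URL", "http://localhost:5000").rstrip("/")
    return root if root.endswith("/v1") else root + "/v1"


def transcribe(audio_path, model="parakeet-tdt-0.6b-v3"):
    """Send one audio file to the local service and return the transcript text.

    The default model name is a hypothetical placeholder; check which
    identifier the running service actually expects.
    """
    # Imported lazily so api_base_url() works without the SDK installed.
    from openai import OpenAI

    # The SDK requires some key; a local service presumably ignores it.
    client = OpenAI(base_url=api_base_url(), api_key="not-needed")
    with open(audio_path, "rb") as audio:
        result = client.audio.transcriptions.create(model=model, file=audio)
    return result.text
```

With the service running, a call such as print(transcribe("/data/recordings/meeting_01.mp3")) should print the plain-text transcript. Subtitle output (SRT/VTT) would typically go through the SDK's response_format parameter, though whether this service honors it is not confirmed here.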

Metadata

Author: @carlulsoe
Stars: 4072
Views: 1
Updated: 2026-04-13

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-carlulsoe-parakeet-stt": {
      "enabled": true,
      "auto_update": true
    }
  }
}
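If clawhub.json already holds other settings, pasting by hand risks overwriting them. A minimal sketch of merging the entry programmatically is shown below. The enable_plugin helper and the assumption that clawhub.json sits in the current directory are illustrative only; the plugin key and its two fields are taken from the snippet above.

```python
import json
from pathlib import Path

# Plugin key copied from the configuration snippet above.
PLUGIN_KEY = "official-carlulsoe-parakeet-stt"


def enable_plugin(config_path="clawhub.json"):
    """Add the parakeet-stt entry to the config, keeping any other settings."""
    path = Path(config_path)
    # Start from the existing file if present, otherwise from an empty config.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_KEY] = {"enabled": True, "auto_update": True}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    print(json.dumps(enable_plugin(), indent=2))
```

The file is rewritten with two-space indentation, so any custom formatting in an existing clawhub.json is not preserved.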

Tags (AI)

#transcription #stt #local-ai #speech-to-text #onnx
Safety Score: 5/5

Flags: network-access, file-read