Official Verified media Safety 5/5

mlx-stt

Local speech-to-text on Apple Silicon using MLX and open-source models (default: GLM-ASR-Nano-2512).

Why use this skill?

Transcribe audio files locally on your Mac with the MLX STT skill. Fast, accurate, and private speech-to-text powered by GLM-ASR-Nano-2512. No API keys required.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/guoqiao/mlx-stt

What This Skill Does

The mlx-stt skill brings local speech-to-text (automatic speech recognition, ASR) to your Apple Silicon machine. It runs the GLM-ASR-Nano-2512 model on Apple's MLX framework and converts audio files into text on your own hardware. Once the model has been downloaded, transcription runs offline, so you do not need cloud providers, paid API keys, or third-party transcription services that receive your audio. Voice memos, interview recordings, and meeting audio are all processed directly on your Mac.

Installation

To get started, make sure you are running macOS on Apple Silicon. Then run the following command in your terminal to install the skill:

clawhub install openclaw/skills/skills/guoqiao/mlx-stt

Once installed, the skill uses its included install.sh script to set up the required dependencies. The script checks for, and installs if missing, ffmpeg (for audio format conversion), uv (a fast Python package installer), and the mlx_audio library. After this initial setup, all components are available on your PATH, so the agent can run transcription commands directly.
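
If you want to confirm the prerequisites yourself, the sketch below shows one possible check. It is not part of the skill or of install.sh; the function name `check_prerequisites` is hypothetical, and it only tests whether the `ffmpeg` and `uv` executables are on your PATH and whether Python reports a Darwin/arm64 machine. A Python interpreter running under Rosetta may report a different architecture.

```python
import platform
import shutil


def check_prerequisites(tools=("ffmpeg", "uv")):
    """Return a dict describing whether this machine looks ready for the skill.

    Illustrative helper only; the skill's own install.sh may check differently.
    """
    report = {
        # Native Python on an Apple Silicon Mac reports Darwin / arm64.
        "apple_silicon": platform.system() == "Darwin"
        and platform.machine() == "arm64",
    }
    for tool in tools:
        # shutil.which returns the executable's path, or None if not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

Any entry reported as missing is something the install step would be expected to provide, or something to install by hand before retrying.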

Use Cases

  • Privacy-First Transcription: Transcribe sensitive internal meetings or personal voice notes without data ever leaving your device.
  • Content Creation: Quickly convert audio interviews or podcast clips into text drafts for blog posts or articles.
  • Accessibility: Create instant subtitles or searchable logs for local video/audio files.
  • Offline Productivity: Perfect for working in environments without internet access or when you need to avoid cloud latency.

Example Prompts

  1. "Transcribe the audio file located at ~/Documents/meetings/project-sync.mp3 using the mlx-stt tool."
  2. "Convert my recent lecture recording /Users/name/recordings/lecture_01.wav to text and save it as a markdown file."
  3. "Please run the local speech-to-text on this audio file: /Volumes/data/interview_final.m4a."

Tips & Limitations

  • Initial Latency: The first time you execute the command, the system will download the machine learning model. This might take a few moments depending on your network speed.
  • Hardware Requirements: This skill requires Apple Silicon (M-series chips). It will not run on Intel-based Macs.
  • Audio Formats: Thanks to ffmpeg integration, most common audio formats are supported, but extremely large files should be segmented for better performance.
  • Accuracy: While highly efficient and accurate for general speech, performance may vary depending on background noise or speaker clarity.
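
For the large-file tip above, one way to split a recording before transcription is ffmpeg's segment muxer. The sketch below only builds the command; the helper name `build_segment_command`, the 10-minute default, and the output naming pattern are assumptions for illustration, not something the skill provides.

```python
from pathlib import Path


def build_segment_command(src, segment_seconds=600, out_dir=None):
    """Build an ffmpeg command that splits `src` into fixed-length pieces.

    Hypothetical helper: output files are named <stem>_000<ext>, <stem>_001<ext>, ...
    """
    src = Path(src)
    out_dir = Path(out_dir) if out_dir else src.parent
    pattern = out_dir / f"{src.stem}_%03d{src.suffix}"
    return [
        "ffmpeg",
        "-i", str(src),
        "-f", "segment",                        # use the segment muxer
        "-segment_time", str(segment_seconds),  # target length of each piece
        "-c", "copy",                           # copy the stream, no re-encode
        str(pattern),
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so nothing is written to disk.
    print(" ".join(build_segment_command("interview_final.m4a")))
```

To run it, pass the returned list to `subprocess.run(cmd, check=True)`. Because the audio is copied rather than re-encoded, segment boundaries are approximate, so a word can be cut at a split point.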

Metadata

Author: @guoqiao
Stars: 2387
Views: 2
Updated: 2026-03-09

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-guoqiao-mlx-stt": {
      "enabled": true,
      "auto_update": true
    }
  }
}
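
If your clawhub.json already lists other plugins, pasting the snippet over the whole file would drop them. The sketch below shows one way to merge the entry into an existing configuration held as a Python dict. The helper name `enable_plugin` and the `some-other-plugin` entry are hypothetical, and reading or writing the actual file is left out because its location is not stated here.

```python
import json


def enable_plugin(config, name, auto_update=True):
    """Return a copy of `config` with `name` enabled under "plugins".

    Existing plugin entries are preserved; the input dict is not modified.
    """
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[name] = {"enabled": True, "auto_update": auto_update}
    merged["plugins"] = plugins
    return merged


# Hypothetical existing configuration with one unrelated plugin.
existing = {"plugins": {"some-other-plugin": {"enabled": True}}}
updated = enable_plugin(existing, "official-guoqiao-mlx-stt")
print(json.dumps(updated, indent=2))
```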

Tags (AI)

#speech-to-text #local-ai #asr #mlx #transcription
Safety Score: 5/5

Flags: file-read, file-write, code-execution