mlx-stt
Speech-To-Text with MLX (Apple Silicon) and open-source models (default GLM-ASR-Nano-2512), running locally.
Why use this skill?
Transcribe audio files locally on your Mac with the MLX STT skill. Fast, accurate, and private speech-to-text powered by GLM-ASR-Nano-2512. No API keys required.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/guoqiao/mlx-stt
What This Skill Does
The mlx-stt skill brings local Speech-To-Text (automatic speech recognition, ASR) to your Apple Silicon machine. Built on the Apple MLX framework, it uses the GLM-ASR-Nano-2512 model to convert audio files into text on your own hardware; once the model has been downloaded, transcription runs entirely offline. You do not need cloud providers, API keys, or third-party transcription services that receive your audio. Voice memos, interview recordings, and meeting recordings are all processed directly on your local machine.
Installation
To get started, ensure you are running a macOS environment with Apple Silicon. Use the following command in your terminal via the OpenClaw CLI to install the skill:
clawhub install openclaw/skills/skills/guoqiao/mlx-stt
Once installed, the skill uses the included install.sh script to configure its dependencies. The script checks for, and installs if missing, ffmpeg (for audio format conversion), uv (a fast Python package installer), and the mlx_audio library. It also makes sure these components are available on your PATH so the agent can run transcription commands.
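If transcription fails after installation, it can help to confirm the prerequisites by hand. The following is a minimal sketch, not part of the skill itself: the function name check_prerequisites is hypothetical, and it only tests that the machine reports Apple Silicon and that ffmpeg and uv can be found on PATH. It does not verify that mlx_audio is installed.

```python
import platform
import shutil


def check_prerequisites(tools=("ffmpeg", "uv")):
    """Return a dict of readiness checks (hypothetical helper, not part of mlx-stt)."""
    report = {
        # A native Python on an Apple Silicon Mac reports Darwin / arm64.
        "apple_silicon": platform.system() == "Darwin"
        and platform.machine() == "arm64",
    }
    for tool in tools:
        # shutil.which returns the executable's path, or None if it is not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

Note that a Python interpreter running under Rosetta reports x86_64 even on an M-series Mac, so a "missing" result for apple_silicon may only mean the interpreter is not a native arm64 build.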
Use Cases
- Privacy-First Transcription: Transcribe sensitive internal meetings or personal voice notes without data ever leaving your device.
- Content Creation: Quickly convert audio interviews or podcast clips into text drafts for blog posts or articles.
- Accessibility: Create instant subtitles or searchable logs for local video/audio files.
- Offline Productivity: Perfect for working in environments without internet access or when you need to avoid cloud latency.
Example Prompts
- "Transcribe the audio file located at ~/Documents/meetings/project-sync.mp3 using the mlx-stt tool."
- "Convert my recent lecture recording /Users/name/recordings/lecture_01.wav to text and save it as a markdown file."
- "Please run the local speech-to-text on this audio file: /Volumes/data/interview_final.m4a."
Tips & Limitations
- Initial Latency: The first time you execute the command, the system will download the machine learning model. This might take a few moments depending on your network speed.
- Hardware Requirements: This skill is specifically optimized for Apple Silicon (M-series chips). It will not function on Intel-based Macs.
- Audio Formats: Thanks to ffmpeg integration, most common audio formats are supported, but extremely large files should be segmented for better performance.
- Accuracy: While highly efficient and accurate for general speech, performance may vary depending on background noise or speaker clarity.
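For the large-file case mentioned above, one option is to split the recording with ffmpeg's segment muxer before transcribing each piece. The sketch below only builds the command. The helper name build_segment_command, the 600-second default, and the output naming pattern are illustrative assumptions, not something the skill prescribes.

```python
import shlex
from pathlib import Path


def build_segment_command(audio_path, segment_seconds=600, out_dir=None):
    """Build an ffmpeg command that splits audio into fixed-length chunks.

    Hypothetical helper: the chunk length and file naming are assumptions.
    """
    src = Path(audio_path).expanduser()
    dest = Path(out_dir).expanduser() if out_dir else src.parent
    # %03d is expanded by ffmpeg into 000, 001, 002, ...
    pattern = dest / f"{src.stem}_%03d{src.suffix}"
    return [
        "ffmpeg",
        "-i", str(src),
        "-f", "segment",                        # use the segment muxer
        "-segment_time", str(segment_seconds),  # target chunk length in seconds
        "-c", "copy",                           # copy the stream, no re-encoding
        str(pattern),
    ]


if __name__ == "__main__":
    cmd = build_segment_command("/tmp/talk.mp3", segment_seconds=300)
    # Print the command for review; run it with subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Because the stream is copied rather than re-encoded, cuts land on packet boundaries, so chunk lengths are approximate. A cut can also fall mid-word, which may affect the transcript near chunk edges.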
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-guoqiao-mlx-stt": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: file-read, file-write, code-execution
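If your clawhub.json already contains other settings, pasting the plugin fragment by hand risks overwriting them. A minimal sketch of merging the entry programmatically follows. The helper name enable_plugin is hypothetical, and the location of clawhub.json is an assumption that you must supply yourself, since the page does not say where the file lives.

```python
import json
from pathlib import Path


def enable_plugin(config_path, plugin_id="official-guoqiao-mlx-stt"):
    """Add or update one plugin entry while keeping all other settings.

    Hypothetical helper: config_path is wherever your clawhub.json lives.
    """
    path = Path(config_path).expanduser()
    # Start from the existing config if there is one, else from an empty object.
    config = json.loads(path.read_text()) if path.exists() else {}
    plugins = config.setdefault("plugins", {})
    plugins[plugin_id] = {"enabled": True, "auto_update": True}
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling enable_plugin with the path to your own clawhub.json writes the same two keys shown in the fragment, leaving any other plugins and top-level settings in place.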
Related Skills
mlx-audio-server
Local 24x7 OpenAI-compatible API server for STT/TTS, powered by MLX on your Mac.
dl
Download Video/Music from YouTube/Bilibili/X/etc.
url2pdf
Convert URL to PDF suitable for mobile reading.
uv-global
Provision and reuse a global uv environment for ad hoc Python scripts.
url2png
Convert URL to PNG suitable for mobile reading.