ai-voice-chat
Hands-free AI voice conversations via AirPods or any Bluetooth headset. MLX-Whisper STT (Apple Silicon GPU, ~130ms) + hybrid LLM routing (local gemma3 for simple chat, cloud for complex) + Kokoro-ONNX TTS with sentence streaming. Auto-starts on headset connect, supports mid-conversation language switching. Simple conversations run fully local and free (~2.4s total latency). Complex queries route to cloud (~5s). Zero cost for voice processing — only cloud LLM API tokens for complex queries.
Why use this skill?
Hold hands-free AI conversations on macOS. Hybrid local/cloud routing keeps simple chat fast and free, and sends complex tasks to a more capable cloud model.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/bolander72/ai-voice-chat
What This Skill Does
The ai-voice-chat skill provides a hands-free, low-latency conversational interface for OpenClaw. It uses Apple Silicon acceleration to run the speech-to-text and text-to-speech models entirely on your device. A hybrid router decides, per request, whether to answer locally with a gemma3:1b model (fast and free, about 2.4s end to end) or through the OpenClaw cloud API (for complex tasks such as tool execution, web searching, or memory-reliant operations). Simple chat therefore never leaves the machine, while requests that need tools or memory still get a capable model.
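The routing decision described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual router: the cue words, the word-count cutoff, and the names `CLOUD_CUES`, `MAX_LOCAL_WORDS`, `Route`, and `route` are all assumptions made for the example. The only behavior taken from this page is that anything the router is unsure about goes to the cloud.

```python
from dataclasses import dataclass

# Hypothetical cue words that suggest tools, web search, or memory are needed.
CLOUD_CUES = ("calendar", "search", "remember", "email", "weather", "turn on", "turn off")
# Assumed cutoff: longer utterances are treated as "unsure" and sent to the cloud.
MAX_LOCAL_WORDS = 25


@dataclass
class Route:
    target: str  # "local" or "cloud"
    reason: str


def route(transcript: str) -> Route:
    """Pick a backend for one transcribed utterance."""
    text = transcript.strip().lower()
    for cue in CLOUD_CUES:
        if cue in text:
            return Route("cloud", f"matched cue {cue!r}")
    if len(text.split()) > MAX_LOCAL_WORDS:
        return Route("cloud", "long utterance, treated as unsure")
    return Route("local", "short utterance with no tool or memory cues")
```

With these assumed rules, a short request such as "Tell me a joke" stays local, while the calendar question from the example prompts goes to the cloud. A real router could just as well ask the local model to classify the request; the keyword check only keeps the sketch self-contained.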
Installation
To get started, make sure you are on a macOS machine with Apple Silicon, then follow these steps in order:
- Pull the local language model: ollama pull gemma3:1b
- From your OpenClaw directory, install the skill: clawhub install openclaw/skills/skills/bolander72/ai-voice-chat
- Run the setup script at scripts/setup.sh to initialize the virtual environment and download the necessary audio models.
- Store your API token in the macOS Keychain: security add-generic-password -s "voice-loop-openclaw-token" -w "YOUR_TOKEN_HERE"
- Start the voice interface, either manually with scripts/voice_loop.py or automatically with the provided airpods_watcher.py, which launches it whenever you connect your Bluetooth headset.
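For reference, a token stored with the Keychain command above can be read back from Python by calling the same macOS `security` tool. This is a sketch of one way to do it, not necessarily how voice_loop.py loads the token; the function names `keychain_command` and `read_token` are made up for the example.

```python
import subprocess

SERVICE = "voice-loop-openclaw-token"  # service name used in the install step


def keychain_command(service: str = SERVICE) -> list[str]:
    """Build the macOS `security` call that prints a stored password."""
    return ["security", "find-generic-password", "-s", service, "-w"]


def read_token(service: str = SERVICE) -> str:
    """Return the stored token.

    Raises subprocess.CalledProcessError if no matching Keychain item exists,
    and FileNotFoundError on systems without the `security` tool.
    """
    result = subprocess.run(
        keychain_command(service), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Reading the token at startup this way keeps it out of config files and shell history, which is the point of storing it in the Keychain in the first place.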
Use Cases
- Hands-Free Productivity: Dictate notes or check your calendar while commuting or working without touching your keyboard.
- Smart Home Integration: Use the cloud-routed complex queries to control connected devices via voice commands.
- Quick Information Retrieval: Get instant answers for simple queries like unit conversions or quick trivia without waiting for cloud round-trips.
- Language Learning: Practice conversational skills with mid-conversation language switching capabilities.
Example Prompts
- "What is on my calendar for this afternoon and do I need to prepare any documents?"
- "How do I calculate the volume of a cylinder if I only have the diameter and height?"
- "Tell me a short, funny story about a robot trying to learn how to cook pasta."
Tips & Limitations
- Hardware Dependency: Performance relies heavily on Apple Silicon (M1–M4 chips) for the ~130ms transcription latency.
- Latency: While simple chats are fast (~2.4s), complex cloud requests will incur higher latency (~5.1s) due to external API processing.
- Router Reliability: If the router is unsure about a request's complexity, it defaults to the cloud. You can optimize this by keeping simple queries concise.
- Connectivity: Ensure your headset is correctly identified by the audio input system for the auto-watcher to trigger successfully.
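Part of the perceived latency comes down to the sentence streaming mentioned in the description: speech can begin as soon as the first sentence of the reply is complete, instead of waiting for the whole answer. A minimal sketch of that idea, assuming the LLM yields text in arbitrary chunks and that a sentence ends at ".", "!", or "?" followed by whitespace (a simplification; `stream_sentences` is a hypothetical helper, not part of the skill):

```python
import re
from typing import Iterable, Iterator

# Split point: whitespace that directly follows sentence-ending punctuation.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def stream_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last one is a finished sentence.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    # Flush whatever is left once the stream ends.
    if buffer.strip():
        yield buffer.strip()
```

Each yielded sentence could be handed to the TTS engine immediately, so audio for the first sentence plays while later ones are still being generated. Abbreviations such as "e.g." would split too early under this rule; a production splitter would need to handle them.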
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-bolander72-ai-voice-chat": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: network-access, file-read, external-api
Related Skills
imessage-voice-reply
Send voice message replies in iMessage using local Kokoro-ONNX TTS. Generates native iMessage voice bubbles (CAF/Opus) that play inline with waveform — not file attachments. Use when receiving a voice message in iMessage and wanting to reply with voice, enabling voice-to-voice iMessage conversations, or sending audio responses. Zero cost — all TTS runs locally. Requires BlueBubbles channel configured in OpenClaw.
ratgdo32-disco
Control a ratgdo32 disco garage door opener via its local web API. Use when the user asks to open/close the garage, check garage status, toggle the garage light, check if a car is parked, enable/disable remotes, or anything involving the garage door. Supports door control, light, obstruction detection, vehicle presence (laser sensor), parking assist, motion, and remote lockout. Uses local network trust model (LAN-only, no internet exposure).