Official, Verified, communication, Safety 4/5

walkie-talkie

Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.

Why use this skill?

Transform WhatsApp into a voice-to-voice AI assistant. Use the walkie-talkie skill for real-time transcription and voice notes.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/rubenfb23/walkie-talkie-vigo

What This Skill Does

The walkie-talkie skill for OpenClaw turns WhatsApp text chats into voice-to-voice conversations. When a user sends an audio message, the skill intercepts it and transcribes the speech to text with local transcription tools. The OpenClaw agent then processes the transcribed request, and a local text-to-speech (TTS) engine synthesizes the agent's answer, which is sent back as an audio note. The result is a low-latency, hands-free loop that feels like talking over a walkie-talkie.
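The loop described above can be sketched as three stages chained together. This is a minimal illustration, not the skill's actual code: `handle_voice_note`, `VoiceReply`, and the three stage callables are hypothetical names, and the real skill would plug in its local transcription tool, the OpenClaw agent, and its local TTS engine where the stand-in lambdas appear.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceReply:
    transcript: str     # what the user said, as text
    reply_text: str     # the agent's answer as text
    reply_audio: bytes  # the same answer as synthesized speech


def handle_voice_note(
    audio: bytes,
    transcribe: Callable[[bytes], str],  # hypothetical speech-to-text stage
    ask_agent: Callable[[str], str],     # hypothetical agent stage
    synthesize: Callable[[str], bytes],  # hypothetical text-to-speech stage
) -> VoiceReply:
    """Run one turn of the loop: audio in, text and audio out."""
    transcript = transcribe(audio).strip()
    reply_text = ask_agent(transcript)
    reply_audio = synthesize(reply_text)
    return VoiceReply(transcript, reply_text, reply_audio)


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without any audio tooling installed.
    reply = handle_voice_note(
        b"fake-audio",
        transcribe=lambda audio: " what is the plan for today? ",
        ask_agent=lambda text: f"You asked: {text}",
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(reply.reply_text)
```

Passing the stages in as callables keeps each one replaceable, so a different transcription or TTS tool can be swapped in without touching the loop itself.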

Installation

To integrate this skill into your OpenClaw environment, run the following command in your terminal:

clawhub install openclaw/skills/skills/rubenfb23/walkie-talkie-vigo

Make sure the required dependencies (ffmpeg, whisper-cpp, and sherpa-onnx-tts) are installed and available on your system PATH, since they handle the local audio processing.
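One way to confirm the dependencies are reachable before using the skill is a quick PATH lookup. The sketch below assumes the executables carry exactly the names given in the dependency list; some installs ship them under different binary names, so adjust `REQUIRED_TOOLS` to match your system. `missing_tools` is a hypothetical helper, not part of the skill.

```python
import shutil

# Assumption: executable names match the dependency names listed above.
REQUIRED_TOOLS = ("ffmpeg", "whisper-cpp", "sherpa-onnx-tts")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the system PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools are available.")
```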

Use Cases

This skill is perfect for scenarios where typing is inconvenient or impossible, such as while driving, walking, or cooking. It is also highly effective for users who prefer verbal articulation over text for brainstorming sessions, quick updates, or informal chats. It significantly reduces the friction of interacting with an AI agent in mobile settings, providing a more intuitive and personal connection to the agent.

Example Prompts

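  1. "Activate walkie-talkie mode" (original: "Activa modo walkie-talkie")
  2. "Let's talk by voice, what's the plan for today?" (original: "Hablemos por voz, ¿cuál es el plan para hoy?")
  3. "Hey, send me a voice message explaining what artificial intelligence is." (original: "Oye, envíame un mensaje de voz explicando qué es la inteligencia artificial.")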

Tips & Limitations

For the best transcription accuracy, record in a quiet environment; heavy background noise can degrade whisper-cpp's output. The system targets a Real-Time Factor (RTF) below 0.5, meaning processing should take less than half the duration of the audio being handled, so replies should arrive quickly. Although the skill prioritizes voice, it always sends a redundant text message alongside the audio for clarity and accessibility, so you can read the transcript if your surroundings are noisy. Make sure your system has enough RAM to keep the TTS models loaded, which speeds up inference.
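As a worked illustration of the RTF target, the factor is processing time divided by audio duration. The helper names below are hypothetical and only show the arithmetic; they are not part of the skill.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent processing / duration of the audio processed."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def meets_target(rtf: float, target: float = 0.5) -> bool:
    """True when the measured RTF is below the target named in the text."""
    return rtf < target


if __name__ == "__main__":
    # A 12-second voice note processed in 3 seconds.
    rtf = real_time_factor(3.0, 12.0)
    print(rtf, meets_target(rtf))
```

By this measure, a 12-second voice note handled in 3 seconds is within the target, while the same note taking 6 seconds or more is not.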

Metadata

Author: @rubenfb23
Stars: 1133
Views: 1
Updated: 2026-02-18

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-rubenfb23-walkie-talkie-vigo": {
      "enabled": true,
      "auto_update": true
    }
  }
}
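If your clawhub.json already lists other plugins, the entry needs to be merged in rather than pasted over the existing content. The following is a minimal sketch of that merge on an in-memory dictionary; `enable_plugin` is a hypothetical helper, and the sketch assumes the file contains only the `plugins` structure shown above.

```python
import json

PLUGIN_ID = "official-rubenfb23-walkie-talkie-vigo"


def enable_plugin(config: dict, plugin_id: str = PLUGIN_ID) -> dict:
    """Return a copy of config with the plugin entry enabled."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))  # keep other plugins untouched
    entry = dict(plugins.get(plugin_id, {}))   # keep any extra keys already set
    entry.update({"enabled": True, "auto_update": True})
    plugins[plugin_id] = entry
    merged["plugins"] = plugins
    return merged


if __name__ == "__main__":
    existing = {"plugins": {"some-other-plugin": {"enabled": True}}}
    print(json.dumps(enable_plugin(existing), indent=2))
```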

Tags(AI)

#whatsapp #voice-assistant #speech-to-text #tts #hands-free

Safety Score: 4/5

Flags: file-read, file-write