discord-voice
Real-time voice conversations in Discord voice channels with Claude AI
Why use this skill?
Enable real-time voice conversations in Discord with OpenClaw. This plugin supports AI-powered speech-to-text, TTS, and voice activity detection for seamless agent interactions.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/thiagoruss0/discord-voicetwhtm
What This Skill Does
The discord-voice skill for OpenClaw lets users hold natural, spoken conversations with a Claude-powered agent in Discord voice channels. Each turn follows the same pipeline: Voice Activity Detection (VAD) identifies when a user is speaking, a speech-to-text model such as Whisper or Deepgram transcribes the audio in real time, the Claude agent processes the request, and a text-to-speech provider such as OpenAI or ElevenLabs streams the response back. The skill also supports barge-in: if a user interrupts, the agent stops speaking immediately, as a person would in conversation. Typical settings range from collaborative meetings to an on-demand personal assistant in a private voice channel.
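As a rough illustration of the turn described above (STT, then agent, then TTS, with barge-in), here is a minimal Python sketch. It is not the skill's implementation: run_turn, the fake_* stand-ins, and interrupt_after are hypothetical names invented for this example, and real providers would stream audio rather than pass strings.

```python
from typing import Callable, Iterable, List


def run_turn(
    utterance_frames: List[bytes],
    transcribe: Callable[[List[bytes]], str],
    respond: Callable[[str], str],
    synthesize: Callable[[str], Iterable[str]],
    user_is_speaking: Callable[[], bool],
) -> List[str]:
    """One conversational turn: STT -> agent -> TTS, with barge-in."""
    text = transcribe(utterance_frames)   # speech-to-text
    reply = respond(text)                 # agent produces a reply
    played: List[str] = []
    for chunk in synthesize(reply):       # text-to-speech, chunk by chunk
        if user_is_speaking():            # barge-in: stop before the next chunk
            break
        played.append(chunk)
    return played


# Hypothetical stand-ins so the sketch runs without any provider.
def fake_transcribe(frames: List[bytes]) -> str:
    return b"".join(frames).decode()


def fake_respond(text: str) -> str:
    return f"You said: {text}. Anything else?"


def fake_synthesize(reply: str) -> Iterable[str]:
    # Treat each sentence as one "audio chunk".
    for sentence in reply.split(". "):
        yield sentence


def interrupt_after(n: int) -> Callable[[], bool]:
    """Simulated VAD: reports user speech once it has been checked more than n times."""
    calls = {"count": 0}

    def check() -> bool:
        calls["count"] += 1
        return calls["count"] > n

    return check


played = run_turn(
    [b"hel", b"lo"], fake_transcribe, fake_respond, fake_synthesize, interrupt_after(1)
)
print(played)  # ['You said: hello']
```

Checking for speech before every chunk is what keeps the interruption delay bounded by one chunk; smaller chunks would stop sooner at the cost of more checks.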
Installation
To get started, make sure the OpenClaw CLI is installed. Next, install the system dependencies ffmpeg, build-essential, and python3; on Ubuntu, run 'sudo apt-get install ffmpeg build-essential python3'. Then run 'clawdhub install discord-voice'. Finally, add the API configuration for your chosen STT and TTS providers to your 'clawdbot.json' file, and make sure your Discord bot has the required 'Connect', 'Speak', and 'Use Voice Activity' permissions enabled in the Discord Developer Portal.
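The prerequisites above can be sanity-checked with a short Python sketch. The helper names (missing_tools, voice_permissions) are hypothetical and not part of the skill. The bit positions are the ones Discord documents for Connect, Speak, and Use Voice Activity. build-essential is a meta-package rather than a single executable, so only ffmpeg and python3 are checked here.

```python
import shutil
from typing import List, Sequence

# Executables the installation step expects to find on PATH.
REQUIRED_TOOLS = ("ffmpeg", "python3")


def missing_tools(tools: Sequence[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Discord permission bit flags for the three voice permissions.
CONNECT = 1 << 20
SPEAK = 1 << 21
USE_VAD = 1 << 25  # "Use Voice Activity"


def voice_permissions() -> int:
    """Combined permissions integer for Connect + Speak + Use Voice Activity."""
    return CONNECT | SPEAK | USE_VAD


print(missing_tools())      # an empty list means both tools were found
print(voice_permissions())  # -> 36700160
```

The combined integer is the value you would expect to see for these three boxes when generating a bot invite in the Developer Portal.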
Use Cases
- Live Meeting Assistant: Transcribe voice meetings and ask the agent to summarize key points or clarify technical jargon in real time.
- Interactive AI Companions: Create a voice-activated bot that members can interact with in a 'help' or 'lounge' voice channel for information and entertainment.
- Language Practice: Use the agent to hold conversation practice in various languages, with the AI providing instant feedback or correction.
- Task Automation: Trigger complex agent workflows via voice commands while working on other tasks, removing the need for manual text input.
Example Prompts
- "OpenClaw, join the voice channel 'General' and tell me what the status of the current project is."
- "Listen to our conversation and generate a bulleted summary of all assigned action items at the end of the call."
- "Summarize the last five minutes of our discussion and identify any conflicting viewpoints that were mentioned."
Tips & Limitations
- Latency: For the lowest latency, prefer Deepgram's WebSocket streaming over standard REST-based STT providers.
- Voice Activity: If the agent triggers too often or too rarely, adjust the 'vadSensitivity' setting in your configuration file to match your environment's noise level.
- Barge-in: Keep your network connection stable; high ping can occasionally delay the 'barge-in' stop signal, so the bot finishes its current sentence before it processes the interruption.
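To show why a sensitivity setting changes how often the agent triggers, here is a minimal energy-threshold VAD sketch in Python. It is an assumption-laden illustration only: the listing does not say how 'vadSensitivity' is interpreted or what values it accepts, so the 0.0 to 1.0 scale, the threshold_for mapping, and the floor and ceiling values are all hypothetical.

```python
import math
from typing import Sequence


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square level of one frame of 16-bit PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(sample * sample for sample in frame) / len(frame))


def threshold_for(sensitivity: float, floor: float = 200.0, ceiling: float = 4000.0) -> float:
    """Hypothetical mapping: higher sensitivity means a lower RMS threshold."""
    clamped = min(max(sensitivity, 0.0), 1.0)
    return ceiling - clamped * (ceiling - floor)


def is_speech(frame: Sequence[int], sensitivity: float) -> bool:
    """Treat a frame as speech when its level reaches the threshold."""
    return rms(frame) >= threshold_for(sensitivity)


quiet_frame = [300] * 160   # e.g. keyboard noise or a distant voice
loud_frame = [5000] * 160   # e.g. someone speaking close to the mic

print(is_speech(quiet_frame, sensitivity=1.0))  # True: very sensitive, triggers on quiet input
print(is_speech(quiet_frame, sensitivity=0.0))  # False: quiet input is ignored
print(is_speech(loud_frame, sensitivity=0.0))   # True: loud speech still triggers
```

In a noisy room a lower sensitivity keeps background sound from starting a turn; in a quiet room a higher one avoids missing soft speech.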
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-thiagoruss0-discord-voicetwhtm": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: network-access, external-api