voice-chat
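A voice conversation integration skill that supports two-way voice interaction. It uses TTS and STT to provide complete voice conversation functionality.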
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/fangkelvin/voice-chat-skill
Voice Chat Skill
Implements complete two-way voice conversation, supporting both voice input and voice output.
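Features
✅ Implemented
- Text-to-speech (TTS)
  - Uses the built-in OpenClaw tts tool
  - Supports mixed Chinese and English text
  - Real-time audio generation
- Speech-to-text (STT)
  - Uses the Python speech_recognition library
  - Supports microphone input
  - Multiple engine support (Google, Whisper, etc.)
- Conversation management
  - Automatic voice detection
  - Conversation context retention
  - Interruption handling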
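Multiple engine support can be read as trying engines in order until one of them returns text. The sketch below is one possible arrangement under that assumption, not the skill's actual implementation: `recognize_with_fallback` and the `engines` list are hypothetical names, and each engine is simply a callable that takes audio and returns text or raises.

```python
from typing import Any, Callable, List, Optional, Tuple

# Hypothetical type: an engine is a (name, callable) pair. The callable takes
# audio and returns the recognized text, or raises an exception on failure.
Engine = Tuple[str, Callable[[Any], str]]


def recognize_with_fallback(
    audio: Any, engines: List[Engine]
) -> Tuple[Optional[str], List[str]]:
    """Try each engine in order and return (text, errors).

    text is None when every engine failed; errors holds one message per
    engine that was tried and failed before a result was obtained.
    """
    errors: List[str] = []
    for name, recognize in engines:
        try:
            return recognize(audio), errors
        except Exception as exc:  # broad on purpose: engines raise different error types
            errors.append(f"{name}: {exc}")
    return None, errors
```

With speech_recognition, the list could hold something like `("google", lambda a: r.recognize_google(a, language="zh-CN"))` followed by a local Whisper entry, assuming the installed version of the library provides a Whisper recognizer and Whisper itself is installed.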
🔧 Technical Architecture
Voice input → STT conversion → Text processing → AI response → TTS conversion → Voice output
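The pipeline above can be sketched as a single function that chains three pluggable stages. This is only an illustration of the data flow: `run_turn` and its `stt`, `respond`, and `tts` parameters are hypothetical names, and the text processing step is reduced to whitespace trimming.

```python
from typing import Callable


def run_turn(
    audio: bytes,
    stt: Callable[[bytes], str],
    respond: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> bytes:
    """Run one conversation turn through the pipeline stages in order."""
    text = stt(audio)       # voice input -> STT conversion
    text = text.strip()     # text processing (placeholder: trim whitespace)
    reply = respond(text)   # AI response
    return tts(reply)       # TTS conversion -> audio for voice output
```

Keeping each stage as a plain callable means a cloud STT engine, a local Whisper model, or a different TTS backend can be swapped in without touching the turn logic.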
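Requirements
Required components
- Python 3.8+
- speech_recognition library (installed from the SpeechRecognition package)
- pyaudio library (needs extra installation steps on Windows)
Optional components
- Whisper: more accurate local STT
- ElevenLabs API: high-quality TTS
- OpenAI API: cloud STT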
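Before running the skill, it may help to confirm that the required components are present. The following is a minimal sketch using only the standard library; `missing_modules` and `python_ok` are hypothetical helper names, not part of the skill.

```python
import importlib.util
import sys
from typing import Iterable, List, Tuple


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(minimum: Tuple[int, int] = (3, 8)) -> bool:
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("Python 3.8+:", python_ok())
    # Import names of the two required libraries
    missing = missing_modules(["speech_recognition", "pyaudio"])
    print("Missing modules:", ", ".join(missing) if missing else "none")
```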
Quick Start
1. Install dependencies
# Install the Python libraries
pip install SpeechRecognition pyaudio
# Installing pyaudio on Windows (if the command above fails)
pip install pipwin
pipwin install pyaudio
2. Basic voice chat script
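# voice_chat.py
import speech_recognition as sr
# The next three imports are used by the OpenClaw TTS helper in step 3
import subprocess
import tempfile
import os


class VoiceChat:
    def __init__(self):
        self.recognizer = sr.Recognizer()
        self.microphone = sr.Microphone()

    def listen(self):
        """Listen for speech and convert it to text; return None on failure."""
        with self.microphone as source:
            print("🎤 Please speak...")
            audio = self.recognizer.listen(source)
        try:
            text = self.recognizer.recognize_google(audio, language='zh-CN')
            print(f"📝 Recognized: {text}")
            return text
        except sr.UnknownValueError:
            print("Could not understand the audio")
        except sr.RequestError:
            print("Speech recognition service unavailable")
        return None

    def speak(self, text):
        """Read text aloud with OpenClaw TTS."""
        # Placeholder: the OpenClaw tts tool can be integrated here
        print(f"🗣️ Speaking: {text}")

    def conversation_loop(self):
        """Conversation loop."""
        print("🎧 Voice chat started, press Ctrl+C to exit")
        try:
            while True:
                # Listen for speech
                user_input = self.listen()
                if not user_input:
                    continue  # nothing recognized, listen again
                # "退出" means "exit"; kept in Chinese because recognition uses zh-CN
                if "退出" in user_input:
                    break
                # Generate a response (an AI model can be integrated here)
                response = f"I heard you say: {user_input}"
                # Voice output
                self.speak(response)
        except KeyboardInterrupt:
            pass
        print("Voice chat stopped")


if __name__ == "__main__":
    chat = VoiceChat()
    chat.conversation_loop()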
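The basic script answers each utterance in isolation. The conversation context retention listed under the features could be approximated with a bounded history of turns, as in the sketch below. `ConversationContext` is a hypothetical class, and the 10-turn default is an arbitrary assumption rather than a value taken from the skill.

```python
from collections import deque
from typing import Deque, Tuple


class ConversationContext:
    """Keep the most recent (speaker, text) turns; older turns are dropped."""

    def __init__(self, max_turns: int = 10):
        # deque with maxlen discards the oldest entry once the limit is reached
        self._turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, speaker: str, text: str) -> None:
        """Record one turn of the conversation."""
        self._turns.append((speaker, text))

    def as_prompt(self) -> str:
        """Render the retained turns as one "speaker: text" line per turn."""
        return "\n".join(f"{speaker}: {text}" for speaker, text in self._turns)
```

In a conversation loop, the recognized text and the generated response would each be passed to `add`, and `as_prompt()` would be handed to whatever model produces the next response.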
3. Integrating OpenClaw TTS
import json
import os
import subprocess
import tempfile


def openclaw_tts(text, output_file="output.mp3"):
    """Call the OpenClaw TTS tool."""
    # Write the request to a temporary JSON file
    with tempfile.NamedTemporaryFile(mode='w', suffix='.json', delete=False) as f:
        tts_request = {
            "text": text,
            "channel": "webchat"
        }
        json.dump(tts_request, f)
        request_file = f.name
    try:
        # Call the tts tool (requires an OpenClaw environment)
        result = subprocess.run([
            "node", "path/to/openclaw/tts-tool.js",
            "--input", request_file,
            "--output", output_file
        ], capture_output=True, text=True)
        if result.returncode == 0:
            print(f"✅ Audio file generated: {output_file}")
            # Play the audio ("start" is a Windows shell command)
            subprocess.run(["start", output_file], shell=True)
        else:
            print(f"❌ TTS failed: {result.stderr}")
    finally:
        # Always remove the temporary request file
        os.unlink(request_file)
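The playback step in the TTS helper relies on `start`, which exists only in the Windows shell. A hedged sketch of choosing a playback command per platform follows. `playback_command` is a hypothetical helper, and it assumes `afplay` is available on macOS and `xdg-open` on Linux desktops.

```python
import sys
from typing import List


def playback_command(path: str, platform: str = sys.platform) -> List[str]:
    """Return an argument list that opens or plays an audio file."""
    if platform.startswith("win"):
        # The empty string is the window title that "start" expects
        # before a path that may end up quoted
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["afplay", path]
    # Assumption: a Linux desktop with xdg-utils installed
    return ["xdg-open", path]
```

The result can be passed directly to `subprocess.run(playback_command("output.mp3"))` without `shell=True`.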
Advanced Configuration
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-fangkelvin-voice-chat-skill": {
      "enabled": true,
      "auto_update": true
    }
  }
}
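If clawhub.json already contains other plugins, pasting the fragment by hand risks overwriting them. The sketch below merges the voice chat entry into an existing configuration instead. `enable_plugin` is a hypothetical helper, and the only schema it assumes is the `plugins` / `enabled` / `auto_update` layout shown in the clawhub.json fragment.

```python
import json
from typing import Any, Dict


def enable_plugin(
    config: Dict[str, Any], plugin_id: str, auto_update: bool = True
) -> Dict[str, Any]:
    """Return a copy of config with plugin_id enabled, keeping other entries."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    entry = dict(plugins.get(plugin_id, {}))
    entry.update({"enabled": True, "auto_update": auto_update})
    plugins[plugin_id] = entry
    merged["plugins"] = plugins
    return merged


if __name__ == "__main__":
    # Hypothetical existing configuration with one unrelated plugin
    existing = {"plugins": {"some-other-plugin": {"enabled": True}}}
    updated = enable_plugin(existing, "official-fangkelvin-voice-chat-skill")
    print(json.dumps(updated, indent=2))
```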