webchat-voice-full-stack
One-step full-stack installer for OpenClaw WebChat voice input with local speech-to-text. Deploys faster-whisper STT backend plus HTTPS/WSS WebChat proxy with mic button in one command. Push-to-Talk (hold to speak) and Toggle mode with keyboard shortcuts (Ctrl+Space PTT, Ctrl+Shift+M continuous recording). Real-time VU meter, localized UI (English, German, Chinese), interactive language selection during install. No recurring API costs, runs fully local after initial model download (~1.5 GB). Combines faster-whisper-local-service and webchat-voice-proxy. Keywords: voice input, microphone, WebChat, speech to text, STT, local transcription, whisper, full stack, one-click, voice button, push-to-talk, PTT, keyboard shortcut, i18n.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/neldar/webchat-voice-full-stack
WebChat Voice Full Stack
Meta-installer that orchestrates two standalone skills in the correct order:
- faster-whisper-local-service: local STT backend (HTTP on 127.0.0.1:18790)
- webchat-voice-proxy: HTTPS/WSS proxy + mic button for WebChat Control UI
Prerequisites
Both skills must be installed before running this meta-installer:
clawdhub install faster-whisper-local-service
clawdhub install webchat-voice-proxy
Additionally required on the system:
- Python 3.10+
- gst-launch-1.0 (GStreamer, from OS packages)
- Internet access on first run (model download ~1.5 GB for medium)
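The system prerequisites above can be checked before deploying. The following is a minimal sketch and not part of the skill: the function names version_at_least and preflight are illustrative, and it assumes the interpreter is available as python3 on PATH.

```shell
# Preflight sketch (hypothetical helpers, not shipped with the skill).

# Succeeds when version "$1" (MAJOR.MINOR[.PATCH]) is at least "$2".
version_at_least() {
  local have_major="${1%%.*}" have_rest="${1#*.}"
  local want_major="${2%%.*}" want_rest="${2#*.}"
  local have_minor="${have_rest%%.*}" want_minor="${want_rest%%.*}"
  [ "$have_major" -gt "$want_major" ] && return 0
  [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]
}

# Reports every missing prerequisite and returns non-zero if any is missing.
preflight() {
  local ok=0
  local pyver
  if command -v python3 >/dev/null 2>&1; then
    pyver="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
    if ! version_at_least "$pyver" 3.10; then
      echo "Python $pyver found, need 3.10+" >&2
      ok=1
    fi
  else
    echo "python3 not found on PATH" >&2
    ok=1
  fi
  if ! command -v gst-launch-1.0 >/dev/null 2>&1; then
    echo "gst-launch-1.0 not found (install GStreamer tools from OS packages)" >&2
    ok=1
  fi
  return "$ok"
}
```

After sourcing these functions, `preflight && bash scripts/deploy.sh` would skip the deploy when a prerequisite is missing.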
Deploy
bash scripts/deploy.sh
Optional overrides (passed through to downstream scripts):
VOICE_HOST=10.0.0.42 VOICE_HTTPS_PORT=8443 TRANSCRIBE_PORT=18790 WHISPER_LANGUAGE=auto bash scripts/deploy.sh
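Variables set inline like this apply only to that one command and are inherited by the scripts it starts. As a rough illustration of how a downstream script might read them, here is a hypothetical stand-in (show_settings is not a real script in either sub-skill). The 18790 fallback is the backend port mentioned earlier; the other real defaults are not documented here, so the sketch prints "unset" for them.

```shell
# Hypothetical stand-in for a downstream script reading its settings.
# Only TRANSCRIBE_PORT has a fallback here (18790, the backend port named
# in this README); the other defaults live in the sub-skills.
show_settings() {
  printf 'host=%s https_port=%s transcribe_port=%s language=%s\n' \
    "${VOICE_HOST:-unset}" "${VOICE_HTTPS_PORT:-unset}" \
    "${TRANSCRIBE_PORT:-18790}" "${WHISPER_LANGUAGE:-unset}"
}

# Alternative to the inline form: export once, then deploy from the same
# shell session. Exported values persist until the session ends.
# export VOICE_HOST=10.0.0.42 VOICE_HTTPS_PORT=8443
# bash scripts/deploy.sh
```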
What this does (via downstream scripts)
This skill does not contain deployment logic itself. It calls deploy.sh from each sub-skill. Here is what those scripts do:
faster-whisper-local-service deploys:
- Creates Python venv at $WORKSPACE/.venv-faster-whisper/
- Installs faster-whisper==1.1.1 via pip
- Writes transcribe-server.py to $WORKSPACE/voice-input/
- Creates + enables systemd user service openclaw-transcribe.service
- Downloads model weights from Hugging Face on first run (~1.5 GB for medium)
webchat-voice-proxy deploys:
- Copies voice-input.js and https-server.py to $WORKSPACE/voice-input/
- Injects <script> tag into Control UI index.html
- Adds HTTPS origin to gateway.controlUi.allowedOrigins in openclaw.json
- Creates + enables systemd user service openclaw-voice-https.service
- Installs gateway startup hook at ~/.openclaw/hooks/voice-input-inject/
- Auto-generates self-signed TLS cert on first run
For full details, security notes, and uninstall instructions, see each skill's SKILL.md.
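The call order described in this section (backend first, proxy second) could be sketched roughly as follows. This is not the actual scripts/deploy.sh: the sub-skill script paths are assumed from the skills/<name>/scripts/ layout used in the uninstall instructions, and DRY_RUN is a hypothetical switch added here so the plan can be inspected without running anything.

```shell
# Orchestration sketch. Paths and DRY_RUN are assumptions, not the
# meta-installer's real implementation.
SKILLS_DIR="${SKILLS_DIR:-$HOME/.openclaw/workspace/skills}"

deploy_all() {
  local skill script
  # Backend first so the proxy has a transcription service to talk to.
  for skill in faster-whisper-local-service webchat-voice-proxy; do
    script="$SKILLS_DIR/$skill/scripts/deploy.sh"
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "would run: $script"
      continue
    fi
    # Stop at the first failure so the proxy is never deployed
    # on top of a broken backend.
    bash "$script" || { echo "failed: $script" >&2; return 1; }
  done
}
```

Running `DRY_RUN=1` with `deploy_all` prints the two script paths in order, which is a quick way to confirm SKILLS_DIR points where you expect.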
Verify
bash scripts/status.sh
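If you want to check the two systemd user services by hand, a sketch along these lines may help. It is not what status.sh does internally (that script is not shown here); the unit names come from the deploy description, and SYSTEMCTL is a hypothetical override so the function can be tried on a machine without systemd.

```shell
# Manual status sketch. SYSTEMCTL is an illustrative override, not a
# variable the skills read.
check_units() {
  local unit
  local status=0
  for unit in openclaw-transcribe.service openclaw-voice-https.service; do
    if "${SYSTEMCTL:-systemctl}" --user is-active --quiet "$unit"; then
      echo "active: $unit"
    else
      echo "NOT active: $unit"
      status=1
    fi
  done
  return "$status"
}
```

For a unit reported as not active, `journalctl --user -u <unit>` is usually the next place to look.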
Uninstall
Uninstall each skill separately:
# Proxy (service, hook, UI injection, gateway config)
bash skills/webchat-voice-proxy/scripts/uninstall.sh
# Backend (service, venv)
systemctl --user stop openclaw-transcribe.service
systemctl --user disable openclaw-transcribe.service
rm -f ~/.config/systemd/user/openclaw-transcribe.service
systemctl --user daemon-reload
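The backend commands above stop and remove the systemd unit, but they do not appear to touch the files created at deploy time. A non-destructive sketch for listing what is left, assuming the default WORKSPACE location and the paths named in the deploy description (the function names are illustrative):

```shell
# Leftover check sketch: only reports paths, never deletes anything.
WORKSPACE="${WORKSPACE:-$HOME/.openclaw/workspace}"

# Paths the backend deploy is described as creating.
backend_paths() {
  printf '%s\n' \
    "$WORKSPACE/.venv-faster-whisper" \
    "$WORKSPACE/voice-input/transcribe-server.py"
}

# Print each backend path that still exists on disk.
report_leftovers() {
  backend_paths | while IFS= read -r item; do
    if [ -e "$item" ]; then
      echo "still present: $item"
    fi
  done
}
```

Anything reported can then be removed by hand after reviewing the path.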
Notes
- This meta-skill is a convenience wrapper. All actual logic lives in the two sub-skills.
- Review both sub-skills' scripts before running if you haven't already.
- The WORKSPACE and SKILLS_DIR paths are configurable via environment variables (default: ~/.openclaw/workspace and ~/.openclaw/workspace/skills).
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-neldar-webchat-voice-full-stack": {
      "enabled": true,
      "auto_update": true
    }
  }
}