faster-whisper-gpu
High-performance local speech-to-text transcription using Faster Whisper with NVIDIA GPU acceleration. Transcribe audio files locally without sending data to external services.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/felipeoff/faster-whisper-gpu
🎙️ Faster Whisper GPU
✨ Features
- 🚀 GPU Accelerated: Uses NVIDIA CUDA for blazing-fast transcription
- 🔒 100% Local: No data leaves your machine. Complete privacy.
- 💰 Free Forever: No API costs. Run unlimited transcriptions.
- 🌍 Multilingual: Supports 99 languages with automatic detection
- 📁 Multiple Formats: Input: MP3, WAV, FLAC, OGG, M4A. Output: TXT, SRT, VTT, JSON
- 🎯 Multiple Models: From tiny (fast) to large-v3 (most accurate)
- 🎬 Subtitle Generation: Create SRT files with word-level timestamps
📋 Requirements
Hardware
- NVIDIA GPU with CUDA support (recommended: 4GB+ VRAM)
- Or CPU-only mode (slower but works on any machine)
Software
- Python 3.8+
- NVIDIA drivers (for GPU support)
- CUDA Toolkit 11.8+ or 12.x
🚀 Quick Start
Installation
# Install dependencies
pip install faster-whisper torch
# Verify GPU is available
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
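The availability check above can feed a device choice. The following is a minimal sketch, not the skill's actual code: `pick_device` and `detect_cuda` are hypothetical helper names, and falling back to `int8` on CPU is an assumption (the skill only documents `float16` as its default precision).

```python
from typing import Tuple


def pick_device(cuda_available: bool) -> Tuple[str, str]:
    """Return a (device, compute_type) pair for the given CUDA availability."""
    if cuda_available:
        # float16 is the documented default precision.
        return "cuda", "float16"
    # Assumption: int8 is used here as a lighter-weight CPU fallback.
    return "cpu", "int8"


def detect_cuda() -> bool:
    """Report whether PyTorch can see a CUDA device; False if torch is missing."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    device, compute_type = pick_device(detect_cuda())
    print(f"device={device} compute_type={compute_type}")
```

Keeping the decision in a pure function separate from the hardware probe makes the fallback logic easy to check on a machine without a GPU.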
Basic Usage
# Transcribe an audio file (auto-detects GPU)
python transcribe.py audio.mp3
# Specify language explicitly
python transcribe.py audio.mp3 --language pt
# Output as SRT subtitles
python transcribe.py audio.mp3 --format srt --output subtitles.srt
# Use larger model for better accuracy
python transcribe.py audio.mp3 --model large-v3
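For readers curious about what a script like transcribe.py might do internally, here is a minimal sketch built on the faster-whisper Python API (`WhisperModel` and its `transcribe` method). The helper names `join_segments` and `transcribe_file` are hypothetical, and the skill's own script may be organized quite differently.

```python
from types import SimpleNamespace
from typing import Iterable, Optional


def join_segments(segments: Iterable) -> str:
    """Join the .text of each segment into a transcript, one segment per line."""
    return "\n".join(seg.text.strip() for seg in segments)


def transcribe_file(path: str, model_size: str = "base",
                    language: Optional[str] = None) -> str:
    """Transcribe one audio file to plain text (hypothetical helper)."""
    # Imported lazily so this module loads even where faster-whisper is absent.
    from faster_whisper import WhisperModel

    # device="auto" lets the library choose CUDA when it is usable.
    model = WhisperModel(model_size, device="auto", compute_type="default")
    # language=None asks the model to detect the spoken language.
    segments, _info = model.transcribe(path, language=language)
    # segments is a generator; decoding happens as it is consumed.
    return join_segments(segments)


if __name__ == "__main__":
    # Stand-in segments so the joining step can be tried without audio.
    fake = [SimpleNamespace(text=" Hello "), SimpleNamespace(text=" world.")]
    print(join_segments(fake))
```

Calling `transcribe_file("audio.mp3", language="pt")` would correspond loosely to the `--language pt` invocation above, assuming faster-whisper is installed.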
🔧 Advanced Usage
Command Line Options
python transcribe.py <audio_file> [options]
Options:
  --model {tiny,base,small,medium,large-v1,large-v2,large-v3}
      Model size to use (default: base)
  --language LANG
      Language code (e.g., 'pt', 'en', 'es'). Auto-detect if not specified.
  --format {txt,srt,json,vtt}
      Output format (default: txt)
  --output FILE
      Output file path (default: stdout)
  --device {cuda,cpu}
      Device to use (default: cuda if available)
  --compute_type {int8,int8_float16,int16,float16,float32}
      Computation precision (default: float16)
  --task {transcribe,translate}
      Task: transcribe or translate to English (default: transcribe)
  --vad_filter
      Enable voice activity detection filter
  --vad_parameters MIN_DURATION_ON,MIN_DURATION_OFF
      VAD parameters as comma-separated values
  --condition_on_previous_text
      Condition on previous text (default: True)
  --initial_prompt PROMPT
      Initial prompt to guide transcription
  --word_timestamps
      Include word-level timestamps (for SRT/JSON)
  --hotwords WORDS
      Comma-separated hotwords to boost recognition
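As an illustration of how a subset of these options could be declared, here is a small `argparse` sketch. It is an assumption about the shape of the interface, not the skill's real parser: `build_parser` is a hypothetical name, and the VAD, prompt, and conditioning options are left out for brevity.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring part of the documented option list."""
    parser = argparse.ArgumentParser(prog="transcribe.py")
    parser.add_argument("audio_file")
    parser.add_argument(
        "--model", default="base",
        choices=["tiny", "base", "small", "medium",
                 "large-v1", "large-v2", "large-v3"])
    # None means "auto-detect the language".
    parser.add_argument("--language", default=None)
    parser.add_argument("--format", default="txt",
                        choices=["txt", "srt", "json", "vtt"])
    # None means "write to stdout".
    parser.add_argument("--output", default=None)
    # None means "use cuda if available"; resolved after parsing.
    parser.add_argument("--device", default=None, choices=["cuda", "cpu"])
    parser.add_argument(
        "--compute_type", default="float16",
        choices=["int8", "int8_float16", "int16", "float16", "float32"])
    parser.add_argument("--task", default="transcribe",
                        choices=["transcribe", "translate"])
    parser.add_argument("--vad_filter", action="store_true")
    parser.add_argument("--word_timestamps", action="store_true")
    parser.add_argument("--hotwords", default=None)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Using `choices` makes the parser reject unknown model names or formats up front, before any model is loaded.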
Examples
Portuguese Transcription with SRT Output
python transcribe.py meeting.mp3 --language pt --format srt --output meeting.srt
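To make the SRT output concrete, the sketch below renders timed segments as SRT text. It assumes segments arrive as `(start_seconds, end_seconds, text)` tuples; `srt_timestamp` and `to_srt` are hypothetical names, and the skill's own writer may differ in details such as line wrapping.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0.0, 2.5, " Bom dia. "), (2.5, 5.0, " Vamos começar. ")]))
```

Note that SRT uses a comma before the milliseconds, which is the main difference from WebVTT timestamps.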
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-felipeoff-faster-whisper-gpu": {
      "enabled": true,
      "auto_update": true
    }
  }
}
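If clawhub.json already contains other settings, the plugin entry shown earlier could be merged in rather than pasted over them. This is a minimal sketch under assumptions: it treats clawhub.json as ordinary JSON with a top-level `plugins` object (as in the fragment), and `enable_plugin` and `update_file` are hypothetical helpers, not part of any clawhub tooling.

```python
import json
from pathlib import Path

PLUGIN_KEY = "official-felipeoff-faster-whisper-gpu"


def enable_plugin(config: dict, key: str = PLUGIN_KEY) -> dict:
    """Return a copy of config with the plugin enabled, other keys preserved."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[key] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


def update_file(path: str = "clawhub.json") -> None:
    """Read the config file if present, enable the plugin, write it back."""
    file = Path(path)
    config = json.loads(file.read_text(encoding="utf-8")) if file.exists() else {}
    file.write_text(json.dumps(enable_plugin(config), indent=2) + "\n",
                    encoding="utf-8")


if __name__ == "__main__":
    print(json.dumps(enable_plugin({}), indent=2))
```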