Official Verified

augent

The audio & video layer for agents. 22 local MCP tools. No cloud, no API keys.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/augentdevs/augent

Augent — Audio & Video Intelligence for AI Agents

Augent is an MCP server that gives your agent 22 tools for audio and video intelligence. Download from 1000+ sites via yt-dlp and aria2c, transcribe in 99 languages via faster-whisper, search by keyword or meaning via sentence-transformers, take notes, identify speakers via pyannote-audio, detect chapters, separate audio via Demucs v4, export clips, extract visual frames, record X/Twitter Spaces (requires user-configured auth token in ~/.augent/auth.json), and generate speech via Kokoro TTS. All processing runs locally. Downloads are saved to ~/Downloads/, notes and clips to ~/Desktop/, transcription memory to ~/.augent/memory/.

Config

{
  "mcpServers": {
    "augent": {
      "command": "augent-mcp"
    }
  }
}

If augent-mcp is not in PATH, use python3 -m augent.mcp as the command instead.
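
The PATH fallback described above can be automated when the config is generated. The following is a minimal sketch, not part of Augent: the helper name `build_mcp_config` is hypothetical, and splitting the fallback into `command` plus `args` is an assumption based on the common MCP client config shape, which this page does not spell out.

```python
import json
import shutil


def build_mcp_config(which=shutil.which):
    """Return an mcpServers entry for Augent (hypothetical helper).

    `which` is injectable so the lookup can be exercised without Augent installed.
    """
    if which("augent-mcp"):
        server = {"command": "augent-mcp"}
    else:
        # Fallback from the text: run the module with python3 instead.
        # Assumption: the client accepts the usual "command" + "args" split.
        server = {"command": "python3", "args": ["-m", "augent.mcp"]}
    return {"mcpServers": {"augent": server}}


if __name__ == "__main__":
    print(json.dumps(build_mcp_config(), indent=2))
```

Running the script prints a config block that can be pasted into the MCP client's settings file.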

Install

Install with the ClawHub CLI command above, or with uv: run uv tool install augent for the base package, or uv tool install "augent[all]" for all features. FFmpeg is required for audio processing.
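
Before installing, it can help to confirm that the external executables named in this section are on PATH. This is a small sketch under stated assumptions: `missing_prerequisites` is a hypothetical helper, and it only checks for the `ffmpeg` and `uv` binaries by name, not their versions.

```python
import shutil


def missing_prerequisites(which=shutil.which):
    """List executables from this section that are not on PATH (hypothetical helper)."""
    # "ffmpeg" is required per the text; "uv" is only needed for the uv install route.
    required = ["ffmpeg", "uv"]
    return [name for name in required if which(name) is None]


if __name__ == "__main__":
    missing = missing_prerequisites()
    print("missing:", ", ".join(missing) if missing else "none")
```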

Tools

Augent exposes 22 MCP tools:

Core

| Tool | Description |
| --- | --- |
| download_audio | Download audio from video URLs at maximum speed. Supports YouTube, Vimeo, TikTok, Twitter/X, SoundCloud, and 1000+ sites. Uses aria2c multi-connection + concurrent fragments. |
| transcribe_audio | Full transcription of any audio file with per-segment timestamps. Returns text, language, duration, and segments. Cached by file hash. |
| search_audio | Search audio for keywords. Returns timestamped matches with context snippets. Supports clip export. |
| deep_search | Semantic search: find moments by meaning, not just keywords. Uses sentence-transformers embeddings. |
| search_memory | Search across ALL stored transcriptions in one query. Keyword or semantic mode. |
| take_notes | All-in-one: download audio from URL, transcribe, and save formatted notes. Supports 5 styles: tldr, notes, highlight, eye-candy, quiz. |
| clip_export | Export a video clip from any URL for a specific time range. Downloads only the requested segment. |
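
The transcribe_audio entry mentions that results are cached by file hash. The sketch below illustrates that general idea only: the choice of SHA-256, the one-JSON-file-per-digest layout, and the names `file_digest` and `cached_transcribe` are all assumptions, since this page does not describe how Augent implements its cache.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's bytes, read in chunks so large audio files stay out of memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def cached_transcribe(audio_path, cache_dir, transcribe):
    """Return a cached result for identical file contents, else compute and store it.

    `transcribe` is any callable taking a path and returning a JSON-serializable dict.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / f"{file_digest(audio_path)}.json"
    if entry.exists():
        return json.loads(entry.read_text(encoding="utf-8"))
    result = transcribe(audio_path)
    entry.write_text(json.dumps(result), encoding="utf-8")
    return result
```

Keying on content rather than filename means a renamed or moved copy of the same audio still hits the cache, while an edited file gets a fresh transcription.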

Analysis

| Tool | Description |
| --- | --- |
| chapters | Auto-detect topic chapters with timestamps using embedding similarity. |
| search_proximity | Find where two keywords appear near each other (e.g., "startup" within 30 words of "funding"). |
| identify_speakers | Speaker diarization: identify who speaks when. No API keys required. |
| separate_audio | Isolate vocals from music/noise using Meta's Demucs v4. Feed clean vocals into transcription. |
| batch_search | Search multiple audio files in parallel. Ideal for podcast libraries or interview collections. |
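
The proximity search described in the table can be illustrated on plain text. This is a rough sketch of the idea, not Augent's implementation: the function below works on word positions in a string rather than on timestamped transcript segments, and its name and parameters are chosen for illustration only.

```python
import re


def search_proximity(text, first, second, max_distance=30):
    """Return (i, j) word-index pairs where `first` and `second` occur
    within `max_distance` words of each other (case-insensitive)."""
    words = [w.lower() for w in re.findall(r"[\w']+", text)]
    first_hits = [i for i, w in enumerate(words) if w == first.lower()]
    second_hits = [j for j, w in enumerate(words) if w == second.lower()]
    return [(i, j) for i in first_hits for j in second_hits if abs(i - j) <= max_distance]


if __name__ == "__main__":
    sample = "The startup raised new funding last year"
    print(search_proximity(sample, "startup", "funding", max_distance=5))
```

In a real transcript, each word index would be mapped back to its segment timestamp so that matches can be reported as moments in the audio.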

Utilities

Metadata

Stars: 4473
Views: 0
Updated: 2026-05-01

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-augentdevs-augent": {
      "enabled": true,
      "auto_update": true
    }
  }
}
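
If clawhub.json already lists other plugins, the entry above needs to be merged in rather than pasted over the file. A minimal sketch, assuming the file is plain JSON with a top-level "plugins" object as shown; `enable_plugin` is a hypothetical helper, not a ClawHub API.

```python
import json


def enable_plugin(config, plugin_id="official-augentdevs-augent"):
    """Return a copy of a clawhub.json-style dict with the plugin entry added."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    # Same settings as the snippet above; existing plugin entries are preserved.
    plugins[plugin_id] = {"enabled": True, "auto_update": True}
    merged["plugins"] = plugins
    return merged


if __name__ == "__main__":
    existing = {"plugins": {"some-other-plugin": {"enabled": True}}}
    print(json.dumps(enable_plugin(existing), indent=2))
```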
Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.