Official | Verified | media | Safety 3/5

screen-narrator

Live narration of your macOS screen activity with Gemini vision + ElevenLabs speech.

Why use this skill?

Turn your macOS screen activity into real-time audio narration with Gemini Vision and ElevenLabs, with customizable styles for both productivity and fun.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/buddyh/narrator

What This Skill Does

The screen-narrator skill provides real-time spoken commentary on your macOS desktop. It uses Gemini Vision to analyze what is on screen and ElevenLabs text-to-speech to voice the result, turning screen activity into an audio narrative. Whether you are reviewing work, monitoring logs, or exploring creative use cases, you can choose from a range of narrative styles, from professional sports commentary to humorous noir or ASMR. The skill is controlled through JSON command files on the local filesystem, which let you change the narrative style, pause the stream, and adjust profanity settings while it runs.

Installation

Installation requires access to the source repository and a local Python environment. Navigate to your source directory and follow the standard setup process:

  1. Ensure you are on a macOS system.
  2. Navigate to your local checkout of the narrator source (on the author's machine this is /Users/buddy/narrator).
  3. Create and activate a virtual environment: python3 -m venv .venv, then source .venv/bin/activate.
  4. Install dependencies: pip install -r requirements.txt.
  5. Set the GEMINI_API_KEY and ELEVENLABS_API_KEY environment variables.
  6. For OpenClaw users, install via: clawhub install openclaw/skills/skills/buddyh/narrator.
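
The prerequisites in the steps above (macOS, plus the two API keys) can be checked before launching anything. The following is a minimal sketch, not part of the skill itself: the function name check_prerequisites is hypothetical, and only the platform requirement and the two environment variable names come from the steps above.

```python
import os
import sys

# Variable names taken from the installation steps above.
REQUIRED_ENV_VARS = ("GEMINI_API_KEY", "ELEVENLABS_API_KEY")


def check_prerequisites(platform=None, env=None):
    """Return a list of human-readable problems; an empty list means ready.

    `platform` and `env` can be injected so the check is testable on any OS.
    This helper is illustrative and is not shipped with the skill.
    """
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env

    problems = []
    if platform != "darwin":  # sys.platform is "darwin" on macOS
        problems.append(f"macOS required, found platform {platform!r}")
    for name in REQUIRED_ENV_VARS:
        # Treat unset, empty, and whitespace-only values as missing.
        if not env.get(name, "").strip():
            problems.append(f"environment variable {name} is not set")
    return problems
```

Calling check_prerequisites() with no arguments inspects the current machine; printing each returned string gives a quick checklist of what is still missing.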

Use Cases

This skill is designed for power users and creatives. Common use cases include:

  • Accessibility & Monitoring: Get audio alerts or summaries of dashboard changes when your eyes are off the screen.
  • Content Creation: Use the 'horror' or 'reality_tv' styles to create entertaining commentary for live-streamed desktop demonstrations or tutorials.
  • Productivity & Focus: Use the 'asmr' style to create a unique auditory backdrop while processing tasks.
  • Debugging: Monitor system changes with an active, descriptive 'sports' style narrator providing play-by-play updates on UI state transitions.

Example Prompts

  1. "Start narrating my screen activity in horror style immediately."
  2. "Switch the narrator to ASMR mode and set the profanity level to low."
  3. "Pause the screen narration and give me a status update on the current session."

Tips & Limitations

  • macOS Exclusive: This skill relies on native macOS screen capture APIs; it will not function on Linux or Windows.
  • Performance: Screen capture and model inference can be CPU-intensive. Make sure you have spare resources, especially when capturing at high resolution.
  • Control File: Commands are passed through /tmp/narrator-ctl.json. If commands are slow to take effect, check that file for locks or permission problems.
  • Cost: API usage costs apply for both Gemini Vision (image analysis) and ElevenLabs (TTS generation). Monitor your usage patterns to manage expenses.
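
One way to avoid a reader seeing a half-written control file is to write the JSON to a temporary file and rename it into place. The sketch below assumes only the path /tmp/narrator-ctl.json from the tips above. The helper name write_control is hypothetical, and the command keys the narrator actually accepts are not documented on this page, so any keys you pass are your own assumption to verify against the skill's source.

```python
import json
import os
import tempfile

CONTROL_PATH = "/tmp/narrator-ctl.json"  # path named in the tips above


def write_control(command, path=CONTROL_PATH):
    """Atomically replace the control file with `command` serialized as JSON.

    Illustrative helper, not part of the skill. The schema of `command`
    is NOT defined here; check the skill's source for the real keys.
    """
    directory = os.path.dirname(path) or "."
    # The temp file lives in the same directory so the final rename stays
    # on one filesystem, which is what makes os.replace atomic on POSIX.
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=".narrator-ctl-", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(command, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before renaming
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

For example, write_control({"style": "asmr"}) would replace the control file in one step; the "style" key here is a guess based on the style names listed on this page, not a confirmed field. Note that mkstemp creates the file readable only by the current user, which fits a narrator process running under the same account.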

Metadata

Author: @buddyh
Stars: 1865
Views: 1
Updated: 2026-03-03
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-buddyh-narrator": {
      "enabled": true,
      "auto_update": true
    }
  }
}
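
If your clawhub.json already lists other plugins, pasting the snippet over the file would drop them. The sketch below shows one way to merge the entry into an existing configuration that has already been loaded as a dictionary. The helper enable_plugin is hypothetical, and the page does not say where clawhub.json lives or how ClawHub merges settings, so treat this as an illustration of the merge only.

```python
import copy

PLUGIN_ID = "official-buddyh-narrator"  # key from the snippet above


def enable_plugin(config, plugin_id=PLUGIN_ID, auto_update=True):
    """Return a copy of `config` with the plugin entry enabled.

    Other plugins and any extra keys on an existing entry are preserved.
    Illustrative helper; not part of ClawHub.
    """
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    plugins = updated.setdefault("plugins", {})
    entry = plugins.setdefault(plugin_id, {})
    entry["enabled"] = True
    entry["auto_update"] = auto_update
    return updated
```

Load the file with json.load, pass the result through enable_plugin, and write it back with json.dump to keep existing entries intact.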

Tags (AI)

#mac-automation #screen-reader #ai-narrator #tts #gemini
Safety Score: 3/5

Flags: file-write, file-read, external-api, code-execution