Official Verified productivity Safety 3/5

mac-use

Control macOS GUI apps visually — take screenshots, click, scroll, type. Use when the user asks to interact with any Mac desktop application's graphical interface.

Why use this skill?

The mac-use skill lets OpenClaw automate any macOS app through OCR-based GUI interaction: it can take screenshots, click, scroll, and type in the target application.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/kekejun/mac-use

What This Skill Does

The mac-use skill gives OpenClaw visual control over any macOS graphical application. It uses the Apple Vision framework for Optical Character Recognition (OCR) to make the text elements of an application's interface addressable. When invoked, it takes a high-resolution screenshot, overlays a unique numerical ID on every text element it detects, and returns a JSON mapping of those elements. OpenClaw can then click an element by number, type text into fields via the system clipboard, scroll through content, and send keyboard shortcuts, much as a person would at the desktop.
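To make the "click by number" idea concrete, here is a minimal Python sketch of how a caller might consume such a JSON mapping. The field names (id, text, x, y, width, height) and the sample data are assumptions for illustration only; the skill's actual schema is not documented on this page and may differ.

```python
import json

# Hypothetical element map. The real skill's JSON schema may differ.
SAMPLE_MAP = """
[
  {"id": 1, "text": "File", "x": 10, "y": 4, "width": 40, "height": 20},
  {"id": 2, "text": "Privacy", "x": 200, "y": 120, "width": 80, "height": 24}
]
"""


def load_elements(raw):
    """Parse the JSON mapping and index the elements by numeric ID."""
    return {item["id"]: item for item in json.loads(raw)}


def center_of(element):
    """Return the (x, y) center of an element's bounding box."""
    return (
        element["x"] + element["width"] // 2,
        element["y"] + element["height"] // 2,
    )


def find_by_text(elements, label):
    """Return the sorted IDs of elements whose text equals label (case-insensitive)."""
    wanted = label.strip().lower()
    return sorted(
        eid for eid, el in elements.items()
        if el["text"].strip().lower() == wanted
    )


elements = load_elements(SAMPLE_MAP)
target_ids = find_by_text(elements, "privacy")
click_point = center_of(elements[target_ids[0]])
print(target_ids, click_point)
```

Looking an element up by its visible text and then acting on its ID keeps the caller independent of pixel positions, which shift whenever the window moves or resizes.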

Installation

To install mac-use, run clawhub install openclaw/skills/skills/kekejun/mac-use. The skill needs python3 installed via Homebrew, and it must be granted permission to use the Accessibility API, which is required for controlling GUI elements. Then install the Python dependencies by running pip3 install --break-system-packages -r {baseDir}/requirements.txt inside the skill directory.
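Before installing, it can help to confirm that the prerequisites are in place. The following is a rough preflight sketch using only the Python standard library; the function names preflight and problems are hypothetical and not part of the skill. It does not check Accessibility permission, which cannot be queried from the standard library and has to be verified in macOS settings.

```python
import platform
import shutil
import sys


def preflight():
    """Collect basic facts about the host before installing the skill."""
    return {
        "is_macos": platform.system() == "Darwin",
        "python3_path": shutil.which("python3"),
        "pip3_path": shutil.which("pip3"),
        "python_version": "%d.%d" % sys.version_info[:2],
    }


def problems(report):
    """List human-readable issues found in a preflight report."""
    issues = []
    if not report["is_macos"]:
        issues.append("not running on macOS")
    if report["python3_path"] is None:
        issues.append("python3 not found on PATH")
    if report["pip3_path"] is None:
        issues.append("pip3 not found on PATH")
    return issues


report = preflight()
for issue in problems(report):
    print("preflight:", issue)
```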

Use Cases

Use mac-use when your workflow involves legacy desktop software, internal enterprise tools that lack an API, or any GUI application that requires human-like navigation. It suits repetitive administrative tasks such as entering data into desktop spreadsheets, filling out complex forms in non-web applications, or configuring application settings. It also covers opening, closing, and switching between windows, and interacting with native macOS menus or dialog boxes that cannot be reached from the CLI.

Example Prompts

  1. "Open Microsoft Excel and type 'Project Quarterly Report' into the first cell, then save the file."
  2. "Find the preferences menu in my open Safari window and click on the 'Privacy' tab."
  3. "Scroll down the current Finder window until you see the 'Downloads' folder, then double-click it to open."

Tips & Limitations

  • OCR Dependency: This skill relies on Apple Vision. For best results, make sure the target window is in focus and not hidden behind other apps.
  • Fallback Strategy: When an element has no text (an icon or a custom button, for example), use the click command with X/Y coordinates based on the canvas origin.
  • Precision: After each interaction, re-run the screenshot command to confirm that the UI state changed as expected.
  • Performance: Rapid sequences of interactions may need sleep commands so the UI can catch up with execution.
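The verify-after-acting and sleep tips can be combined into one small loop. The sketch below is illustrative only: action and capture are hypothetical callables standing in for the skill's click/type/scroll and screenshot commands, whose real invocation is not shown on this page. A simulated UI state is used so the example runs anywhere.

```python
import time


def act_and_verify(action, capture, settle_seconds=0.5, retries=3, sleep=time.sleep):
    """Run action, then re-capture until the observed UI state differs.

    action and capture are hypothetical stand-ins for the skill's
    interaction and screenshot commands. Returns (changed, last_state).
    """
    before = capture()
    action()
    for _ in range(retries):
        sleep(settle_seconds)  # give the UI time to catch up
        after = capture()
        if after != before:
            return True, after
    return False, before


# Simulated UI: a dict plays the role of what a screenshot would reveal.
ui = {"active_tab": "General"}

changed, state = act_and_verify(
    action=lambda: ui.update(active_tab="Privacy"),
    capture=lambda: dict(ui),
    settle_seconds=0,  # no real UI here, so no need to wait
)
print(changed, state)
```

Passing the sleep function as a parameter keeps the loop easy to test and lets callers tune the delay per application.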

Metadata

Author: @kekejun
Stars: 1776
Views: 1
Updated: 2026-03-02

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-kekejun-mac-use": {
      "enabled": true,
      "auto_update": true
    }
  }
}
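If clawhub.json already lists other plugins, the entry needs to be merged in rather than pasted over the file. Here is a minimal sketch of such a merge, assuming the file keeps all plugins under a top-level "plugins" object as shown above; enable_plugin and "some-other-plugin" are hypothetical names used only for illustration.

```python
import json

# The entry from the snippet above.
PLUGIN_ENTRY = {
    "official-kekejun-mac-use": {"enabled": True, "auto_update": True}
}


def enable_plugin(config, entry):
    """Return a copy of config with entry merged under "plugins"."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins.update(entry)
    merged["plugins"] = plugins
    return merged


# Hypothetical existing configuration with one unrelated plugin.
existing = json.loads('{"plugins": {"some-other-plugin": {"enabled": true}}}')
updated = enable_plugin(existing, PLUGIN_ENTRY)
print(json.dumps(updated, indent=2))
```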

Tags (AI)

#mac-automation #gui-control #ocr #desktop-automation #accessibility
Safety Score: 3/5

Flags: file-write, file-read, code-execution