mac-use
Control macOS GUI apps visually — take screenshots, click, scroll, type. Use when the user asks to interact with any Mac desktop application's graphical interface.
Why use this skill?
Automate any macOS app with the mac-use skill for OpenClaw. Features OCR-based GUI interaction, click, scroll, and type capabilities for seamless automation.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/kekejun/mac-use
What This Skill Does
The mac-use skill provides OpenClaw with native, visual control over any macOS graphical application. By leveraging the Apple Vision framework for Optical Character Recognition (OCR), this skill transforms static application interfaces into interactive, addressable environments. When initiated, it generates a high-resolution screenshot, overlays unique numerical IDs on every text-based element, and provides a JSON mapping of these elements. This enables precise actions like clicking by number, typing text into fields via the system clipboard, scrolling through content, and executing keyboard shortcuts, mimicking human interaction with the desktop environment.
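As a rough illustration of how a numbered element mapping like the one described above could be consumed, the sketch below loads a small JSON list and resolves an element ID to a click point. The field names (id, text, x, y, w, h) and the sample data are assumptions made for this example; the actual JSON schema emitted by mac-use is not documented here and may differ.

```python
import json

# Hypothetical element map. The real mac-use output may use different
# field names and structure; this layout is assumed for illustration.
SAMPLE_MAP = """
[
  {"id": 1, "text": "File", "x": 10, "y": 4, "w": 30, "h": 16},
  {"id": 2, "text": "Privacy", "x": 220, "y": 88, "w": 64, "h": 20}
]
"""


def load_elements(raw):
    """Parse the JSON list and index the elements by their numeric ID."""
    return {element["id"]: element for element in json.loads(raw)}


def center_of(element):
    """Return the integer center point of an element's bounding box."""
    return (
        element["x"] + element["w"] // 2,
        element["y"] + element["h"] // 2,
    )


def find_by_text(elements, label):
    """Return the IDs of elements whose text matches label, ignoring case."""
    wanted = label.strip().lower()
    return [
        element_id
        for element_id, element in sorted(elements.items())
        if element["text"].strip().lower() == wanted
    ]


elements = load_elements(SAMPLE_MAP)
privacy_ids = find_by_text(elements, "privacy")
click_point = center_of(elements[privacy_ids[0]])
```

Indexing by ID keeps "click by number" a single dictionary lookup, while the text search covers prompts that name a label rather than a number.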
Installation
To integrate mac-use, run the following command within your environment: clawhub install openclaw/skills/skills/kekejun/mac-use. Ensure your environment has python3 installed via Homebrew and that the system has sufficient permissions to access the Accessibility API, which is required for controlling GUI elements. You must also install the required Python dependencies by running pip3 install --break-system-packages -r {baseDir}/requirements.txt inside the skill directory.
Use Cases
Use mac-use when your workflow involves legacy desktop software, internal enterprise tools that lack an API, or any GUI application that requires human-like navigation. It suits repetitive administrative tasks such as entering data into desktop spreadsheets, filling out complex forms in non-web applications, or configuring application settings. It is also the appropriate choice when you need to open, close, or switch between windows, or to interact with native macOS menus and dialog boxes that are inaccessible via CLI.
Example Prompts
- "Open Microsoft Excel and type 'Project Quarterly Report' into the first cell, then save the file."
- "Find the preferences menu in my open Safari window and click on the 'Privacy' tab."
- "Scroll down the current Finder window until you see the 'Downloads' folder, then double-click it to open."
Tips & Limitations
- OCR Dependency: This skill relies on Apple Vision; for best results, ensure the target window is in focus and not hidden behind other apps.
- Fallback Strategy: When elements lack text (such as specific iconography or custom buttons), use the click command with X/Y coordinates based on the canvas origin.
- Precision: Always verify actions by re-triggering the screenshot command after an interaction to confirm the UI state changed as expected.
- Performance: High-frequency interaction may require sleep commands to allow the UI to catch up with execution.
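The verify-and-wait advice in these tips can be sketched as a small retry loop. This is a minimal sketch, not part of mac-use: the perform, capture, and changed callables are hypothetical stand-ins for whatever issues the click, takes the screenshot, and compares two UI states in your setup.

```python
import time


def act_and_verify(perform, capture, changed, retries=3,
                   settle_seconds=0.5, sleep=time.sleep):
    """Run an action, wait for the UI to settle, and confirm it changed.

    perform, capture, and changed are hypothetical callables supplied by
    the caller; they are not mac-use APIs. Returns the attempt number on
    which a change was observed, or None if no change was seen.
    """
    before = capture()  # UI state before acting
    for attempt in range(1, retries + 1):
        perform()                # e.g. click an element
        sleep(settle_seconds)    # let the UI catch up
        after = capture()        # e.g. take a fresh screenshot
        if changed(before, after):
            return attempt
    return None
```

Passing sleep in as a parameter keeps the pause configurable and lets the loop be exercised without real delays.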
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-kekejun-mac-use": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-write, file-read, code-execution