xdotool-control
Mouse and keyboard automation using xdotool. Use when clicking Chrome extension icons, typing into GUI apps, switching browser tabs, automating desktop UI, or running screenshot-verify-click loops without a browser relay. Triggers on: 'click extension icon', 'click coordinates', 'type in window', 'switch tab', 'automate mouse', 'screenshot and click', 'xdotool', 'desktop automation', 'GUI automation without relay'.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/jeremysommerfeld8910-cpu/xdotool-control
xdotool-control
Automate mouse, keyboard, and window operations on the Linux desktop. Primary use: clicking Chrome extension icons and interacting with GUI apps when browser CDP isn't connected.
Quick Start
# Find a window
xdotool search --name "Google Chrome"
# Click at screen coordinates
xdotool mousemove 1800 56 click 1
# Type text into focused window
xdotool type "hello world"
# Screenshot current state
scrot /tmp/snap.png
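The commands above assume `xdotool` and `scrot` are installed and that an X11 display is reachable. A minimal pre-flight sketch follows; the function name `check_deps` is hypothetical and not part of the skill.

```shell
# Hypothetical helper (not part of the skill): report any missing commands.
# Returns 0 when every named command is on PATH, 1 otherwise.
check_deps() {
  local missing=0
  local cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A possible use is `check_deps xdotool scrot || exit 1` at the top of a script. Since xdotool talks to an X server, it may also be worth checking that `$DISPLAY` is set before going further.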
Core Patterns
1. Find + Focus + Click
# Find Chrome window, focus it, click at position
WIN=$(xdotool search --name "Google Chrome" | head -1)
xdotool windowactivate --sync "$WIN"
sleep 0.3
xdotool mousemove X Y click 1 # replace X Y with the target screen coordinates
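The find, focus, and click steps can be wrapped in one function that stops early when no window matches. This is a sketch, not the skill's own code: the name `click_in_window` is hypothetical, and unlike the snippet above it uses `mousemove --window`, so the coordinates are relative to the window's top-left corner rather than to the screen.

```shell
# Hypothetical helper: activate the first window matching a name pattern,
# then click at coordinates relative to that window.
click_in_window() {
  local pattern="$1" x="$2" y="$3"
  local win
  win=$(xdotool search --name "$pattern" | head -1)
  if [ -z "$win" ]; then
    echo "no window matches: $pattern" >&2
    return 1
  fi
  xdotool windowactivate --sync "$win"
  sleep 0.3
  # --window makes X Y relative to the window instead of the screen
  xdotool mousemove --window "$win" "$x" "$y" click 1
}
```

For example, `click_in_window "Google Chrome" 1800 56` would click near the toolbar of the first matching Chrome window, assuming the window is at least that wide.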
2. Screenshot → Verify → Click Loop
Use this when you need to click an element but don't know its exact position:
# Arguments: window name pattern, label for your snap files, X and Y coordinates to click
bash ~/.openclaw/workspace/skills/xdotool-control/scripts/snap_verify_click.sh \
"Google Chrome" \
"extension_icon" \
1830 56
Or use the full loop script for unknown positions:
# Arguments: window name pattern, template image to match (ImageMagick compare), max attempts
bash ~/.openclaw/workspace/skills/xdotool-control/scripts/find_and_click.sh \
"Google Chrome" \
/tmp/target_icon.png \
10
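The contents of the two scripts are not shown here, so the following is not a description of what they do. It is a rough sketch of the general screenshot, click, verify idea using only `scrot`, `xdotool`, and `cmp`; the function name `click_and_verify` and the `/tmp` file naming are assumptions.

```shell
# Hypothetical sketch: click, then check that the screen changed by comparing
# a screenshot taken before the click with one taken after it.
click_and_verify() {
  local x="$1" y="$2" label="${3:-snap}"
  local before="/tmp/${label}_before.png"
  local after="/tmp/${label}_after.png"
  rm -f "$before" "$after"   # some scrot versions will not overwrite existing files
  scrot "$before" || return 1
  xdotool mousemove "$x" "$y" click 1
  sleep 0.5                  # give the UI time to react
  scrot "$after" || return 1
  if cmp -s "$before" "$after"; then
    echo "no visible change after click at $x,$y" >&2
    return 1
  fi
  echo "screen changed after click at $x,$y"
}
```

A byte-for-byte comparison of full screenshots is crude: any unrelated change on screen, such as a clock tick, also counts as a change. Cropping to the region of interest or matching a template image would be more robust.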
3. Click Chrome Extension Icon
bash ~/.openclaw/workspace/skills/xdotool-control/scripts/click_extension.sh "OpenClaw"
# or
bash ~/.openclaw/workspace/skills/xdotool-control/scripts/click_extension.sh "Dawn"
This focuses Chrome and clicks the extensions puzzle-piece area, then scans for the named extension.
4. Tab Switching
# Switch to next tab
WIN=$(xdotool search --name "Google Chrome" | head -1)
xdotool windowactivate --sync "$WIN"
xdotool key ctrl+Tab
# Switch to specific tab (1-indexed)
xdotool key ctrl+2 # Tab 2
xdotool key ctrl+3 # Tab 3
# Open new tab
xdotool key ctrl+t
# Type a URL into address bar
xdotool key ctrl+l
sleep 0.2
xdotool type "https://example.com"
xdotool key Return
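Switching to a numbered tab can be wrapped with a small range check. In Chrome, ctrl+1 through ctrl+8 select that tab position, while ctrl+9 jumps to the last tab, so this sketch accepts only 1 to 8. The function name `switch_to_tab` is hypothetical.

```shell
# Hypothetical helper: switch Chrome to tab N, where N is 1-8.
# ctrl+9 means "last tab" in Chrome, so 9 is rejected here.
switch_to_tab() {
  local n="$1"
  local win
  case "$n" in
    [1-8]) ;;
    *) echo "tab index must be 1-8, got: $n" >&2; return 2 ;;
  esac
  win=$(xdotool search --name "Google Chrome" | head -1)
  [ -n "$win" ] || { echo "Chrome window not found" >&2; return 1; }
  xdotool windowactivate --sync "$win"
  xdotool key "ctrl+$n"
}
```

Calling `switch_to_tab 3` would focus Chrome and send ctrl+3; `switch_to_tab 9` returns an error without touching the window.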
5. Type Into Window
WIN=$(xdotool search --name "Terminal" | head -1)
xdotool windowactivate --sync "$WIN"
sleep 0.2
xdotool type --clearmodifiers "command to type here"
xdotool key Return
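When several commands need to be typed in sequence, one option is to feed them line by line from stdin. The sketch below assumes the target window is already focused; the function name `type_lines` and the 20 ms `--delay` value are illustrative choices, not part of the skill.

```shell
# Hypothetical helper: type each line from stdin into the focused window,
# pressing Return after every line. Empty lines just send Return.
type_lines() {
  local line
  while IFS= read -r line || [ -n "$line" ]; do
    if [ -n "$line" ]; then
      # </dev/null keeps xdotool from touching the loop's stdin
      xdotool type --clearmodifiers --delay 20 "$line" </dev/null
    fi
    xdotool key Return </dev/null
  done
}
```

Usage might look like `printf 'cd /tmp\nls\n' | type_lines`. Lines that begin with a dash could be misread as options by `xdotool type`, so this sketch is best kept to input that does not start with `-`.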
6. Approve tmux Prompt (for Clawdy daemon)
SESSION=$(tmux ls | grep claude-session | head -1 | cut -d: -f1)
tmux send-keys -t "$SESSION" "Yes" Enter
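Sending "Yes" blindly can land in the wrong place if the session is not actually waiting on a prompt. A more cautious sketch reads the visible pane first with `tmux capture-pane -p` and only answers when a pattern is present. The function name `approve_if_prompted` and the default pattern `(y/n)` are placeholders; the real prompt text is not given here.

```shell
# Hypothetical helper: send "Yes" only when the pane's visible output contains
# a prompt pattern. The default pattern is a placeholder, not the real prompt.
approve_if_prompted() {
  local prefix="${1:-claude-session}"
  local pattern="${2:-(y/n)}"
  local session
  session=$(tmux ls 2>/dev/null | grep "$prefix" | head -1 | cut -d: -f1)
  [ -n "$session" ] || { echo "no tmux session matching: $prefix" >&2; return 1; }
  if tmux capture-pane -p -t "$session" | grep -qiF -- "$pattern"; then
    tmux send-keys -t "$session" "Yes" Enter
  else
    echo "no prompt found in $session; nothing sent" >&2
    return 2
  fi
}
```

For instance, `approve_if_prompted claude-session "Do you want to proceed"` would answer only if that text is currently visible in the pane.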
Window Management
# List all windows with names
xdotool search --name "" | while read -r wid; do
name=$(xdotool getwindowname "$wid" 2>/dev/null)
[ -n "$name" ] && echo "$wid $name"
done | head -20
# Get window geometry (position + size); WIN_ID is a window ID from xdotool search
xdotool getwindowgeometry "$WIN_ID"
# Raise window to front
xdotool windowraise "$WIN_ID"
# Resize window
xdotool windowsize "$WIN_ID" 1280 800
# Move window
xdotool windowmove "$WIN_ID" 0 0
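Window geometry can also feed back into click coordinates. `xdotool getwindowgeometry --shell` prints `X=`, `Y=`, `WIDTH=`, and `HEIGHT=` assignments that a script can evaluate. The sketch below uses them to compute the center of a window; the function name `window_center` is hypothetical.

```shell
# Hypothetical helper: print the screen coordinates of a window's center.
window_center() {
  local wid="$1"
  local geom X Y WIDTH HEIGHT WINDOW SCREEN
  # --shell emits WINDOW=, X=, Y=, WIDTH=, HEIGHT=, SCREEN= assignments
  geom=$(xdotool getwindowgeometry --shell "$wid") || return 1
  eval "$geom"
  echo "$((X + WIDTH / 2)) $((Y + HEIGHT / 2))"
}
```

One way to use it: `center=$(window_center "$WIN_ID") && xdotool mousemove $center click 1`, where `$center` is left unquoted on purpose so it splits into the X and Y arguments.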
Screenshot Utilities
Paste this into your clawhub.json to enable this plugin.
{
"plugins": {
"official-jeremysommerfeld8910-cpu-xdotool-control": {
"enabled": true,
"auto_update": true
}
}
}