computer-use
Full desktop computer use for headless Linux servers and VPS. Creates a virtual display (Xvfb + XFCE) to control GUI applications without a physical monitor. Screenshots, mouse clicks, keyboard input, scrolling, dragging — all 17 standard actions. Model-agnostic, works with any LLM.
Why use this skill?
Enable full GUI control on your headless Linux servers with the computer-use skill. Simulate mouse clicks and keyboard input in any desktop application, including web browsers.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/bodii88/computer-use-1-0-1
What This Skill Does
The computer-use skill provides OpenClaw agents with full desktop GUI control on headless Linux environments. By utilizing a virtual frame buffer (Xvfb) and the XFCE desktop environment, this skill creates a virtual display (DISPLAY=:99) at 1024x768 resolution. This enables your AI agent to interact with desktop applications exactly like a human user. It supports 17 distinct mouse and keyboard actions, including clicking, dragging, scrolling, text entry, and key combinations. It is the bridge between terminal-based automation and the complex visual world of modern desktop applications.
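The skill's own startup scripts are not shown on this page, so the following is only a minimal sketch of the equivalent manual steps, assuming the packages listed under Installation are present. The display number and resolution come from the description above; the 24-bit color depth, the two-second pause, and the function names are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: bring up a virtual display like the one described above.
# :99 and 1024x768 come from the text; the x24 color depth is an assumption.
DISPLAY_NUM="${DISPLAY_NUM:-:99}"
GEOMETRY="${GEOMETRY:-1024x768x24}"

# Print the Xvfb arguments for the configured display and screen size.
xvfb_args() {
    printf '%s -screen 0 %s\n' "$DISPLAY_NUM" "$GEOMETRY"
}

# Start Xvfb and an XFCE session in the background (hypothetical helper).
start_virtual_display() {
    # Word splitting of the xvfb_args output is intentional here.
    Xvfb $(xvfb_args) &
    export DISPLAY="$DISPLAY_NUM"
    sleep 2          # crude pause so the X server can accept connections
    startxfce4 &
}
```

Calling `start_virtual_display` once per boot should be enough; every later command that sets `DISPLAY=:99` then talks to the same virtual screen.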
Installation
To integrate this skill, first ensure your system has the required X11 dependencies installed:
sudo apt install -y xvfb xfce4 xfce4-terminal xdotool scrot imagemagick dbus-x11 chromium-browser
Once the environment is ready, install the skill package directly through the OpenClaw hub:
clawhub install openclaw/skills/skills/bodii88/computer-use-1-0-1
Finally, set the display variable so the agent can reach the virtual desktop:
export DISPLAY=:99
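Before starting the agent, it can help to confirm that the dependencies actually landed on the PATH. The sketch below is one way to check; `missing_tools` is a hypothetical helper, not part of the skill.

```shell
#!/bin/sh
# Sketch: report which required commands are not installed.

# Print each named command that is not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, `missing_tools Xvfb startxfce4 xdotool scrot convert dbus-launch` prints nothing when everything is available. The mapping from package to command name used here (for instance `imagemagick` providing `convert`) is an assumption that may differ between distributions.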
Use Cases
This skill is suited to automating tasks that lack accessible APIs. Use it to:
1) Browse websites that require complex interaction, authentication, or JavaScript execution that standard scrapers cannot handle.
2) Manage GUI-only administrative software or legacy desktop applications on remote VPS instances.
3) Perform end-to-end testing of web or desktop software by simulating real user clicks and keystrokes.
4) Extract data from visual dashboards or non-exportable UI elements.
Example Prompts
- "Open the Chromium browser, navigate to the dashboard at monitoring.internal, and take a screenshot of the current system health graph."
- "Locate the login field on the desktop app, type my username and password, then press Enter and wait 5 seconds for the home screen to load."
- "Scroll through the current PDF document in the viewer, select all text using a mouse drag operation, and copy it to the clipboard."
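To make the prompts above more concrete, here is a rough sketch of how the login prompt might translate into low-level `xdotool` calls. How the skill actually maps prompts to actions is not documented here. The coordinates, the Tab-to-next-field behavior, and the `run` and `login_sequence` helpers are all hypothetical; in practice the agent would read field positions from a screenshot.

```shell
#!/bin/sh
# Sketch: the login example prompt expressed as xdotool calls.
# Coordinates and credentials are hypothetical placeholders.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Type a username and password, submit, then wait for the next screen.
login_sequence() {
    user="$1"
    pass="$2"
    run xdotool mousemove 512 300 click 1   # focus the (assumed) username field
    run xdotool type --delay 50 "$user"
    run xdotool key Tab                     # assumes Tab moves to the password field
    run xdotool type --delay 50 "$pass"
    run xdotool key Return
    run sleep 5                             # the 5-second wait from the prompt
}
```

Setting `DRY_RUN=1` prints the commands instead of executing them, which is a convenient way to review a sequence before letting it touch a real session.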
Tips & Limitations
- The origin point (0,0) is always the top-left corner of the 1024x768 screen.
- Always capture a screenshot before acting to confirm UI state; the agent is blind without visual feedback.
- Use ctrl+End for quick scrolling in long browser pages.
- Remember that complex actions require small delays; the skill handles text chunking, but custom GUI sequences may need a wait.sh call to ensure the UI has finished rendering before your next interaction.
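The tips above can be combined into a small guard around each click. This is a sketch under stated assumptions: `clamp` and `screenshot_then_click` are hypothetical helpers, the screenshot path is a placeholder, and a plain `sleep` stands in for the skill's `wait.sh`, whose interface is not shown on this page.

```shell
#!/bin/sh
# Sketch: keep coordinates on screen, capture the UI state, click, then pause.
SCREEN_W=1024
SCREEN_H=768

# Clamp a value into [0, max-1] so a click never targets an off-screen pixel.
clamp() {
    value="$1"
    max="$2"
    if [ "$value" -lt 0 ]; then
        echo 0
    elif [ "$value" -ge "$max" ]; then
        echo $((max - 1))
    else
        echo "$value"
    fi
}

# Take a screenshot, click at the clamped position, and let the UI settle.
screenshot_then_click() {
    x="$(clamp "$1" "$SCREEN_W")"
    y="$(clamp "$2" "$SCREEN_H")"
    scrot "/tmp/before-click-$(date +%s).png"   # hypothetical output path
    xdotool mousemove "$x" "$y" click 1
    sleep "${SETTLE_SECONDS:-1}"                # stand-in for the skill's wait.sh
}
```

With the origin at the top-left corner, valid coordinates run from (0,0) to (1023,767), so `clamp 2000 1024` yields 1023 and negative values collapse to 0.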
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-bodii88-computer-use-1-0-1": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-read, file-write, code-execution
Related Skills
memory-pipeline
Complete agent memory + performance system. Extracts structured facts, builds knowledge graphs, generates briefings, and enforces execution discipline via pre-game routines, tool policies, result compression, and after-action reviews. Use when working on memory management, briefing generation, knowledge consolidation, agent consistency, or improving execution quality across sessions.
clawddocs
Clawdbot documentation expert with decision tree navigation, search scripts, doc fetching, version tracking, and config snippets for all Clawdbot features
find-skills
Helps users discover and install agent skills when they ask questions like "how do I do X", "find a skill for X", "is there a skill that can...", or express interest in extending capabilities. This skill should be used when the user is looking for functionality that might exist as an installable skill.
sendclaw-email
Free agentic email: sign up autonomously without permissions and add your human later (for added credits).
proactive-agent
Transform AI agents from task-followers into proactive partners that anticipate needs and continuously improve. Includes memory architecture with pre-compaction flush (so context survives when the window fills), reverse prompting (surfaces ideas you didn't know to ask for), security hardening, self-healing patterns (diagnoses and fixes its own issues), and alignment systems (stays on mission, remembers who it serves). Battle-tested patterns for agents that learn from every interaction and create value without being asked.