Official Verified developer tools Safety 2/5

computer-use

Full desktop computer use for headless Linux servers and VPS. Creates a virtual display (Xvfb + XFCE) to control GUI applications without a physical monitor. Screenshots, mouse clicks, keyboard input, scrolling, dragging — all 17 standard actions. Model-agnostic, works with any LLM.

Why use this skill?

The computer-use skill gives you full GUI control on headless Linux servers. It simulates mouse clicks and keyboard input in any desktop application, including a web browser.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/bodii88/computer-use-1-0-1

What This Skill Does

The computer-use skill provides OpenClaw agents with full desktop GUI control on headless Linux environments. Using a virtual frame buffer (Xvfb) and the XFCE desktop environment, it creates a virtual display (DISPLAY=:99) at 1024x768 resolution, so your AI agent can operate desktop applications through the same inputs a human user would. It supports 17 actions, including screenshots, clicking, dragging, scrolling, text entry, and key combinations. This bridges terminal-based automation and desktop applications that expose only a graphical interface.
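As a rough illustration of the virtual display described above, the sketch below builds the command line for an Xvfb server on display :99 at 1024x768 and an environment that points GUI programs at it. The helper names are hypothetical and the 24-bit colour depth is an assumption (the listing states only the resolution); the skill's own startup scripts may differ.

```python
import os


def xvfb_command(display=":99", width=1024, height=768, depth=24):
    """Build the argv for an Xvfb server on the given display.

    The depth of 24 is an assumption; the listing only gives 1024x768.
    """
    return ["Xvfb", display, "-screen", "0", f"{width}x{height}x{depth}"]


def display_env(display=":99", base=None):
    """Return a copy of the environment with DISPLAY set to the virtual screen."""
    env = dict(os.environ if base is None else base)
    env["DISPLAY"] = display
    return env
```

Passing the first result to `subprocess.Popen` and the second as `env=` when launching a GUI program would, under these assumptions, run that program on the virtual screen rather than a physical one.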

Installation

To integrate this skill, first install the required X11 dependencies:

sudo apt install -y xvfb xfce4 xfce4-terminal xdotool scrot imagemagick dbus-x11 chromium-browser

Once the environment is ready, install the skill package through the OpenClaw hub:

clawhub install openclaw/skills/skills/bodii88/computer-use-1-0-1

Finally, set the display variable so the agent can reach the virtual desktop:

export DISPLAY=:99
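Before installing the skill, it can help to confirm that the dependencies are actually on the PATH. The sketch below is one way to do that; the mapping from apt package to binary name is an assumption based on common Debian/Ubuntu packaging, and `missing_packages` is a hypothetical helper, not part of the skill.

```python
import shutil

# Assumed mapping: apt package name -> one binary that package usually provides.
REQUIRED_BINARIES = {
    "xvfb": "Xvfb",
    "xfce4": "startxfce4",
    "xfce4-terminal": "xfce4-terminal",
    "xdotool": "xdotool",
    "scrot": "scrot",
    "imagemagick": "convert",
    "dbus-x11": "dbus-launch",
    "chromium-browser": "chromium-browser",
}


def missing_packages(which=shutil.which, required=REQUIRED_BINARIES):
    """Return the sorted package names whose binary is not found on PATH.

    `which` is injectable so the check can be exercised without a real system.
    """
    return sorted(pkg for pkg, binary in required.items() if which(binary) is None)
```

An empty list suggests the environment is ready; any names returned are candidates for the `sudo apt install -y` command above.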

Use Cases

This skill is suited to automating tasks that lack accessible APIs. Use it to:

  1. Browse websites that require complex interaction, authentication, or JavaScript execution that standard scrapers cannot handle.
  2. Manage GUI-only administrative software or legacy desktop applications on remote VPS instances.
  3. Perform end-to-end testing of web or desktop software by simulating real user clicks and keystrokes.
  4. Extract data from visual dashboards or UI elements that offer no export option.

Example Prompts

  1. "Open the Chromium browser, navigate to the dashboard at monitoring.internal, and take a screenshot of the current system health graph."
  2. "Locate the login field on the desktop app, type my username and password, then press Enter and wait 5 seconds for the home screen to load."
  3. "Scroll through the current PDF document in the viewer, select all text using a mouse drag operation, and copy it to the clipboard."

Tips & Limitations

  • The origin point (0,0) is always the top-left corner of the 1024x768 screen.
  • Always capture a screenshot before acting to confirm UI state; the agent is blind without visual feedback.
  • Use ctrl+End to jump straight to the bottom of a long browser page instead of scrolling step by step.
  • Multi-step actions need short delays between steps. The skill handles text chunking itself, but custom GUI sequences may need a wait.sh call so the UI finishes rendering before the next interaction.
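The tips above can be sketched as a few small helpers: keep coordinates inside the 1024x768 screen, take a screenshot before acting, and poll until the UI is ready. All function names here are hypothetical, and `wait_until` is a generic stand-in rather than the skill's wait.sh, whose interface the listing does not describe. The `scrot` and `xdotool mousemove ... click` invocations are standard uses of those tools.

```python
import time

SCREEN_W, SCREEN_H = 1024, 768  # resolution stated in the listing


def clamp_point(x, y, width=SCREEN_W, height=SCREEN_H):
    """Clamp a point to the screen; (0, 0) is the top-left corner."""
    return (min(max(int(x), 0), width - 1), min(max(int(y), 0), height - 1))


def screenshot_command(path):
    """argv that captures the current display to `path` with scrot."""
    return ["scrot", path]


def click_command(x, y, button=1):
    """argv that moves the pointer to a clamped point and clicks."""
    cx, cy = clamp_point(x, y)
    return ["xdotool", "mousemove", str(cx), str(cy), "click", str(button)]


def wait_until(predicate, timeout=5.0, interval=0.25,
               sleep=time.sleep, clock=time.monotonic):
    """Poll `predicate` until it returns True or `timeout` seconds elapse."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A typical loop under these assumptions would run `screenshot_command`, let the model choose a target from the image, run `click_command`, then call `wait_until` with a check that the expected window or element has appeared.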

Metadata

Author: @bodii88
Stars: 1100
Views: 1
Updated: 2026-02-17

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-bodii88-computer-use-1-0-1": {
      "enabled": true,
      "auto_update": true
    }
  }
}
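If clawhub.json already lists other plugins, pasting the snippet over the file would discard them. The sketch below merges the entry into an existing configuration instead. It assumes only the `plugins` structure shown above; `enable_plugin` is a hypothetical helper, and the rest of the clawhub.json schema is not documented in this listing.

```python
import copy

PLUGIN_ID = "official-bodii88-computer-use-1-0-1"


def enable_plugin(config, plugin_id=PLUGIN_ID, auto_update=True):
    """Return a copy of `config` with the plugin enabled.

    Existing plugin entries and unrelated top-level keys are left untouched.
    """
    merged = copy.deepcopy(config)
    plugins = merged.setdefault("plugins", {})
    entry = plugins.setdefault(plugin_id, {})
    entry["enabled"] = True
    entry["auto_update"] = auto_update
    return merged
```

Loading the file with `json.load`, passing the result through `enable_plugin`, and writing it back with `json.dump` keeps any settings already present.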

Tags (AI)

#automation #gui #headless #linux #robotics
Safety Score: 2/5

Flags: file-read, file-write, code-execution
