computer-use
Full desktop computer use for headless Linux servers. Xvfb + XFCE virtual desktop with xdotool automation. 17 actions (click, type, scroll, screenshot, drag, etc.). Unlike OpenClaw's browser tool, operates at the X11 level so websites cannot detect automation. Includes VNC for live viewing.
Why use this skill?
Automate any desktop application on headless Linux servers with OpenClaw. Gain full X11 control via Xvfb and XFCE for reliable, invisible GUI interaction.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/ram-raghav-s/computer-use
What This Skill Does
The computer-use skill provides OpenClaw with full GUI control over headless Linux environments. By leveraging Xvfb, XFCE, and xdotool, the agent gains the ability to interact with any desktop application as if it were a physical user sitting at a workstation. Unlike browser-based tools that are often detected or limited by DOM structure, this skill operates at the X11 display level, allowing for seamless automation of legacy software, desktop environments, and complex web interfaces that are resistant to standard web-scraping scripts.
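To make the X11-level control concrete, here is a minimal Python sketch of how actions such as click, type, and key presses could map onto xdotool command lines. The helper names and the display number :99 are assumptions for illustration; the skill's own scripts are not shown here and may work differently.

```python
import os
import shlex
import subprocess

DISPLAY = ":99"  # assumed display number; use the one your Xvfb instance runs on


def click_cmd(x, y, button=1):
    """Build the xdotool argv that moves the pointer to (x, y) and clicks."""
    return ["xdotool", "mousemove", str(x), str(y), "click", str(button)]


def type_cmd(text, delay_ms=50):
    """Build the xdotool argv that types text with a per-keystroke delay."""
    # "--" ends option parsing so text starting with "-" is typed literally.
    return ["xdotool", "type", "--delay", str(delay_ms), "--", text]


def key_cmd(combo):
    """Build the xdotool argv for a key combination such as ctrl+End."""
    return ["xdotool", "key", combo]


def run(cmd, display=DISPLAY):
    """Run a command against the virtual display (requires xdotool and a running X server)."""
    env = dict(os.environ, DISPLAY=display)
    return subprocess.run(cmd, env=env, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so this works without X11.
    for cmd in (click_cmd(40, 40), type_cmd("OpenClaw documentation"), key_cmd("ctrl+End")):
        print(shlex.join(cmd))
```

Building the argument list separately from running it keeps the mapping easy to inspect and test on a machine with no display; calling run(...) on any of these lists sends the event to the virtual desktop.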
Installation
To begin, ensure your server is prepared for a headless display. Run the setup script provided in the skill package: ./scripts/setup-vnc.sh. This handles the installation of Xvfb, the minimal XFCE desktop environment, and the configuration of x11vnc and noVNC for remote viewing. Services are configured to auto-start on boot, ensuring your automated workflows remain persistent across reboots.
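For orientation, the display stack that the setup script installs can be pictured as two long-running processes: a virtual X server and a VNC server attached to it. The sketch below only builds the command lines one might use to start them by hand. The display number :99, the colour depth, and port 5900 are assumptions, and the contents of setup-vnc.sh are not shown in this listing, so the script may configure things differently.

```python
def xvfb_cmd(display=":99", width=1024, height=768, depth=24):
    """Build the argv for a virtual X server with a single screen."""
    # 1024x768 matches the fixed resolution noted under Tips & Limitations.
    return ["Xvfb", display, "-screen", "0", f"{width}x{height}x{depth}"]


def x11vnc_cmd(display=":99", port=5900):
    """Build the argv for a VNC server that mirrors the virtual display."""
    return [
        "x11vnc",
        "-display", display,
        "-rfbport", str(port),
        "-forever",    # keep serving after the first viewer disconnects
        "-shared",     # allow more than one viewer at a time
        "-localhost",  # accept connections from the local machine only
    ]


if __name__ == "__main__":
    print(" ".join(xvfb_cmd()))
    print(" ".join(x11vnc_cmd()))
```

Binding x11vnc to localhost is a deliberately conservative choice in this sketch: remote viewing would then go through an SSH tunnel or a web proxy such as noVNC rather than an open port.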
Use Cases
This skill is ideal for automating tasks that lack accessible APIs or are incompatible with standard headless browser agents. Common use cases include high-fidelity GUI testing, interacting with non-web desktop software, performing multi-step workflows that require visual verification, and circumventing anti-bot protections on websites that identify browser-agent traffic. Because it supports live VNC viewing, human operators can monitor and intervene in the agent's actions in real time.
Example Prompts
- "Look at the desktop, find the Firefox icon at the top left, double-click it, and then search for 'OpenClaw documentation'."
- "Open the text editor, type the current system logs into it, and save the file as log_dump.txt on the desktop."
- "Scroll down the current webpage by 20 units and take a screenshot to confirm the new footer information is visible."
Tips & Limitations
- The virtual display is fixed at 1024x768. Ensure your application layouts are compatible with this resolution.
- Always perform a screenshot action before clicking to ensure the UI has finished rendering.
- Click on input fields to ensure focus before typing text.
- During long-running tasks, use the wait action to allow time for UI transition animations.
- Use ctrl+End and ctrl+Home to navigate long documents or pages efficiently within the virtual environment.
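The tips above can be combined into one ordered routine: take a screenshot, click the field to focus it, wait for the UI to settle, then type. The Python sketch below models that ordering as a plain list of steps. The tuple format, the parameter names, and the one-second default pause are hypothetical and do not reflect the skill's actual action interface; only the action names and the 1024x768 resolution come from this page.

```python
SCREEN_W, SCREEN_H = 1024, 768  # fixed virtual display size


def clamp(x, y):
    """Keep a coordinate inside the visible area of the virtual display."""
    cx = min(max(x, 0), SCREEN_W - 1)
    cy = min(max(y, 0), SCREEN_H - 1)
    return cx, cy


def plan_text_entry(x, y, text, settle_seconds=1.0):
    """Return the ordered steps for typing text into the field at (x, y)."""
    cx, cy = clamp(x, y)
    return [
        ("screenshot", {}),                    # confirm the UI has rendered
        ("click", {"x": cx, "y": cy}),         # focus the input field
        ("wait", {"seconds": settle_seconds}), # let animations finish
        ("type", {"text": text}),              # only now send keystrokes
    ]


if __name__ == "__main__":
    for name, params in plan_text_entry(512, 300, "log_dump.txt"):
        print(name, params)
```

Clamping is a small safeguard for coordinates computed from a larger screenshot or a different layout, since a click outside 1024x768 would never reach the intended widget.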
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-ram-raghav-s-computer-use": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-write, file-read, code-execution