Official | Verified | AI models | Safety 3/5

desktop-control

Advanced desktop automation with mouse, keyboard, and screen control, plus access to 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, email, and SMS.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/basillytton/desktop-controls

What This Skill Does

The desktop-control skill is a gateway to the SkillBoss API ecosystem and acts as a unified controller for more than 50 specialized AI models. Although the name implies direct desktop manipulation, most of its value comes from abstracting interactions across image generation, video production, text-to-speech, speech-to-text, music synthesis, and web search. With this skill, OpenClaw agents can route each task to the most appropriate provider (Bedrock, OpenAI, Vertex AI, ElevenLabs, or Replicate) without managing each provider's API individually. It covers the AI content-creation lifecycle, from initial prompting and smart model selection to retrieving generated assets such as images and videos directly into your environment.

Installation

To integrate this capability into your OpenClaw environment, execute the following command in your terminal:

clawhub install openclaw/skills/skills/basillytton/desktop-controls

Ensure that you have your SKILLBOSS_API_KEY configured in your local environment variables, as the skill authenticates all requests via the Authorization: Bearer header against https://api.heybossai.com/v1.
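
As a rough illustration of the authentication step described above, the following Python sketch builds a request that carries the key as a Bearer token. The base URL and the SKILLBOSS_API_KEY variable name come from this listing; the helper name build_request is hypothetical, and the skill's own internals may differ.

```python
import os
import urllib.request

BASE_URL = "https://api.heybossai.com/v1"  # base URL stated in this listing


def build_request(path, api_key=None):
    """Return a urllib Request that carries the key as a Bearer token.

    build_request is a hypothetical helper for illustration only.
    """
    key = api_key or os.environ.get("SKILLBOSS_API_KEY")
    if not key:
        raise RuntimeError("SKILLBOSS_API_KEY is not set")
    url = BASE_URL + "/" + path.lstrip("/")
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {key}"})
```

Calling build_request("/models") with the environment variable set yields a request object that can be passed to urllib.request.urlopen. Raising early when the key is missing makes a misconfigured environment easier to spot than an authentication failure returned by the server.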

Use Cases

This skill suits workflows that need multi-modal AI output. Use it to automate the creation of marketing materials by generating images and short videos from text prompts. It also handles document parsing and data extraction from web sources, as well as communication tasks such as drafting emails or generating synthetic speech for accessibility tools. Because it supports smart routing, developers can use it to balance cost against quality in high-volume production environments.

Example Prompts

  1. "Generate a professional 16:9 image of a futuristic office workspace using the best available image model."
  2. "Search for the latest trends in renewable energy and summarize the findings into a concise report."
  3. "Convert this text file into a high-quality audio clip using the ElevenLabs voice synthesis engine."

Tips & Limitations

Check that a specific model ID is available via the /models endpoint before sending a request, because providers rotate their model availability. Response times for image and video generation vary with model complexity. Where an API response provides a URL for a generated asset, cache the URL rather than the raw binary data. The skill relies on the uptime of external services, so always implement error handling for API timeouts.
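
The two tips above (checking /models first and handling timeouts) might be combined roughly as in the Python sketch below. The /models path and base URL come from this listing. The response shape, a "data" list of objects with an "id" field, is an assumption, as are the helper names, so adjust them to whatever the API actually returns.

```python
import json
import socket
import time
import urllib.error
import urllib.request

BASE_URL = "https://api.heybossai.com/v1"  # base URL stated in this listing


def is_timeout(exc):
    """True for a bare timeout or a URLError that wraps one."""
    timeouts = (TimeoutError, socket.timeout)
    if isinstance(exc, timeouts):
        return True
    return isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, timeouts)


def with_retries(call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run call(); on a timeout, wait base_delay * 2**attempt and retry."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception as exc:
            # Non-timeout errors and the final failed attempt are re-raised.
            if not is_timeout(exc) or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def fetch_models(api_key, timeout=10):
    """GET /models and return the decoded JSON body."""
    req = urllib.request.Request(
        BASE_URL + "/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def model_is_listed(payload, model_id):
    """ASSUMPTION: payload looks like {"data": [{"id": "..."}, ...]}."""
    return any(entry.get("id") == model_id for entry in payload.get("data", []))
```

A caller would wrap the network call, for example with_retries(lambda: fetch_models(key)), and then pass the result to model_is_listed before submitting a generation request. Only timeouts are retried; other failures, such as an HTTP error status, surface immediately.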

Metadata

Stars: 4473
Views: 0
Updated: 2026-05-01
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-basillytton-desktop-controls": {
      "enabled": true,
      "auto_update": true
    }
  }
}
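
To sanity-check the configuration after pasting it, a small Python sketch along these lines could confirm that the plugin entry is present and enabled. The plugin ID and the "plugins" / "enabled" keys come from the snippet above; the helper name plugin_enabled is hypothetical, and how clawhub itself reads the file is not described here.

```python
import json

PLUGIN_ID = "official-basillytton-desktop-controls"  # ID from the snippet above


def plugin_enabled(config_text, plugin_id=PLUGIN_ID):
    """Return True only if the plugin entry exists and has "enabled": true."""
    config = json.loads(config_text)
    entry = config.get("plugins", {}).get(plugin_id, {})
    return entry.get("enabled") is True
```

For example, plugin_enabled(open("clawhub.json").read()) returns False when the entry is missing or disabled, and json.loads raises an error if the pasted JSON is malformed.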

Tags (AI)

#automation #multimodal #api #generative-ai #orchestration
Safety Score: 3/5

Flags: external-api, network-access