Official Verified ai models Safety 4/5

vlmrun-cli-skill

Use the VLM Run CLI (`vlmrun`) to interact with the Orion visual AI agent and process images, videos, and documents with natural language. Triggers: image understanding/generation, object detection, OCR, video summarization, document extraction, visual AI chat, 'generate an image/video', 'analyze this image/video', 'extract text from', 'summarize this video', 'process this PDF'.

Why use this skill?

Use the VLM Run CLI skill to process images, videos, and documents with Orion AI. Enable advanced object detection, OCR, and media generation in OpenClaw.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/spillai/vlmrun-cli-skill

What This Skill Does

The vlmrun-cli-skill gives OpenClaw a direct interface to the Orion visual AI agent through the `vlmrun` binary. With it, OpenClaw can analyze, interpret, and generate content from images, videos, and PDFs using natural language commands from your terminal. It covers visual reasoning tasks such as object detection, document data extraction, video summarization, and generative media creation.

Installation

To enable this skill in your OpenClaw environment, run `clawhub install openclaw/skills/skills/spillai/vlmrun-cli-skill`.

Prerequisites: the `vlmrun` CLI must be installed (via uv or pip), and VLMRUN_API_KEY must be set in your environment to authenticate with the Orion service. You can optionally set VLMRUN_BASE_URL and VLMRUN_CACHE_DIR to match your infrastructure.
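The prerequisites above can be checked before the skill is first used. The following is a minimal sketch, not part of the skill itself: the function name `check_prerequisites` is hypothetical, and it only verifies that a `vlmrun` executable is on PATH and that VLMRUN_API_KEY is non-empty. It does not confirm that the key is valid.

```python
import os
import shutil
from typing import Mapping, Optional


def check_prerequisites(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return a list of problems; an empty list means the basics are in place.

    `env` defaults to the current process environment. If the mapping has no
    PATH entry, shutil.which falls back to the process PATH.
    """
    env = os.environ if env is None else env
    problems = []

    # The skill shells out to the `vlmrun` binary, so it must be discoverable.
    if shutil.which("vlmrun", path=env.get("PATH")) is None:
        problems.append("vlmrun CLI not found on PATH (install it with uv or pip)")

    # VLMRUN_API_KEY is the only required variable named in this listing.
    if not env.get("VLMRUN_API_KEY", "").strip():
        problems.append("VLMRUN_API_KEY is not set")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("missing:", problem)
```

Returning a list rather than raising on the first failure lets you report every missing prerequisite in one pass.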

Use Cases

This skill is aimed at developers, data analysts, and content creators. Typical use cases include:

  • Automated Data Entry: Extracting structured information from invoices, receipts, or legal contracts.
  • Media Analysis: Generating descriptive summaries for long-form video content or identifying objects within surveillance/test footage.
  • Creative Workflows: Rapidly iterating on image and video generation prompts directly within a terminal-based workflow.
  • Quality Assurance: Comparing "before and after" image states to detect visual regressions in UI testing.

Example Prompts

  1. "Analyze the attached invoice.pdf and return the total amount and merchant name in JSON format."
  2. "Look at these two images: photo1.jpg and photo2.jpg. Describe the key visual differences between them."
  3. "Summarize the key discussion points from this meeting video and create a list of action items."
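When OpenClaw handles prompts like these, the skill ends up invoking the `vlmrun` binary. The sketch below shows roughly how such a prompt might map to a command line. It rests on assumptions not confirmed by this listing: a `chat` subcommand and an `-i` flag for attaching input files. Only `--no-stream` is named on this page. The helper `build_vlmrun_command` is hypothetical, so check `vlmrun --help` before relying on the exact shape.

```python
import shlex
from typing import Sequence


def build_vlmrun_command(
    prompt: str, inputs: Sequence[str] = (), stream: bool = True
) -> list[str]:
    """Build an argv list for a one-shot prompt (assumed CLI shape)."""
    # ASSUMPTION: prompts are sent through a `chat` subcommand.
    argv = ["vlmrun", "chat", prompt]
    for path in inputs:
        # ASSUMPTION: `-i` attaches one input file and may be repeated.
        argv += ["-i", path]
    if not stream:
        # `--no-stream` is the flag mentioned in the tips on this page.
        argv.append("--no-stream")
    return argv


cmd = build_vlmrun_command(
    "Describe the key visual differences between these two images.",
    ["photo1.jpg", "photo2.jpg"],
)
# Print a shell-safe rendering; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Building an argv list instead of a single string avoids shell-quoting problems when prompts contain quotes or spaces.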

Tips & Limitations

  • Performance: Large video files can take a long time to process. On unstable network connections, consider using the --no-stream flag.
  • Caching: The default cache directory is ~/.vlmrun/cache/artifacts/. Regularly prune this directory to prevent disk bloat during heavy usage.
  • Model Selection: Experiment with vlmrun-orion-1:fast for quick lookups and vlmrun-orion-1:pro for high-accuracy extraction tasks.
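The caching tip above can be automated with a small script. This is a sketch under stated assumptions: the 7-day threshold is arbitrary, file age is judged by modification time, and the function name `prune_cache` is hypothetical. It targets the default directory named above. If you have set VLMRUN_CACHE_DIR, pass the matching path explicitly.

```python
import time
from pathlib import Path

# Default artifact cache location, as stated in the tips above.
DEFAULT_CACHE_DIR = Path.home() / ".vlmrun" / "cache" / "artifacts"


def prune_cache(
    cache_dir: Path = DEFAULT_CACHE_DIR,
    max_age_days: float = 7.0,
    dry_run: bool = True,
) -> list[Path]:
    """Return files older than max_age_days; delete them unless dry_run."""
    if not cache_dir.is_dir():
        return []
    cutoff = time.time() - max_age_days * 86400
    stale = []
    # sorted() materializes the listing before any file is removed.
    for path in sorted(cache_dir.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            stale.append(path)
            if not dry_run:
                path.unlink()
    return stale


if __name__ == "__main__":
    # Dry run by default: list what would be removed without deleting.
    for path in prune_cache():
        print("stale:", path)
```

The dry-run default is deliberate: review the list first, then call `prune_cache(dry_run=False)` to delete.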

Metadata

Author: @spillai
Stars: 1015
Views: 0
Updated: 2026-02-15

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-spillai-vlmrun-cli-skill": {
      "enabled": true,
      "auto_update": true
    }
  }
}
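If your clawhub.json already lists other plugins, pasting the snippet wholesale would overwrite them. The sketch below merges the entry instead. It assumes clawhub.json is plain JSON with a top-level "plugins" object, as in the snippet above. The function name `enable_plugin` is hypothetical, and the file's location is not stated on this page, so the path must be supplied by you.

```python
import json
from pathlib import Path

# Plugin identifier copied from the configuration snippet above.
PLUGIN_ID = "official-spillai-vlmrun-cli-skill"


def enable_plugin(config_path: Path, plugin_id: str = PLUGIN_ID) -> dict:
    """Add or update one plugin entry, leaving other settings untouched."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    # Create the "plugins" object only if it is missing.
    plugins = config.setdefault("plugins", {})
    plugins[plugin_id] = {"enabled": True, "auto_update": True}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `enable_plugin(Path("clawhub.json"))` updates a config file in the current directory and creates it if it does not exist.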

Tags (AI)

#visual-ai #multimodal #cli #image-analysis #video-processing

Safety Score: 4/5

Flags: network-access, file-write, file-read, external-api