ClawKit Logo
ClawKitReliability Toolkit
Back to Registry
Official Verified media Safety 4/5

cliproxy-media

Analyze images (jpg, png, gif, webp) and PDFs via CLIProxyAPI — a Claude Max proxy that routes requests through your subscription at zero extra cost. Use this skill whenever you need to analyze, describe, or extract information from an image or photo ("analyze image", "describe photo", "what is in this picture"), read or summarize a PDF document ("read PDF", "summary of this document"), or process any media file via a CLIProxy-compatible endpoint ("process media via proxy", "cliproxy vision", "cliproxy media"). NEVER use the built-in `image` or `pdf` tools when using CLIProxyAPI — they fall back to direct Anthropic API which requires separate credits. Use this skill instead for all vision and document analysis tasks.

skill-install — Terminal

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/bencoremans/cliproxy-media
Or

What This Skill Does

The cliproxy-media skill is a powerful vision and document analysis tool designed for OpenClaw agents. It serves as a dedicated interface for the CLIProxyAPI, allowing you to process images (JPG, PNG, GIF, WEBP) and PDF documents by routing requests through your Claude Max subscription. By using this skill instead of built-in standard tools, you ensure that your requests are handled through your personal proxy configuration, effectively bypassing the need for separate Anthropic API credits and ensuring cost-effective media analysis. It is specifically built for tasks requiring visual understanding, content summarization, and extraction of data from complex file formats.

Installation

To integrate this skill into your environment, use the OpenClaw command-line interface. Run the following command in your terminal:

clawhub install openclaw/skills/skills/bencoremans/cliproxy-media

After installation, you must configure your environment to point to your active CLIProxy instance. Set the environment variable CLIPROXY_URL to your endpoint, for example: export CLIPROXY_URL=http://localhost:8317/v1/messages. Ensure your container or host is reachable by the agent.

Use Cases

This skill is perfect for scenarios requiring deep insight into visual or document-based information. Use it to extract text from receipts, summarize multi-page financial reports provided as PDFs, compare UI design mockups, interpret medical scans, or automate data entry from screenshots. It is also highly effective for collaborative workflows where you need to interpret visual cues or documentation during a technical session.

Example Prompts

  1. "Analyze this receipt image and extract the total cost, date, and merchant name into a JSON format."
  2. "Read the attached PDF summary and provide a bulleted list of the top five actionable insights mentioned in the report."
  3. "Look at these two images of the dashboard and describe the layout differences between the mobile and desktop versions."

Tips & Limitations

Always remember to use the array notation for system prompts, as simple string notation is silently ignored by the CLIProxyAPI. The skill is highly efficient but does not support office files like DOCX or XLSX; please convert these to PDFs before processing. Additionally, while the model is excellent at visual reasoning, it cannot process audio or video files—please use dedicated tools like Whisper for transcription or alternative services for media stream analysis. Always verify your CLIPROXY_MODEL setting to ensure you are utilizing the specific Claude version that best fits your performance and cost requirements.

Metadata

Stars4473
Views0
Updated2026-05-01
View Author Profile
AI Skill Finder

Not sure this is the right skill?

Describe what you want to build — we'll match you to the best skill from 16,000+ options.

Find the right skill
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-bencoremans-cliproxy-media": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags(AI)

#vision#ocr#pdf#claude#media
Safety Score: 4/5

Flags: file-read, external-api