cliproxy-media
Analyze images (jpg, png, gif, webp) and PDFs via CLIProxyAPI — a Claude Max proxy that routes requests through your subscription at zero extra cost. Use this skill whenever you need to analyze, describe, or extract information from an image or photo ("analyze image", "describe photo", "what is in this picture"), read or summarize a PDF document ("read PDF", "summary of this document"), or process any media file via a CLIProxy-compatible endpoint ("process media via proxy", "cliproxy vision", "cliproxy media"). NEVER use the built-in `image` or `pdf` tools when using CLIProxyAPI — they fall back to direct Anthropic API which requires separate credits. Use this skill instead for all vision and document analysis tasks.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/bencoremans/cliproxy-mediaWhat This Skill Does
The cliproxy-media skill is a powerful vision and document analysis tool designed for OpenClaw agents. It serves as a dedicated interface for the CLIProxyAPI, allowing you to process images (JPG, PNG, GIF, WEBP) and PDF documents by routing requests through your Claude Max subscription. By using this skill instead of built-in standard tools, you ensure that your requests are handled through your personal proxy configuration, effectively bypassing the need for separate Anthropic API credits and ensuring cost-effective media analysis. It is specifically built for tasks requiring visual understanding, content summarization, and extraction of data from complex file formats.
Installation
To integrate this skill into your environment, use the OpenClaw command-line interface. Run the following command in your terminal:
clawhub install openclaw/skills/skills/bencoremans/cliproxy-media
After installation, you must configure your environment to point to your active CLIProxy instance. Set the environment variable CLIPROXY_URL to your endpoint, for example: export CLIPROXY_URL=http://localhost:8317/v1/messages. Ensure your container or host is reachable by the agent.
Use Cases
This skill is perfect for scenarios requiring deep insight into visual or document-based information. Use it to extract text from receipts, summarize multi-page financial reports provided as PDFs, compare UI design mockups, interpret medical scans, or automate data entry from screenshots. It is also highly effective for collaborative workflows where you need to interpret visual cues or documentation during a technical session.
Example Prompts
- "Analyze this receipt image and extract the total cost, date, and merchant name into a JSON format."
- "Read the attached PDF summary and provide a bulleted list of the top five actionable insights mentioned in the report."
- "Look at these two images of the dashboard and describe the layout differences between the mobile and desktop versions."
Tips & Limitations
Always remember to use the array notation for system prompts, as simple string notation is silently ignored by the CLIProxyAPI. The skill is highly efficient but does not support office files like DOCX or XLSX; please convert these to PDFs before processing. Additionally, while the model is excellent at visual reasoning, it cannot process audio or video files—please use dedicated tools like Whisper for transcription or alternative services for media stream analysis. Always verify your CLIPROXY_MODEL setting to ensure you are utilizing the specific Claude version that best fits your performance and cost requirements.
Metadata
Not sure this is the right skill?
Describe what you want to build — we'll match you to the best skill from 16,000+ options.
Find the right skillPaste this into your clawhub.json to enable this plugin.
{
"plugins": {
"official-bencoremans-cliproxy-media": {
"enabled": true,
"auto_update": true
}
}
}Tags(AI)
Flags: file-read, external-api