ClawKit Logo
ClawKitReliability Toolkit
Back to Registry
Official Verified ai models Safety 4/5

image-understanding

使用智谱AI的GLM-4V-Flash免费多模态API理解图片内容。当用户需要理解图片内容、描述图片、识别图中物体时使用此skill。

skill-install — Terminal

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/andyzwp/image-read
Or

What This Skill Does

The image-understanding skill acts as a powerful visual perception module for OpenClaw. Powered by the Zhipu AI GLM-4V-Flash model, this tool allows the AI agent to "see" and analyze image inputs provided by the user. Whether you are working with photos of documents, biological slides, complex diagrams, or everyday objects, this skill extracts context and generates meaningful text-based descriptions, effectively bridging the gap between visual inputs and structured intelligence.

Installation

To integrate this skill into your environment, use the OpenClaw command-line interface. Run the following command in your terminal:

clawhub install openclaw/skills/skills/andyzwp/image-read

Ensure you have a valid Zhipu AI API key. Register at https://bigmodel.cn/ to obtain your key, then set it as an environment variable named ZHIPU_API_KEY to allow the skill to authenticate and make API calls securely.

Use Cases

  • Scientific Analysis: Quickly identify components in biological images or lab result screenshots.
  • Data Extraction: Convert information from photos of charts, receipts, or handwritten notes into machine-readable text.
  • Content Description: Generate detailed captions for images for accessibility or cataloging purposes.
  • Technical Debugging: Analyze screenshots of code or system UI errors to understand what went wrong.

Example Prompts

  1. "Look at this uploaded photo of the plant leaves; can you tell me what kind of pest might be causing these yellow spots?"
  2. "I am attaching a screenshot of a spreadsheet. Please summarize the key financial trends shown in the chart."
  3. "Describe this image in detail and identify all the pieces of furniture visible in the room."

Tips & Limitations

  • Optimization: For the best performance and accuracy, ensure your images are resized to 1024x1024 pixels or less.
  • File Formats: JPG format is highly recommended for stability. PNG files might occasionally trigger compatibility issues during processing.
  • Cost: The underlying GLM-4V-Flash model is currently free, but please remain aware of rate limits imposed by the provider.
  • Privacy: As this skill interacts with an external API, ensure sensitive personal or corporate data is filtered before uploading images to the model for analysis.

Metadata

Author@andyzwp
Stars4473
Views1
Updated2026-05-01
View Author Profile
AI Skill Finder

Not sure this is the right skill?

Describe what you want to build — we'll match you to the best skill from 16,000+ options.

Find the right skill
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-andyzwp-image-read": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags(AI)

#computer-vision#multimodal#image-analysis#zhipu-ai#ocr
Safety Score: 4/5

Flags: file-read, external-api