siliconflow-vision
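Image recognition and analysis tool. It uses a large vision model to recognize image content and outputs detailed, objective recognition results for the main model to analyze. When a user sends an image, the main model must call this skill directly, then analyze and answer based on the recognition results. Supports multiple providers, including SiliconFlow (default), OpenAI, and Anthropic.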
Why use this skill?
Enhance OpenClaw with siliconflow-vision to extract text, analyze charts, and interpret images using top-tier vision models like Qwen and GPT-4o.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/lycohana/siliconflow-vision
What This Skill Does
siliconflow-vision is an image analysis agent skill for OpenClaw. It acts as the "eyes" of your AI assistant, converting raw visual data into structured, objective, and detailed descriptive output. Rather than simply passing an image along, it extracts text, identifies screen elements, describes layout structure, and characterizes artistic or informational style. This gives the main AI model factual visual information before any synthesis or interpretation occurs. It supports SiliconFlow (default), OpenAI, and Anthropic, so you can choose the underlying vision model.
Installation
To integrate this skill into your OpenClaw ecosystem, execute the following command in your terminal:
clawhub install openclaw/skills/skills/lycohana/siliconflow-vision
After installation, ensure your API keys are configured correctly by updating the config/default.json file or setting the corresponding environment variables (SILICONFLOW_API_KEY, OPENAI_API_KEY, or ANTHROPIC_API_KEY) to enable the desired service provider.
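As a quick pre-flight check, the following minimal Python sketch reports which of the three environment variables named above are set. It is not part of the skill itself: the provider labels, the function names, and the rule of preferring SiliconFlow first are assumptions for illustration, based only on SiliconFlow being described as the default. It does not inspect config/default.json.

```python
import os

# Environment variable names come from the skill description above.
# The provider labels and the priority order are assumptions for illustration.
PROVIDER_ENV_VARS = {
    "siliconflow": "SILICONFLOW_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def configured_providers(env=None):
    """Return provider labels whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [
        name
        for name, var in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]


def pick_provider(env=None):
    """Pick the first configured provider, checking SiliconFlow first."""
    available = configured_providers(env)
    if not available:
        raise RuntimeError(
            "No API key found; set one of: "
            + ", ".join(PROVIDER_ENV_VARS.values())
        )
    return available[0]


if __name__ == "__main__":
    # Prints the labels of providers that have a key in the current shell.
    print(configured_providers())
```

Running the script in the shell where OpenClaw is launched shows whether the keys are visible to that environment; an empty list suggests the keys are either unset or configured only in config/default.json.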
Use Cases
- Developer Support: Analyze screenshots of error logs, stack traces, or UI layout bugs in desktop/mobile apps.
- Content Creation: Interpret complex memes or visual humor by letting the model extract the literal elements, while the main agent handles the cultural context.
- Office Productivity: Digitize data from charts, extract text from scanned documents, or summarize information from meeting slides.
- Data Analysis: Extract numerical data points from infographics or business reports for further processing.
Example Prompts
- "[Image] Could you explain what is wrong with this Python code screenshot?"
- "[Image] Extract the main data points from this chart and summarize the trend for my weekly report."
- "[Image] Analyze this meme. What are the visual components and why does it have this specific satirical tone?"
Tips & Limitations
- Tip: Always specify the -m smart flag for high-precision tasks like reading technical documentation or complex infographics to leverage the Qwen-VL-72B model.
- Tip: For faster responses when dealing with simple screenshots, use the -s (short) mode to reduce output tokens.
- Limitation: The skill is designed for objective description; avoid asking it to perform high-level subjective analysis directly, as it performs best when the primary model handles the final synthesis and external research.
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-lycohana-siliconflow-vision": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-read, external-api
Related Skills
feishu-send-message
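Send messages to Feishu users via the API. Use it when you need to message a Feishu user by phone number or any user ID (open_id, user_id, union_id). It automatically tries all ID types to find the valid one. **New**: message-length guidelines and best practices for sending long content in multiple parts.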
moltbook-poster
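Toolkit for the Moltbook agent social network. Used for posting, commenting, liking, fetching feeds, managing direct messages, and more. Posting is rate-limited to one post every 30 minutes, and configs/moltbook.json must be configured.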
ssh
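Connect to and manage remote servers over SSH. Uses the paramiko (Python) library and supports running remote commands, installing software, viewing logs, and similar operations over SSH from Windows/Linux.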