ms-qwen-vl
调用魔搭社区(ModelScope)Qwen3-VL 多模态 API 进行视觉解析。使用 OpenAI SDK 兼容方式调用,支持图片内容描述、OCR 文字提取、视觉问答、对象检测等功能。用户提到"魔搭"、"ModelScope"、"Qwen-VL"、"多模态视觉"、"解析图片"等关键词时应触发。
Why use this skill?
Integrate Qwen3-VL into OpenClaw for advanced image description, OCR, and visual Q&A. Leverage high-performance multimodal AI to analyze images and data.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/crocketc/ms-qwen-vlWhat This Skill Does
The ms-qwen-vl skill integrates the powerful Qwen3-VL multimodal model from ModelScope directly into your OpenClaw workflow. It empowers your agent with advanced visual perception, allowing it to interpret images, extract text via OCR, detect specific objects, and perform complex visual reasoning. By leveraging an OpenAI-compatible SDK, this skill ensures consistent and high-performance communication with ModelScope's inference endpoints, providing both a standard speed-optimized mode and a high-precision 235B model for granular tasks.
Installation
To enable this skill, run the following command in your terminal:
clawhub install openclaw/skills/skills/crocketc/ms-qwen-vl
Once installed, you must configure your environment variables. Copy the .env.example file to .env and provide your API key obtained from the ModelScope console. Set the MODELSCOPE_API_KEY variable to ensure the agent has authorization to access the visual inference services.
Use Cases
This skill is ideal for tasks requiring visual understanding, such as:
- Automated Data Entry: Using OCR to transcribe handwritten notes or scanned invoices into digital formats.
- Content Moderation & Analysis: Describing complex screenshots or analyzing visual data for reports.
- Visual Q&A: Asking questions about specific elements within a dashboard or a complex diagram.
- Asset Management: Detecting objects in images to categorize or tag assets effectively.
Example Prompts
- "Can you perform an OCR scan on this invoice image located at D:\Documents\invoice.jpg and extract the total amount?"
- "Describe the contents of this screenshot: C:\Users\Desktop\ui_design.png, and point out any alignment issues."
- "Look at this chart image https://example.com/data.png and explain the trend shown in the visual representation."
Tips & Limitations
- Input Handling: Always ensure the local file paths provided to the agent are correct; the underlying script automatically manages base64 conversion for optimal API transmission.
- Performance: Use the standard mode for general tasks to minimize latency. If you require deep logical reasoning or high-accuracy analysis, append the
--preciseflag to trigger the 235B model. - Security: Be mindful when uploading images containing sensitive information to external APIs; ensure your ModelScope privacy settings align with your data security requirements.
Metadata
Not sure this is the right skill?
Describe what you want to build — we'll match you to the best skill from 16,000+ options.
Find the right skillPaste this into your clawhub.json to enable this plugin.
{
"plugins": {
"official-crocketc-ms-qwen-vl": {
"enabled": true,
"auto_update": true
}
}
}Tags(AI)
Flags: file-read, external-api, code-execution