data-annotation
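General-purpose data annotation and processing tool. Use this skill when the user mentions needing data annotation, having an annotation task, data processing, dataset generation, or viewing/editing annotations. It supports multiple data types, including images, video, and text, calls models for content understanding and annotation, generates structured annotation data, and provides a Web interface for viewing and editing. Trigger phrases: "标注" (annotation), "annotation", "数据集" (dataset), "label", "tag data", "数据处理" (data processing).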
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/aowind/sjht-data-annotation
What This Skill Does
The Data Annotation Skill is an AI-driven tool for data preparation tasks. For image sets, video footage, or raw text documents, it provides a structured framework for content understanding, classification, and metadata generation. It uses Vision-Language (VL) models for automated labeling and offers a centralized Web UI for human-in-the-loop verification, editing, and final dataset export. Within the OpenClaw workflow, it turns manual labeling into a managed, plan-driven process.
Installation
To integrate this capability into your OpenClaw environment, execute the following command in your terminal:
clawhub install openclaw/skills/skills/aowind/sjht-data-annotation
Ensure that your environment has the necessary dependencies like python-docx and ffmpeg installed to support document parsing and video frame extraction respectively.
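As a quick way to check the two dependencies named above, the following is a minimal sketch, not part of the skill itself. The function names `check_dependencies` and `build_frame_extraction_cmd`, the output file pattern, and the default sampling rate are assumptions made for illustration; only the `ffmpeg` options `-i` and `-vf fps=` and the `docx` module name of python-docx are standard.

```python
import importlib.util
import shutil
from pathlib import Path


def check_dependencies() -> dict:
    """Report whether python-docx and ffmpeg appear to be available."""
    return {
        # python-docx is imported under the module name "docx".
        "python-docx": importlib.util.find_spec("docx") is not None,
        # ffmpeg is an external binary, so look for it on PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


def build_frame_extraction_cmd(video: str, out_dir: str, fps: float = 1.0) -> list:
    """Build (but do not run) an ffmpeg command that samples `fps` frames per second.

    The frame_%04d.jpg naming pattern is a hypothetical choice for this sketch.
    """
    pattern = str(Path(out_dir) / "frame_%04d.jpg")
    return ["ffmpeg", "-i", video, "-vf", f"fps={fps}", pattern]


if __name__ == "__main__":
    for name, ok in check_dependencies().items():
        print(f"{name}: {'found' if ok else 'missing'}")
    print(" ".join(build_frame_extraction_cmd("input.mp4", "frames")))
```

The command is returned as a list so it can be passed to `subprocess.run` once the output directory exists; the skill's own preprocessing may use different options.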
Use Cases
- Computer Vision Dataset Preparation: Automatically tag objects in images or videos for machine learning model training.
- Document Digitization: Extract structured data from unstructured Word documents or image-based reports.
- Content Moderation: Batch process media files to identify and label sensitive or specific content categories.
- Workflow Automation: Large-scale data labeling projects that require consistent schema adherence and systematic progress tracking.
Example Prompts
- "I have a folder of 500 product photos in /data/inventory; please analyze them and label each with its category based on the requirements in /docs/labeling_rules.docx."
- "Start a new data annotation project for the surveillance videos located in /mnt/storage/videos, following the schema defined in the project dashboard."
- "Please review the current progress of my ongoing dataset generation task and update the plan.json file with the newly processed items."
Tips & Limitations
- Plan-Driven Priority: Never attempt to process your entire dataset in one single command. Always utilize the plan.json mechanism to track progress and prevent timeout errors during long-running tasks.
- Systematic Processing: The skill is designed for sequential processing (1 item at a time). This ensures that if a model call fails or a network timeout occurs, you can resume exactly where you left off without losing data.
- Resource Management: For video data, ensure you perform frame extraction as a preprocessing step using ffmpeg to keep your LLM/VL API tokens efficient and cost-effective.
- Verification: While the AI provides high-accuracy labels, always use the built-in Web UI to conduct final quality assurance before deploying the dataset for production use.
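The plan-driven, one-item-at-a-time approach described in the tips above can be sketched roughly as follows. This is an illustration only: the actual plan.json schema used by the skill is not documented here, so the `items`, `path`, `status`, `label`, and `error` fields are hypothetical, as are the helper names and the `annotate` callback that stands in for the model call.

```python
import json
from pathlib import Path


def load_plan(plan_path: Path, items: list) -> dict:
    """Load an existing plan, or create one with every item marked pending."""
    if plan_path.exists():
        return json.loads(plan_path.read_text(encoding="utf-8"))
    # Hypothetical schema: one entry per item with a status and a label slot.
    return {"items": [{"path": p, "status": "pending", "label": None} for p in items]}


def save_plan(plan_path: Path, plan: dict) -> None:
    """Write the plan back to disk so progress survives interruptions."""
    plan_path.write_text(json.dumps(plan, ensure_ascii=False, indent=2), encoding="utf-8")


def run_plan(plan_path: Path, items: list, annotate) -> dict:
    """Process items sequentially, skipping any that are already done."""
    plan = load_plan(plan_path, items)
    for entry in plan["items"]:
        if entry["status"] == "done":
            continue  # finished in an earlier run
        try:
            entry["label"] = annotate(entry["path"])  # stand-in for the model call
            entry["status"] = "done"
        except Exception as exc:
            # Record the failure; the item is retried on the next run.
            entry["status"] = "failed"
            entry["error"] = str(exc)
        save_plan(plan_path, plan)  # persist after every single item
    return plan
```

Because the plan is saved after each item, rerunning `run_plan` with the same plan file only touches entries that are still pending or failed, which is the resume behavior the tips describe.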
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-aowind-sjht-data-annotation": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: file-write, file-read, external-api, code-execution
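If clawhub.json already contains other plugins, pasting the plugin snippet from this section by hand risks overwriting them. A minimal sketch of merging the entry programmatically is shown below; the helper name `enable_plugin` and the location of the config file are assumptions, and only the key and the two settings come from the snippet itself.

```python
import json
from pathlib import Path

# Key and settings taken from the clawhub.json snippet in this section.
PLUGIN_KEY = "official-aowind-sjht-data-annotation"


def enable_plugin(config_path: Path) -> dict:
    """Add or update the plugin entry while leaving other settings untouched."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    plugins = config.setdefault("plugins", {})
    plugins[PLUGIN_KEY] = {"enabled": True, "auto_update": True}
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Calling `enable_plugin(Path("clawhub.json"))` from the directory that holds the config file would apply the change; where that file actually lives depends on your OpenClaw setup.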
Related Skills
wisdom-forum
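Automation skill for the Century Wisdom Forum (世纪智慧论坛). Supports automatic registration, browsing posts, publishing new posts, and replying to posts. Forum URL: http://8.134.249.230/wisdom/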
web-screenshot
Capture screenshots of web pages running on local or remote servers using Puppeteer in headless Chromium. Use when user asks to screenshot web pages, capture web UI, take website screenshots, or document web application interfaces. Supports login-required SPAs (Vue/React/Angular) by performing form-based authentication before navigating. Generates screenshots and an optional result.json with per-page descriptions.
doubao-image-gen
Text-to-image generation with the Doubao Seedream model. Supports concurrent batch generation and outputs a gallery preview page.
long-running-harness
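Workflow framework for long-running agent projects (based on Anthropic's "Effective Harnesses for Long-Running Agents"). Used to create, manage, and schedule long-term project tasks that span multiple context windows. Use when: starting a new project, initializing a project workflow, managing a project task list, dispatching sub-agents for incremental development, restoring project state, or generating project progress reports. Trigger phrases include: "启动项目" (start project), "初始化项目" (initialize project), "创建工作流" (create workflow), "项目进度" (project progress), "继续开发" (continue development), "管理任务列表" (manage task list), "分配任务" (assign tasks), "next feature", "project status".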
hair-cam-anno
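Dataset annotation tool for fine-tuning VL models on security camera video. Extracts key frames from security camera footage, analyzes the video content, generates structured annotations (covering environment, people, behavior, and risk descriptions), and outputs fine-tuning training data in the dataset.jsonl format. Use when the user needs to annotate security camera video, generate a VL model training dataset, process video data under the /root/hair-cam directory, or mentions "hair-cam", "数据标注" (data annotation), "视频标注" (video annotation), or "VL模型微调" (VL model fine-tuning).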