xiaohongshu-scraper
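Scrapes and organizes Xiaohongshu content. Searches Xiaohongshu notes, extracts their details (body text, comments, images), and generates organized Markdown documents. Use it when the user asks to search Xiaohongshu, find Xiaohongshu guides, or organize Xiaohongshu content.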
Why use this skill?
Efficiently scrape, parse, and export Xiaohongshu notes. Features include image downloading, OCR text recognition, and automated Markdown generation for AI agents.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/ty-teo/xiaohongshu-scraper
What This Skill Does
The xiaohongshu-scraper is a specialized automation tool designed for OpenClaw agents to interface with Xiaohongshu (Little Red Book). It allows the agent to ingest content from platform URLs, providing a comprehensive data extraction service. Beyond simple scraping, it excels in downloading associated media assets (images), performing OCR (Optical Character Recognition) on visual content to extract text, and structuring the gathered information into clean, readable Markdown files or machine-readable JSON objects. It manages the lifecycle of local API services, ensuring that data retrieval is robust and persistent.
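As a rough illustration of the "structure into Markdown" step described above, here is a minimal Python sketch. The `Note` type and its field names are hypothetical and chosen only for this example; the skill's actual output schema is not documented here and may differ.

```python
from typing import TypedDict


class Note(TypedDict):
    # Hypothetical fields for illustration; not the skill's real schema.
    title: str
    author: str
    url: str
    body: str
    image_text: list[str]  # text recovered from images by OCR


def note_to_markdown(note: Note) -> str:
    """Render one note as a Markdown section."""
    lines = [
        f"## {note['title']}",
        "",
        f"- Author: {note['author']}",
        f"- Source: {note['url']}",
        "",
        note["body"].strip(),
    ]
    # Only add the OCR section when some image text was extracted.
    if note["image_text"]:
        lines += ["", "### Text extracted from images", ""]
        lines += [f"- {text.strip()}" for text in note["image_text"]]
    return "\n".join(lines) + "\n"
```

Keeping OCR output in its own subsection makes it easy to tell the author's description apart from text recognized in images, which may contain recognition errors.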
Installation
To integrate this skill into your environment, use the OpenClaw management command:
clawhub install openclaw/skills/skills/ty-teo/xiaohongshu-scraper
Ensure that you have the necessary dependencies for the scraper and the underlying XHS-Downloader library. You must also initialize the local API service by running:
./scripts/xhs-api-service.sh start
This service listens on port 5556 and acts as the bridge between your agent and the Xiaohongshu servers.
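If you want to confirm from code that something is accepting connections on port 5556, a plain TCP check is one option. The sketch below assumes the service binds to the local machine (127.0.0.1), which the text above does not state explicitly; the function name is hypothetical. It only tests that the port is open, not that the service behind it is healthy.

```python
import socket


def service_is_listening(host: str = "127.0.0.1", port: int = 5556,
                         timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("listening" if service_is_listening() else "not reachable")
```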
Use Cases
- Market Research: Aggregate top-performing posts for specific keywords or hashtags to identify current consumer trends.
- Content Curation: Automatically save high-quality guides, recipes, or travel itineraries from Xiaohongshu into your personal knowledge base in Markdown format.
- Competitor Monitoring: Track engagement metrics (likes, shares, comments) for specific authors or topics over time.
- Asset Management: Batch download original high-resolution images from posts for design or creative inspiration archives.
Example Prompts
- "Please search for 'best minimalist desk setups' on Xiaohongshu, extract the content of the top 3 posts, and save them as a single Markdown file in my research folder."
- "Go to this URL: [paste link here] and extract the text from the images using OCR, then summarize the core advice provided in the notes."
- "Can you check the current engagement count for these five Xiaohongshu notes and tell me which one has the highest share-to-like ratio?"
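The share-to-like ratio in the last prompt is simple arithmetic: shares divided by likes. A minimal sketch follows, with hypothetical note IDs and made-up counts; treating a note with zero likes as ratio 0.0 is a choice made for this example, not something the skill defines.

```python
def share_to_like_ratio(shares: int, likes: int) -> float:
    """Shares divided by likes; 0.0 when there are no likes (example convention)."""
    if likes == 0:
        return 0.0
    return shares / likes


def highest_ratio(counts: dict[str, tuple[int, int]]) -> str:
    """Return the note ID whose (shares, likes) pair gives the largest ratio."""
    return max(counts, key=lambda note_id: share_to_like_ratio(*counts[note_id]))


# Made-up counts as (shares, likes), for illustration only.
counts = {
    "note-a": (30, 600),   # ratio 0.05
    "note-b": (45, 300),   # ratio 0.15
    "note-c": (10, 1000),  # ratio 0.01
}
print(highest_ratio(counts))  # prints "note-b"
```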
Tips & Limitations
- Performance: OCR processing is resource-intensive. If you only need the text content from the post description, use the --no-ocr flag to significantly speed up retrieval times.
- Service Status: If the agent reports connectivity issues, verify that the API service is running by executing ./xhs-api-service.sh status.
- API Limits: Frequent mass scraping may trigger rate limits from the platform. Always respect the platform's terms of service and avoid aggressive, high-frequency requests.
- Data Privacy: Be mindful of privacy settings for specific notes; the scraper can only access publicly available information.
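One way to avoid high-frequency requests is to enforce a minimum interval between consecutive fetches. The sketch below is a generic throttle and is not part of the skill: `throttled` and `fetch` are hypothetical names, and the 2-second default is an arbitrary illustration, not a documented platform limit.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def throttled(items: Iterable[T], fetch: Callable[[T], R],
              min_interval: float = 2.0,
              sleep: Callable[[float], None] = time.sleep,
              clock: Callable[[], float] = time.monotonic) -> Iterator[R]:
    """Call fetch(item) for each item, at least min_interval seconds apart."""
    last_start = None
    for item in items:
        if last_start is not None:
            # Wait out whatever remains of the interval since the last call began.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        yield fetch(item)
```

For example, `list(throttled(urls, fetch_note))` would process a list of URLs one at a time with pauses in between, where `fetch_note` stands in for whatever function retrieves a single note. The `sleep` and `clock` parameters exist so the spacing can be tested without real delays.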
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-ty-teo-xiaohongshu-scraper": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: network-access, file-write, file-read, external-api, code-execution
Related Skills
patent-assistant
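Patent disclosure drafting and patent search assistant. Helps R&D staff turn technical solutions into structured disclosure documents and carry out patent search analysis. Use it when the user asks to write a patent, draft a disclosure document, run a patent search, or perform a novelty search.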
nvidia-image-gen
Generate and edit images using NVIDIA FLUX models. Use when user asks to generate images, create pictures, edit photos, or modify existing images with AI. Supports text-to-image generation and image editing with text prompts.