xiaohongshu-extract
Extract metadata from Xiaohongshu (XHS) share or discovery URLs by parsing window.__INITIAL_STATE__ and returning note details. Use when asked to fetch XHS page content, note metadata, video info, or engagement stats from a public XHS link.
Why use this skill?
Efficiently extract metadata from Xiaohongshu links using the OpenClaw xiaohongshu-extract skill. Get titles, engagement stats, and video data in structured JSON format.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/jovijovi/xiaohongshu-extract
What This Skill Does
The xiaohongshu-extract skill parses metadata from Xiaohongshu (XHS) URLs. It fetches the target URL, parses the window.__INITIAL_STATE__ object embedded in the page source, and extracts note titles, descriptions, engagement metrics, user profiles, and video stream details. The result is structured JSON suited to downstream analysis, content aggregation, or automated reporting. Output options include flattened data records and custom JSON error reports, so the skill fits both individual research and automated pipelines.
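The parsing step described above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation: the function name extract_initial_state is hypothetical, the page is assumed to assign the state as a JavaScript object literal inside a script tag, and bare undefined values (which JSON does not allow) are assumed to be safe to treat as null.

```python
import json
import re

MARKER = "window.__INITIAL_STATE__"


def extract_initial_state(html: str) -> dict:
    """Return the object assigned to window.__INITIAL_STATE__ in a page source.

    Hypothetical helper for illustration; the real skill may work differently.
    """
    start = html.find(MARKER)
    if start == -1:
        raise ValueError("window.__INITIAL_STATE__ not found in page source")

    # The object literal is assumed to begin at the first "{" after the marker.
    brace = html.find("{", start)
    if brace == -1:
        raise ValueError("no object literal found after the marker")

    # Limit the search to the enclosing script element when its end is visible.
    end = html.find("</script>", brace)
    payload = html[brace:] if end == -1 else html[brace:end]

    # Assumption: bare `undefined` values may appear and are mapped to null.
    # This simple pattern could also match inside string values.
    payload = re.sub(r"(?<=[:\[,])\s*undefined\b", "null", payload)

    # raw_decode parses one JSON value and ignores trailing text such as ";".
    state, _ = json.JSONDecoder().raw_decode(payload)
    return state


sample_html = (
    "<html><script>window.__INITIAL_STATE__="
    '{"note":{"title":"hello","video":undefined}};</script></html>'
)
state = extract_initial_state(sample_html)
print(state["note"]["title"])
```

The field names in sample_html are made up for the demonstration; the real state object's layout is not documented here and should be inspected before relying on any key.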
Installation
To integrate this skill into your environment, use the OpenClaw command-line interface. Run the following command in your terminal:
clawhub install openclaw/skills/skills/jovijovi/xiaohongshu-extract
Make sure your Python environment has the dependencies the script needs, since the skill relies on web scraping libraries to retrieve data from XHS servers.
Use Cases
- Content Aggregation: Collect engagement data (likes, collections, shares) for a specific set of XHS notes to track performance over time.
- Metadata Analysis: Automatically extract tags and descriptions to perform sentiment analysis or keyword mapping for viral marketing trends.
- Asset Archiving: Retrieve video stream URLs and user information to facilitate the building of an offline media archive or content portfolio.
- Research: Quickly extract data from URLs provided by users to answer questions about specific notes without having to manually parse page source code.
Example Prompts
- "Analyze this note URL: https://www.xiaohongshu.com/explore/xxxxxx and give me the engagement stats and the author's nickname."
- "Can you extract the video URL and technical specs from this XHS link? I need to know the resolution and duration."
- "Fetch the full metadata for this XHS note and provide it as a flattened JSON object for my spreadsheet input."
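For the spreadsheet case in the last prompt, a flattened record can be pictured as nested JSON collapsed into a single level of keys. The sketch below shows one possible approach; the dot-separated key format, the flatten helper, and the sample field names are assumptions for illustration and may not match the skill's actual flattened output.

```python
from typing import Any, Dict


def flatten(value: Any, prefix: str = "") -> Dict[str, Any]:
    """Collapse nested dicts and lists into one dict with dot-separated keys.

    Hypothetical helper; empty dicts and lists produce no keys.
    """
    flat: Dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            child_prefix = f"{prefix}.{key}" if prefix else str(key)
            flat.update(flatten(child, child_prefix))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            child_prefix = f"{prefix}.{index}" if prefix else str(index)
            flat.update(flatten(child, child_prefix))
    else:
        # Scalars (str, int, float, bool, None) end the recursion.
        flat[prefix] = value
    return flat


# Made-up note record for demonstration only.
note = {"title": "t", "interact": {"likes": 3}, "tags": ["a", "b"]}
record = flatten(note)
print(record)
```

Each key in the resulting dict maps to one spreadsheet column, which is why a flat shape is convenient for tabular tools.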
Tips & Limitations
- URL Format: Always prefer discovery URLs over share URLs for higher reliability. If the script fails to parse the initial state, verify that the link is public and accessible.
- Rate Limiting: Be aware that excessive scraping may trigger anti-bot measures on the XHS platform. Use this tool responsibly within the platform's terms of service.
- Dynamic Content: Some pages load content dynamically via JavaScript, and such content may be missing from the initial state the skill parses. If you receive an error, ensure the URL is valid and the page is currently active.
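One way to act on the rate-limiting tip is to space out requests when processing several links. The sketch below is an assumption-laden illustration: the paced helper, the fetch callback, and the two-second default interval are all hypothetical, and nothing here reflects a documented XHS limit or a feature of the skill itself.

```python
import time
from typing import Any, Callable, Iterable, Iterator, Optional, Tuple


def paced(
    urls: Iterable[str],
    fetch: Callable[[str], Any],
    min_interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Tuple[str, Any]]:
    """Call fetch(url) for each URL, keeping at least min_interval seconds
    between the start of consecutive calls. Hypothetical helper."""
    last_start: Optional[float] = None
    for url in urls:
        if last_start is not None:
            remaining = min_interval - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        yield url, fetch(url)


# Demonstration with stand-ins so no network access or real waiting occurs.
waits = []
results = list(
    paced(
        ["u1", "u2", "u3"],
        fetch=lambda url: url.upper(),
        sleep=waits.append,
        clock=lambda: 0.0,
    )
)
print(results, waits)
```

In real use, fetch would be whatever function retrieves and parses a single note, and the default sleep and clock would be left in place.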
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-jovijovi-xiaohongshu-extract": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: network-access, file-write, file-read, code-execution