Chinese NLP Toolkit
Specialized natural language processing for Chinese text. Covers segmentation (jieba), sentiment analysis, keyword extraction, text summarization, tone detection, readability scoring, and format conversion (simplified/traditional, pinyin annotation). Use when processing, analyzing, or transforming Chinese text content.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/371166758-qq/chinese-nlp-toolkit
Process and analyze Chinese text with specialized NLP capabilities.
Core Capabilities
1. Text Segmentation (分词)
Written Chinese does not mark word boundaries with spaces, so segmentation is the foundation of all Chinese NLP.
Approach: Use rule-based heuristics when no library is available:
- Dictionary matching (maximum forward/backward matching)
- Context-aware: "南京市长江大桥" → ["南京市", "长江大桥"] not ["南京", "市长", "江大桥"]
- Domain-specific terms should be added as custom dictionary entries
Common Ambiguities:
| Text | Wrong Split | Correct Split |
|---|---|---|
| 雨伞 | 雨/伞 | 雨伞 (compound) |
| 结婚的和尚未结婚的 | 结婚/的/和尚/未/结婚/的 | 结婚/的/和/尚未/结婚/的 |
| 项目部 | 项目/部 | 项目部 (compound) |
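The dictionary-matching approach above can be sketched as follows. This is a minimal illustration, not the toolkit's own implementation: the `DICTIONARY` contents, `MAX_WORD_LEN`, and both function names are assumptions made for the example. It shows how forward maximum matching produces the wrong split of "结婚的和尚未结婚的" from the table, while backward maximum matching recovers the correct one.

```python
# Toy dictionary for illustration only; a real system needs a large lexicon
# plus custom domain-specific entries.
DICTIONARY = {
    "南京", "南京市", "市长", "长江", "长江大桥", "大桥",
    "结婚", "的", "和", "和尚", "尚未", "未",
}
MAX_WORD_LEN = 4  # longest dictionary entry, in characters


def forward_max_match(text, dictionary=DICTIONARY, max_len=MAX_WORD_LEN):
    """Scan left to right, taking the longest dictionary word at each position."""
    words, start = [], 0
    while start < len(text):
        for size in range(min(max_len, len(text) - start), 0, -1):
            candidate = text[start:start + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                start += size
                break
    return words


def backward_max_match(text, dictionary=DICTIONARY, max_len=MAX_WORD_LEN):
    """Scan right to left, taking the longest dictionary word ending at each position."""
    words, end = [], len(text)
    while end > 0:
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                end -= size
                break
    return words[::-1]


print(forward_max_match("南京市长江大桥"))
print(forward_max_match("结婚的和尚未结婚的"))
print(backward_max_match("结婚的和尚未结婚的"))
```

Neither direction is right in every case, which is why the two are often run together and disagreements resolved with extra heuristics or context.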
2. Sentiment Analysis (情感分析)
Beyond positive/negative — Chinese sentiment is nuanced:
Intensity levels: 强烈负面 < 偏负面 < 中性 < 偏正面 < 强烈正面
Chinese-specific signals:
- Rhetorical questions often indicate negative sentiment: "这也算好?"
- Sarcasm markers: "呵呵", "厉害了", "也是醉了", "你开心就好"
- Intensifiers: "非常", "特别", "简直了", "超级"
- Diminishers: "还行吧", "马马虎虎", "凑合"
Emoji contribution (critical for social media):
- 😊👍❤️ = positive amplification
- 😤👎💔 = negative amplification
- 🙄🙄🙄 = sarcasm/disdain (intensity scales with repetition)
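A rough lexicon-based sketch of how these signals might combine into the five intensity levels. The word lists `POSITIVE` and `NEGATIVE`, the numeric weights, and the thresholds are all assumptions chosen for illustration; the intensifier, sarcasm, and emoji lists come from the section above. It uses plain substring counts and does not handle negation such as "不好".

```python
# Hypothetical mini-lexicons; real lexicons are far larger.
POSITIVE = ("好", "满意", "喜欢")
NEGATIVE = ("差", "失望", "糟糕")
INTENSIFIERS = ("非常", "特别", "简直了", "超级")
SARCASM_MARKERS = ("呵呵", "厉害了", "也是醉了", "你开心就好")
POSITIVE_EMOJI = ("😊", "👍", "❤")
NEGATIVE_EMOJI = ("😤", "👎", "💔")


def sentiment_level(text):
    """Map text to one of the five intensity levels using simple counts."""
    score = sum(text.count(w) for w in POSITIVE) - sum(text.count(w) for w in NEGATIVE)
    # Intensifiers double the lexical score (assumed weight).
    if any(w in text for w in INTENSIFIERS):
        score *= 2
    # Emoji amplify in their own direction; each eye-roll adds disdain,
    # so intensity scales with repetition.
    score += sum(text.count(e) for e in POSITIVE_EMOJI)
    score -= sum(text.count(e) for e in NEGATIVE_EMOJI)
    score -= text.count("🙄")
    # Sarcasm markers flip any apparent positivity to negative.
    if any(m in text for m in SARCASM_MARKERS):
        score = -abs(score) - 1
    if score <= -2:
        return "强烈负面"
    if score == -1:
        return "偏负面"
    if score == 0:
        return "中性"
    if score == 1:
        return "偏正面"
    return "强烈正面"


print(sentiment_level("这家店非常好"))
print(sentiment_level("呵呵,你开心就好"))
```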
3. Keyword Extraction (关键词提取)
For Chinese text, prioritize:
- Noun phrases (名词短语)
- Domain-specific terminology
- Named entities (人名、地名、机构名)
Method: TF-IDF adapted for Chinese + positional weighting (first/last sentences carry more weight in Chinese writing).
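One way to read this method, as a minimal sketch: term frequency is counted over pre-segmented sentences, tokens in the first and last sentence receive a boost, and the result is multiplied by a smoothed IDF. The function name, the `edge_boost` value of 1.5, the smoothing formula, and the document-frequency figures in the usage example are all assumptions, not values from the toolkit.

```python
import math
from collections import Counter


def extract_keywords(sentences, doc_freq, n_docs, top_k=5, edge_boost=1.5):
    """Rank tokens by position-weighted TF multiplied by smoothed IDF.

    sentences: list of token lists (text must already be segmented).
    doc_freq:  token -> number of reference documents containing it.
    n_docs:    size of the reference corpus.
    """
    weighted_tf = Counter()
    last = len(sentences) - 1
    for index, tokens in enumerate(sentences):
        # Tokens in the first and last sentence count more (assumed boost).
        weight = edge_boost if index in (0, last) else 1.0
        for token in tokens:
            weighted_tf[token] += weight
    scores = {
        token: tf * (math.log((n_docs + 1) / (doc_freq.get(token, 0) + 1)) + 1.0)
        for token, tf in weighted_tf.items()
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda token: (-scores[token], token))[:top_k]


# Made-up corpus statistics, for illustration only.
sentences = [
    ["人工智能", "改变", "医疗"],
    ["医院", "使用", "人工智能", "的", "系统"],
    ["人工智能", "的", "未来"],
]
doc_freq = {"的": 1000, "使用": 400, "改变": 300, "人工智能": 20,
            "医疗": 50, "医院": 60, "系统": 200, "未来": 250}
print(extract_keywords(sentences, doc_freq, n_docs=1000, top_k=2))
```

Restricting candidates to noun phrases and named entities would need a part-of-speech tagger, which this sketch leaves out.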
4. Text Summarization (文本摘要)
Chinese-specific rules:
- Summarize to 20-30% of original length
- Preserve key numbers, names, and claims
- Chinese articles often "bury the lead" — the conclusion may be more important than the introduction
- Extract key sentences using positional + keyword scoring
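The rules above could be combined into a simple extractive summarizer along these lines. This is a sketch under stated assumptions: the function name, the scoring weights (1 point per keyword hit, 1 for first or last position, 0.5 for containing a digit), and the sample text are all invented for illustration.

```python
import re


def summarize(text, keywords, ratio=0.3):
    """Pick the highest-scoring sentences until about `ratio` of the length is used."""
    sentences = [s for s in re.findall(r"[^。!?]+[。!?]?", text) if s.strip()]
    budget = max(1, int(len(text) * ratio))
    last = len(sentences) - 1

    def score(index):
        sentence = sentences[index]
        keyword_hits = sum(sentence.count(k) for k in keywords)
        # First and last sentences get a bonus; the conclusion often carries the point.
        position_bonus = 1.0 if index in (0, last) else 0.0
        # Sentences with numbers get a small bonus so key figures survive.
        number_bonus = 0.5 if re.search(r"\d", sentence) else 0.0
        return keyword_hits + position_bonus + number_bonus

    ranked = sorted(range(len(sentences)), key=lambda i: -score(i))
    chosen, used = [], 0
    for index in ranked:
        # Always keep the top sentence, then add more only while within budget.
        if not chosen or used + len(sentences[index]) <= budget:
            chosen.append(index)
            used += len(sentences[index])
    # Restore original order so the summary reads naturally.
    return "".join(sentences[i] for i in sorted(chosen))


text = "公司今年发布了新产品。市场反应平平。管理层召开了多次会议。员工加班较多。最终公司全年营收增长了35%。"
print(summarize(text, keywords=["营收", "增长", "公司"]))
```

In this sample the closing sentence wins over the opening one, which matches the observation that the conclusion may matter more than the introduction.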
5. Readability Scoring (可读性评分)
Rate Chinese text on a 1-10 scale considering:
- Average sentence length (characters per sentence)
- Vocabulary difficulty (HSK level estimate)
- Clause density (commas per sentence)
- Use of classical Chinese elements
- Technical jargon density
| Score | Level | Target Audience |
|---|---|---|
| 1-3 | Easy | General public |
| 4-6 | Moderate | Educated readers |
| 7-8 | Hard | Domain experts |
| 9-10 | Very Hard | Academic specialists |
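A partial sketch of how such a score might be computed. It covers only the two factors that need no external data (average sentence length and clause density); HSK level, classical elements, and jargon density would need word lists that are not shown here. The weighting formula in `readability_score` is an assumption for illustration, while `level_for` simply encodes the table above.

```python
import re


def readability_features(text):
    """Return (average characters per sentence, average commas per sentence)."""
    sentences = [s for s in re.split(r"[。!?]", text) if s.strip()]
    count = max(1, len(sentences))
    avg_length = sum(len(s) for s in sentences) / count
    clause_density = sum(s.count(",") for s in sentences) / count
    return avg_length, clause_density


def readability_score(text):
    """Combine the two features into a 1-10 score (assumed weights)."""
    avg_length, clause_density = readability_features(text)
    raw = 1 + avg_length / 10 + clause_density
    return min(10, max(1, round(raw)))


def level_for(score):
    """Map a 1-10 score to the level names in the table."""
    if score <= 3:
        return "Easy"
    if score <= 6:
        return "Moderate"
    if score <= 8:
        return "Hard"
    return "Very Hard"


sample = "我喜欢猫。你喜欢狗。"
print(readability_features(sample), level_for(readability_score(sample)))
```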
6. Format Conversion
| Conversion | Example |
|---|---|
| Simplified → Traditional | 体验 → 體驗 |
| Traditional → Simplified | 體驗 → 体验 |
| Chinese → Pinyin | 你好 → nǐ hǎo |
| Chinese → Zhuyin | 你好 → ㄋㄧˇ ㄏㄠˇ |
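The conversions in the table can be illustrated with tiny hand-written lookup tables. The tables below are assumptions covering only the example characters; a real converter needs a full mapping (libraries such as OpenCC and pypinyin exist for this) and must handle cases where one simplified character maps to several traditional ones, such as 发 → 發 or 髮.

```python
# Toy mapping tables covering only the characters in the examples above.
S2T = str.maketrans({"体": "體", "验": "驗"})
T2S = str.maketrans({"體": "体", "驗": "验"})
PINYIN = {"你": "nǐ", "好": "hǎo"}
ZHUYIN = {"你": "ㄋㄧˇ", "好": "ㄏㄠˇ"}


def annotate(text, table):
    """Replace each character with its reading; unknown characters pass through."""
    return " ".join(table.get(ch, ch) for ch in text)


print("体验".translate(S2T))
print("體驗".translate(T2S))
print(annotate("你好", PINYIN))
print(annotate("你好", ZHUYIN))
```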
Workflow
When Processing Chinese Text:
- Detect variant: Simplified (简体) or Traditional (繁体)?
- Segment: Break into meaningful units
- Analyze: Apply the requested analysis type(s)
- Report: Present results with Chinese annotations
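The first workflow step, variant detection, might be sketched by counting characters that exist in only one script. The two character sets and the function name are assumptions for illustration and are far too small for real use.

```python
# Hypothetical sample sets; a real detector would use a full variant table.
SIMPLIFIED_ONLY = set("体验这说国学")
TRADITIONAL_ONLY = set("體驗這說國學")


def detect_variant(text):
    """Return 'simplified', 'traditional', or 'unknown' based on distinctive characters."""
    simplified_hits = sum(ch in SIMPLIFIED_ONLY for ch in text)
    traditional_hits = sum(ch in TRADITIONAL_ONLY for ch in text)
    if simplified_hits > traditional_hits:
        return "simplified"
    if traditional_hits > simplified_hits:
        return "traditional"
    # Text made only of shared characters cannot be classified this way.
    return "unknown"


print(detect_variant("这是中国"))
print(detect_variant("這是中國"))
print(detect_variant("你好"))
```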
Paste this into your clawhub.json to enable this plugin.
{
"plugins": {
"official-371166758-qq-chinese-nlp-toolkit": {
"enabled": true,
"auto_update": true
}
}
}