What This Skill Does

The news-content-extractor is an OpenClaw skill for gathering article text from the web. Unlike traditional scrapers that require heavy local dependencies or browser emulation, it sends each URL to a remote backend built on the trafilatura library. The backend strips navigation bars, advertisements, footer clutter, and tracking scripts, and returns clean, structured text to the OpenClaw agent: the title, original author, publication timestamp, and core content body. Because extraction runs behind an API, the agent stays lightweight while still handling news URLs across a wide range of domains.
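As a rough illustration of the structured result described above, an agent-side container might look like the sketch below. The class name, the field names, and the payload keys ("title", "author", "date", "text") are assumptions made for this example; the skill's actual response schema is not documented here.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ExtractedArticle:
    """Illustrative container; field names are assumptions, not the skill's schema."""

    title: Optional[str]
    author: Optional[str]
    published: Optional[str]  # timestamp kept as the raw string that was received
    text: str

    @classmethod
    def from_payload(cls, payload: Mapping[str, Any]) -> "ExtractedArticle":
        # Hypothetical keys: "title", "author", "date", "text".
        # Missing or empty metadata becomes None; missing text becomes "".
        return cls(
            title=payload.get("title") or None,
            author=payload.get("author") or None,
            published=payload.get("date") or None,
            text=(payload.get("text") or "").strip(),
        )

    @property
    def is_empty(self) -> bool:
        # True when extraction produced no body text.
        return not self.text
```

Keeping the metadata fields optional reflects that many pages do not expose an author or a publication date, while the body text is the one field an agent always needs to check.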

Installation

To install this skill into your OpenClaw environment, run the following command in your terminal:

clawhub install openclaw/skills/skills/fonilye/news-content

Once installed, you must configure your environment variables to ensure proper authentication and connectivity. Set EASYALPHA_API_KEY to your assigned token. If you are using a custom instance, define NEWS_EXTRACTOR_SERVER_URL; otherwise, the system will default to the provided testing environment. No additional Python libraries are required on your host machine as all heavy lifting is performed remotely.
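The configuration rules above can be sketched as a small helper that fails early when the key is missing and falls back to a default server otherwise. This is only a sketch: the function name is hypothetical, and DEFAULT_SERVER_URL is a placeholder, since the address of the provided testing environment is not stated here.

```python
import os

# Placeholder only: the real default testing endpoint is not given in this document.
DEFAULT_SERVER_URL = "https://extractor.example.invalid"


def load_extractor_config(env=None):
    """Read the two environment variables named above and apply the default."""
    env = os.environ if env is None else env

    api_key = env.get("EASYALPHA_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("EASYALPHA_API_KEY is not set")

    # Use the custom instance if one is defined, otherwise the default.
    server_url = env.get("NEWS_EXTRACTOR_SERVER_URL", "").strip() or DEFAULT_SERVER_URL
    return {"api_key": api_key, "server_url": server_url.rstrip("/")}
```

Calling load_extractor_config() with no arguments reads the real process environment; passing a plain dict makes the behavior easy to check without touching it.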

Use Cases

This skill is ideal for AI agents performing:

Research Synthesis: Aggregating articles for daily news briefings or industry intelligence reports.
Content Monitoring: Automatically tracking developments in specific domains by scraping URLs provided by the user.
Archive Creation: Converting live, ad-heavy web pages into clean, portable text formats for offline reading or storage.
Data Pre-processing: Preparing unstructured web data for summarization or sentiment analysis tasks.

Example Prompts

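"Fetch the content of this web page: https://www.bbc.com/news/uk-12345678"
"Parse the news URL below, keep only the main body text, and list the author: https://techcrunch.com/2023/10/example-article"
"Read this tech news link for me and summarize its main points and publication time: https://www.theverge.com/example"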

Tips & Limitations

Authentication: Always keep your EASYALPHA_API_KEY secure. Do not share the config file containing this key in public repositories.
Error Handling: The skill handles most standard article layouts, but pages that depend heavily on JavaScript rendering or sit behind anti-scraping measures (such as Cloudflare challenges) may occasionally fail. If a URL returns no content, verify that the URL is directly accessible.
Rate Limiting: Be mindful of your API quota. Rapid-fire requests to the same domain might be flagged by the source website; use this skill responsibly to avoid being blocked by publishers.
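The error-handling and rate-limiting tips above can be sketched on the agent side as a URL sanity check plus a per-domain throttle. Both names are hypothetical helpers rather than part of the skill, and the 2-second default gap is an arbitrary assumption, not a documented quota.

```python
import time
from urllib.parse import urlsplit


def is_http_url(url: str) -> bool:
    """Return True for absolute http(s) URLs that include a host."""
    parts = urlsplit(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


class DomainThrottle:
    """Enforces a minimum gap between requests to the same host."""

    def __init__(self, min_interval: float = 2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # assumed value; tune to your own quota
        self._clock = clock
        self._sleep = sleep
        self._last_seen = {}  # host -> time of the most recent request

    def wait(self, url: str) -> float:
        """Sleep if needed before a request to url; return the seconds slept."""
        host = urlsplit(url).netloc.lower()
        now = self._clock()
        last = self._last_seen.get(host)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        # Record when this request is actually sent (after any sleep).
        self._last_seen[host] = now + delay
        return delay
```

In use, an agent would call is_http_url(url) before submitting a link and throttle.wait(url) immediately before each extraction request, so that bursts against a single publisher are spread out while requests to different domains are not delayed.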
