web-fetcher
Fetch web pages and extract readable content for AI use. Use when reading, summarizing, or crawling a specific URL or small set of URLs. Prefer low-friction URL-to-Markdown services first, then fall back to browser-based retrieval, search snippets, or cached/indexed copies when sites are protected by Cloudflare or similar bot checks.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/aurthes/aurthes-web-fetcher
What This Skill Does
The web-fetcher skill is a multi-stage retrieval agent that extracts readable Markdown content from web pages. Rather than relying on a single GET request, it follows a reliability-first fallback strategy: it tries lightweight conversion services first, then resource-heavy browser rendering, then search-based information gathering. It is designed to handle common hurdles such as Cloudflare bot detection, JavaScript-heavy rendering, and inconsistent site structure, so that your AI agent receives clean, usable text.
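The ordering described above can be sketched as a small loop over stages. This is an illustrative sketch only, not the skill's actual implementation: the names `fetch_with_fallback`, `FetchResult`, and the three stand-in stage functions are hypothetical, and the stages return canned values so the example runs offline. Real stages would call a URL-to-Markdown service, a headless browser, or a search API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# A stage takes a URL and returns Markdown text, or raises on failure.
Stage = Callable[[str], str]


@dataclass
class FetchResult:
    url: str
    markdown: Optional[str] = None
    stage: Optional[str] = None  # name of the stage that succeeded, if any
    errors: List[Tuple[str, str]] = field(default_factory=list)  # (stage, reason)


def fetch_with_fallback(url: str, stages: List[Tuple[str, Stage]]) -> FetchResult:
    """Try each stage once, cheapest first, and stop at the first success."""
    result = FetchResult(url=url)
    for name, stage in stages:
        try:
            text = stage(url)
        except Exception as exc:  # a failure means: move on, do not retry
            result.errors.append((name, str(exc)))
            continue
        if text and text.strip():
            result.markdown = text
            result.stage = name
            return result
        result.errors.append((name, "empty response"))
    return result  # markdown stays None if every stage failed


# Hypothetical stand-in stages (assumptions for illustration only).
def markdown_converter(url: str) -> str:
    raise RuntimeError("blocked by bot check")


def browser_session(url: str) -> str:
    return "# Example article\n\nRendered body text."


def search_snippets(url: str) -> str:
    return "Snippet from a search index."


STAGES = [
    ("markdown-converter", markdown_converter),
    ("browser-session", browser_session),
    ("search-snippets", search_snippets),
]

result = fetch_with_fallback("https://example.com/article-123", STAGES)
print(result.stage)   # browser-session
print(result.errors)  # [('markdown-converter', 'blocked by bot check')]
```

Keeping the stages as plain callables in an ordered list means the cheap path is always tried first, and the record of failed stages is returned alongside the content rather than discarded.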
Installation
To integrate this skill into your environment, use the following command in your terminal:
clawhub install openclaw/skills/skills/aurthes/aurthes-web-fetcher
Make sure your environment is permitted to access the network. If you intend to use the browser fallback, a browser automation driver must also be available.
Use Cases
This skill is ideal for tasks requiring high-fidelity web ingestion. Common use cases include: summarizing long-form articles or technical whitepapers, extracting structured data from public websites, scraping metadata from research portals, and navigating protected sites where traditional crawlers fail. It is particularly effective for users who need to aggregate information from multiple sources that vary in their technical accessibility and security configurations.
Example Prompts
- "Go to https://example.com/article-123, extract the main content, and summarize the key findings in a bulleted list."
- "Can you find the current submission guidelines and ISSN for the Journal of AI Research? Use the web-fetcher to pull from their homepage, and if that is blocked, search for official cached mirrors."
- "Fetch the page at https://blog.provider.com and extract the pricing table. If the page is hidden behind a script wall, use the browser-fallback to get the raw data."
Tips & Limitations
Always be mindful of the 'Core Rule': do not promise access to every site. While the fallback chain (Markdown converters -> Browser session -> Search engines) covers most public pages, sites with strict login walls or aggressive legal restrictions will remain inaccessible. If the fetch returns a security challenge screen, treat it as a signal to move to the next fallback level rather than retrying the same method. When working with partial data, the skill will return valid fields while clearly flagging missing sections, allowing you to manually verify the inaccessible content later.
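Two of the behaviors described above, treating a challenge screen as a signal to move on and flagging missing fields in a partial result, could look roughly like this. It is a sketch under stated assumptions: `looks_like_challenge`, `flag_missing`, and the marker phrases are hypothetical and heuristic, not the skill's real detection logic, and the field names are made up for illustration.

```python
from typing import Dict, Iterable, List, Optional

# Heuristic markers only (an assumption): phrases often seen on
# bot-check interstitial pages. A real detector would need more signals.
CHALLENGE_MARKERS = (
    "just a moment",
    "checking your browser",
    "verify you are human",
)


def looks_like_challenge(body: str) -> bool:
    """Return True if the fetched body resembles a security challenge page."""
    lowered = body.lower()
    return any(marker in lowered for marker in CHALLENGE_MARKERS)


def flag_missing(
    extracted: Dict[str, Optional[str]], required: Iterable[str]
) -> Dict[str, object]:
    """Keep the fields that were found and list the ones that were not."""
    fields: Dict[str, str] = {}
    missing: List[str] = []
    for name in required:
        value = extracted.get(name)
        if value and value.strip():
            fields[name] = value.strip()
        else:
            missing.append(name)
    return {"fields": fields, "missing": missing, "complete": not missing}


page = "<html><title>Just a moment...</title></html>"
print(looks_like_challenge(page))  # True

report = flag_missing(
    {"title": "Journal homepage", "issn": None},
    ["title", "issn", "submission_guidelines"],
)
print(report["missing"])   # ['issn', 'submission_guidelines']
print(report["complete"])  # False
```

A caller would check `looks_like_challenge` on each response and advance to the next fallback level when it returns True, then pass whatever was extracted through `flag_missing` so the gaps are explicit.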
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-aurthes-aurthes-web-fetcher": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags (AI)
Flags: network-access, external-api