Official Verified developer tools Safety 3/5

kekik-crawler

Scrapling-only, deterministic web crawler with clean SRP architecture, presets, checkpointing, and JSONL/report outputs.

Why use this skill?

Discover Kekik-crawler, a high-performance, headless web scraping skill for OpenClaw. Efficiently extract data with presets, JSONL output, and checkpointing support.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/keyiflerolsun/kekik-crawler

What This Skill Does

Kekik-crawler is a deterministic web crawler built for the OpenClaw ecosystem and optimized for speed and structured data extraction. Instead of driving a browser, it uses the Scrapling library to gather data without rendering JavaScript, which avoids the overhead of browser-based crawlers. The architecture follows the Single Responsibility Principle (SRP): crawling, parsing, and data serialization are decoupled so that each can be maintained independently. Checkpointing supports large-scale crawls, and results are written as JSONL records together with a JSON report. This makes the tool a good fit for data scientists, OSINT investigators, and developers who need reliable web scraping pipelines.
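
The separation of crawling, parsing, and serialization described above can be illustrated with a minimal Python sketch. It is not kekik-crawler's actual code: the function names (crawl, parse, serialize, run) and the record fields are hypothetical, and the fetcher is injected as a plain callable so the example runs without Scrapling or network access.

```python
import json
import re
from typing import Callable, Dict, Iterable, Iterator, Tuple

# All names below are hypothetical; this is not kekik-crawler's real API.


def crawl(urls: Iterable[str], fetch: Callable[[str], str]) -> Iterator[Tuple[str, str]]:
    """Crawling only: yield (url, raw_html) pairs using the injected fetcher."""
    for url in urls:
        yield url, fetch(url)


def parse(url: str, html: str) -> Dict[str, str]:
    """Parsing only: turn raw HTML into a plain record (here, just the title)."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    title = match.group(1).strip() if match else ""
    return {"url": url, "title": title}


def serialize(records: Iterable[Dict[str, str]]) -> str:
    """Serialization only: one JSON object per line (JSONL)."""
    return "".join(json.dumps(record, ensure_ascii=False) + "\n" for record in records)


def run(urls: Iterable[str], fetch: Callable[[str], str]) -> str:
    """Wire the three stages together; each stage can be swapped independently."""
    return serialize(parse(url, html) for url, html in crawl(urls, fetch))


if __name__ == "__main__":
    # A dictionary stands in for the network so the sketch is self-contained.
    fake_pages = {"https://example.com/": "<html><title>Example</title></html>"}
    print(run(fake_pages, fake_pages.__getitem__), end="")
```

Because each stage only knows about its own input and output, a different fetcher or output format can be substituted without touching the other two functions.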

Installation

To integrate this skill into your OpenClaw environment, use the internal skill manager:

clawhub install openclaw/skills/skills/keyiflerolsun/kekik-crawler

Install the required dependencies by running pip install -r requirements.txt inside the skill directory. The entry point is main.py; the crawl itself is orchestrated by core/crawl_runner.py.

Use Cases

  • OSINT & Person Research: Utilize the person-research preset to aggregate information across various domains for specific identities or aliases.
  • Deep Research & Aggregation: Employ the deep-research preset to perform recursive crawls, ideal for building training datasets or comprehensive knowledge bases.
  • Automated Data Pipelines: Integrate into larger workflows where you need to extract structured data into JSONL formats for ingestion into vector databases or LLM fine-tuning pipelines.
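
For the pipeline use case, a downstream step typically reads the JSONL output one record at a time. The sketch below assumes only that each non-empty line is a standalone JSON object; the file name in the usage note and any field names are hypothetical, since the listing does not document the record schema.

```python
import json
from pathlib import Path
from typing import Iterator


def read_jsonl(path: Path) -> Iterator[dict]:
    """Yield one record per non-empty line of a JSONL file."""
    with path.open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                # Report the offending line so a bad record is easy to locate.
                raise ValueError(f"{path}:{line_number}: invalid JSON") from exc
```

A call such as list(read_jsonl(Path("outputs/results.jsonl"))) would load every record, where results.jsonl is a placeholder name. Streaming with a generator keeps memory use flat even for large crawls.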

Example Prompts

  1. "Use kekik-crawler with the person-research preset to find all mentions of 'John Doe' and save the output to my local outputs folder."
  2. "Perform a deep-research crawl on the tech news aggregate sites to gather content for my weekly analysis report."
  3. "Run the crawler on the provided URL list and ensure a full JSON report is generated for tracking purposes."

Tips & Limitations

  • Efficiency: Because this tool does not render JavaScript, it is fast, but it cannot capture data from sites that rely solely on dynamic client-side rendering.
  • Storage: Always check the outputs/ directory for your JSONL data and summary reports. Periodically clear old files to maintain disk space.
  • Determinism: The tool is designed to be deterministic. If a crawl fails, restart it with the existing checkpoint configuration and it resumes where it left off.
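
The resume behaviour mentioned in the tips can be sketched generically as follows. This is an illustration of checkpointed, deterministic crawling in general, not kekik-crawler's implementation: the function name crawl_with_checkpoint, the visit callable, and the checkpoint file layout (a JSON list of completed URLs) are all assumptions.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List


def crawl_with_checkpoint(
    urls: Iterable[str],
    visit: Callable[[str], None],
    checkpoint: Path,
) -> List[str]:
    """Visit URLs in sorted order, skipping any already recorded in the checkpoint."""
    if checkpoint.exists():
        done = set(json.loads(checkpoint.read_text(encoding="utf-8")))
    else:
        done = set()

    visited_now = []
    for url in sorted(set(urls)):  # a fixed order keeps repeated runs deterministic
        if url in done:
            continue  # finished in an earlier run
        visit(url)  # may raise; progress saved so far stays in the checkpoint
        done.add(url)
        visited_now.append(url)
        # Write to a temporary file, then rename, so a crash cannot leave
        # a half-written checkpoint behind.
        tmp = checkpoint.with_suffix(".tmp")
        tmp.write_text(json.dumps(sorted(done)), encoding="utf-8")
        tmp.replace(checkpoint)
    return visited_now
```

Calling the function again with the same URL list and checkpoint path after a failure visits only the URLs that were not completed, which is the resume pattern the tip describes.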

Metadata

Stars: 1776
Views: 1
Updated: 2026-03-02

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-keyiflerolsun-kekik-crawler": {
      "enabled": true,
      "auto_update": true
    }
  }
}
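
If clawhub.json already lists other plugins, pasting the snippet by hand risks overwriting them. The following Python sketch merges the entry instead. It assumes only that clawhub.json is plain JSON with the "plugins" layout shown above; the helper name enable_plugin is hypothetical and is not part of any ClawHub tooling.

```python
import json
from pathlib import Path

# Plugin key copied from the configuration snippet above.
PLUGIN_ID = "official-keyiflerolsun-kekik-crawler"


def enable_plugin(config_path: Path, plugin_id: str = PLUGIN_ID) -> dict:
    """Add or update one plugin entry while leaving other settings untouched."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}  # start from an empty configuration

    plugins = config.setdefault("plugins", {})
    entry = plugins.setdefault(plugin_id, {})
    entry.update({"enabled": True, "auto_update": True})

    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Running enable_plugin(Path("clawhub.json")) from the directory that holds the file would produce the same entry as the snippet while preserving any existing plugins.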

Tags (AI)

#web-scraping #crawler #osint #data-extraction #headless
Safety Score: 3/5

Flags: network-access, file-write, file-read