Official Verified developer tools Safety 4/5

arxivkb

Local arXiv paper manager with semantic search. Crawls arXiv categories, downloads PDFs, chunks content, and indexes with FAISS + Ollama embeddings. No cloud API keys required — everything runs locally.

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/camopel/arxivkb

What This Skill Does

ArXivKB is a locally hosted knowledge management system for researchers and engineers who need to keep up with new academic publications. Rather than querying a cloud search API, it crawls the arXiv categories you choose, downloads the full PDFs, and indexes their content with semantic vector embeddings. FAISS handles similarity search and Ollama produces the text embeddings locally, so the index stays private, incurs no API fees, and can be searched offline once papers are ingested. It works with any arXiv category, including Computer Science, Physics, and Mathematics.
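The chunk, embed, and search flow described above can be illustrated with a small standard-library sketch. This is not ArXivKB's code: `toy_embed` is a hypothetical bag-of-words stand-in for an Ollama embedding model, the brute-force loop in `search` stands in for a FAISS index, and the chunk sizes are arbitrary assumptions.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    words = text.split()
    if not words:
        return []
    step = size - overlap  # assumes size > overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(papers: dict[str, str]) -> list[tuple[str, str, Counter]]:
    """Chunk every paper and store (paper_id, chunk, vector) rows."""
    return [
        (paper_id, chunk, toy_embed(chunk))
        for paper_id, text in papers.items()
        for chunk in chunk_text(text)
    ]


def search(index, query: str, k: int = 3):
    """Return the k chunks most similar to the query, best first."""
    q = toy_embed(query)
    scored = [(cosine(q, vec), paper_id, chunk) for paper_id, chunk, vec in index]
    return sorted(scored, key=lambda row: row[0], reverse=True)[:k]
```

A real embedding model places paraphrases close together even when they share no words, which this word-count stand-in cannot do; the sketch only shows how chunks, vectors, and nearest-neighbour ranking fit together.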

Installation

Installation requires macOS or Linux, Python 3.10+, and a working Ollama instance. Either clone the source from openclaw/skills or run the CLI command clawhub install openclaw/skills/skills/camopel/arxivkb. Then run python3 scripts/install.py, which sets up a virtual environment, installs dependencies such as faiss-cpu and pdfplumber, and initializes the local SQLite and FAISS data stores. Before your first ingestion, make sure the Ollama server is running (start it with ollama serve).
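Before running the installer, it can help to confirm the two prerequisites named above. The following is a minimal pre-flight sketch, not part of the skill itself; the function names are hypothetical, and it assumes Ollama is listening on its default local address, http://localhost:11434.

```python
import sys
import urllib.request


def python_ok(version=sys.version_info, minimum=(3, 10)) -> bool:
    """True if the interpreter version meets the stated Python 3.10+ minimum."""
    return tuple(version[:2]) >= minimum


def ollama_reachable(base_url: str = "http://localhost:11434",
                     timeout: float = 2.0) -> bool:
    """True if an HTTP GET to the (assumed) Ollama address succeeds."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or malformed URL.
        return False
```

Calling `python_ok()` and `ollama_reachable()` before `python3 scripts/install.py` gives a quick yes/no on each requirement. If your Ollama instance listens on a different host or port, pass that address as `base_url`.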

Use Cases

This skill is ideal for researchers tracking niche subfields where staying current is vital. Use it to build a local library of papers for specific topics (e.g., Computer Vision or Robotics), perform semantic cross-referencing between new uploads and your existing repository, and maintain a lightweight offline archive without worrying about rate-limited cloud APIs or data privacy issues.

Example Prompts

  1. "Check the current stats of my local arXiv repository and then ingest any new papers from the last 7 days in the cs.LG category."
  2. "Search through my downloaded papers for any recent content discussing transformer architecture improvements and summarize the top 3 matches."
  3. "List all my currently tracked categories and delete the one for quant finance, then clean up any papers older than 60 days to save disk space."

Tips & Limitations

To save bandwidth, use the --no-pdf flag during bulk ingestion when abstract-level search is enough. Because all embedding runs locally through Ollama, ingesting a large number of papers can be CPU- or GPU-intensive. Run the akb expire command regularly so that your local ~/workspace/arxivkb directory does not grow without bound; the vector indices and stored PDFs add up over time.
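To see how much space old files are taking before you expire anything, a read-only audit along these lines may be useful. It is only a sketch of age-based selection using file modification times; it is not how akb expire is implemented, the function names are hypothetical, and it deletes nothing.

```python
from __future__ import annotations

import time
from pathlib import Path
from typing import Optional


def find_stale_files(root: Path, max_age_days: float,
                     now: Optional[float] = None) -> list[Path]:
    """List files under root whose modification time is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def total_size(paths: list[Path]) -> int:
    """Sum the sizes, in bytes, of the given files."""
    return sum(p.stat().st_size for p in paths)
```

For example, `find_stale_files(Path.home() / "workspace" / "arxivkb", 60)` lists files untouched for 60 days, and `total_size` on that list estimates how much space a cleanup could reclaim.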

Metadata

Author: @camopel
Stars: 4072
Views: 1
Updated: 2026-04-13

Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-camopel-arxivkb": {
      "enabled": true,
      "auto_update": true
    }
  }
}
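If your clawhub.json already lists other plugins, pasting the snippet by hand risks overwriting them. One way to merge the entry instead is sketched below; the helper names are hypothetical, and the sketch assumes only the file layout shown above (a top-level "plugins" object keyed by plugin name).

```python
import json
from pathlib import Path


def enable_plugin(config: dict, name: str, auto_update: bool = True) -> dict:
    """Return a copy of config with the named plugin enabled, keeping other entries."""
    merged = dict(config)
    plugins = dict(merged.get("plugins", {}))
    plugins[name] = {"enabled": True, "auto_update": auto_update}
    merged["plugins"] = plugins
    return merged


def update_config_file(path: Path, name: str) -> None:
    """Read the JSON config (or start empty), add the plugin, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(enable_plugin(config, name), indent=2) + "\n")
```

Calling `update_config_file(Path("clawhub.json"), "official-camopel-arxivkb")` would add the entry shown above while leaving any existing plugin settings in place.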

Tags (AI)

#arxiv #research #local-llm #knowledge-base #vector-db
Safety Score: 4/5

Flags: network-access, file-write, file-read

Related Skills

ddgs-search

Free multi-engine web search via ddgs CLI (DuckDuckGo, Google, Bing, Brave, Yandex, Yahoo, Wikipedia) + arXiv API search. No API keys required. Use when user needs web search, research paper discovery, or when other skills need a search backend. Drop-in replacement for web-search-plus.

camopel 4072

finviz-crawler

Continuous financial news crawler for finviz.com with SQLite storage, article extraction, and query tool. Use when monitoring financial markets, building news digests, or needing a local financial news database. Runs as a background daemon or systemd service.

camopel 4072

privateapp

Personal PWA dashboard server with plugin apps. Use when: (1) installing or setting up PrivateApp, (2) starting/stopping/restarting the service, (3) building frontends after changes, (4) adding new app plugins, (5) configuring push notifications. Requires Python 3.9+, Node.js 18+. Runs as systemd user service or launchd plist.

camopel 4072

storage-cleanup

One-command disk cleanup for macOS and Linux — trash, caches, temp files, old kernels, snap revisions, Homebrew, Docker, and Xcode artifacts. Use when user asks to free storage, clean up disk, reclaim space, reduce disk usage, or encounters low disk / "disk full" warnings. Safe by default with dry-run mode. No dependencies beyond bash and awk.

camopel 4072

claw-guard

System-level watchdog for OpenClaw gateway restarts and sub-agent task PIDs. Monitors registered PIDs and optional log/directory freshness. Auto-reverts config on failed gateway restarts. Requires explicit registration — does NOT auto-discover. Use when running long background tasks or before gateway restarts.

camopel 4072