agentic-paper-digest-skill
Fetches and summarizes recent arXiv and Hugging Face papers with Agentic Paper Digest. Use when the user wants a paper digest, a JSON feed of recent papers, or to run the arXiv/HF pipeline.
Why use this skill?
Fetch, filter, and summarize the latest research papers from arXiv and Hugging Face automatically using this AI-powered paper digest skill for OpenClaw.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/matanle51/agentic-paper-digest
What This Skill Does
The agentic-paper-digest-skill is a utility for researchers, developers, and AI enthusiasts who need to keep up with machine learning and computer science research. It automates the retrieval and processing of recent papers from two sources: arXiv and Hugging Face. Instead of scanning feeds by hand, you get an AI-powered pipeline that filters, analyzes, and summarizes papers according to your stated interests.
At its core, the skill uses LLMs for relevance scoring and summarization, so that you receive summaries only of papers that score as relevant to your interests. It offers a command-line interface for ad-hoc requests and a background API mode that supports periodic polling and integration with other agentic tools. Data is persisted in a local SQLite database, which allows historical queries and longitudinal tracking of research trends.
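To make the persistence step concrete, here is a minimal sketch of how scored papers could be stored in and queried from SQLite. The table name, column names, and the `open_store`, `save_paper`, and `recent_relevant` helpers are all hypothetical; the skill's actual schema is not documented here and may differ.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a SQLite store. The schema below is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS papers (
               paper_id   TEXT PRIMARY KEY,
               source     TEXT NOT NULL,
               title      TEXT NOT NULL,
               relevance  REAL NOT NULL,
               summary    TEXT,
               fetched_at TEXT NOT NULL
           )"""
    )
    return conn


def save_paper(conn, paper):
    """Insert or overwrite one paper so repeated runs stay idempotent."""
    conn.execute(
        "INSERT OR REPLACE INTO papers "
        "VALUES (:paper_id, :source, :title, :relevance, :summary, :fetched_at)",
        paper,
    )
    conn.commit()


def recent_relevant(conn, since, min_relevance=0.5, limit=5):
    """Return (title, relevance) rows fetched at or after `since` (ISO 8601 text)."""
    rows = conn.execute(
        "SELECT title, relevance FROM papers "
        "WHERE fetched_at >= ? AND relevance >= ? "
        "ORDER BY relevance DESC LIMIT ?",
        (since, min_relevance, limit),
    )
    return rows.fetchall()
```

Storing timestamps as ISO 8601 text keeps string comparison consistent with chronological order, which is what makes the `fetched_at >= ?` filter work for historical queries.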
Installation
To get started, make sure Python 3 and git are available on your machine. The easiest method is to run the provided bootstrap script:
bash "{baseDir}/scripts/bootstrap.sh"
If you prefer to keep the repository in a custom directory, set the PROJECT_DIR variable before running the bootstrap script. The skill requires an LLM provider: configure either OPENAI_API_KEY or an OpenAI-compatible proxy via LITELLM_API_BASE. Once that is set, you can run the CLI tool directly or start the API service to maintain a persistent data store.
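A pre-flight check along the following lines can catch a missing provider setting before the pipeline starts. This is only a sketch: `resolve_llm_config` is a hypothetical helper, and the precedence it applies (proxy first, then the OpenAI key) is an assumption rather than the skill's documented behavior.

```python
import os


def resolve_llm_config(env=None):
    """Pick an LLM provider from environment variables.

    Assumption: LITELLM_API_BASE takes precedence over OPENAI_API_KEY.
    The skill itself may resolve these differently.
    """
    env = os.environ if env is None else env
    api_base = env.get("LITELLM_API_BASE", "").strip()
    api_key = env.get("OPENAI_API_KEY", "").strip()

    if api_base:
        return {"provider": "openai-compatible", "api_base": api_base}
    if api_key:
        return {"provider": "openai", "api_base": None}
    raise RuntimeError(
        "Set OPENAI_API_KEY or LITELLM_API_BASE before running the digest."
    )
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.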
Use Cases
- Curated Research Feeds: Automatically generate a daily or hourly newsletter of relevant papers in your specific fields (e.g., LLMs, Computer Vision, or Robotics).
- Downstream Data Integration: Export JSON digests of recent publications to feed into automated documentation systems or Slack/Discord notification bots.
- Agentic Research Assistant: Use the API endpoints to provide an "agent" with a memory of recent research, allowing you to ask questions about current industry trends.
- PDF Insights: Enable the PDF text extraction feature to get granular insights from the first pages of research papers.
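For the downstream integration use case, a JSON digest could be turned into notification text roughly as follows. The field names (`title`, `source`, `url`, `relevance`) and the `format_digest` helper are hypothetical; check the skill's actual JSON output before relying on them.

```python
import json


def format_digest(raw_json, max_items=5):
    """Render the top papers from a JSON digest as plain-text bullet lines.

    Assumes the digest is a JSON array of objects with hypothetical
    'title', 'source', 'url', and optional 'relevance' fields.
    """
    papers = json.loads(raw_json)
    # Highest relevance first; papers without a score sort last.
    ranked = sorted(papers, key=lambda p: p.get("relevance", 0.0), reverse=True)
    lines = [
        f"- {paper['title']} ({paper['source']}): {paper['url']}"
        for paper in ranked[:max_items]
    ]
    return "\n".join(lines)
```

The resulting string can be posted as the message body by whatever Slack or Discord client you already use.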
Example Prompts
- "Run the paper digest for the last 24 hours focusing on arXiv categories cs.LG and cs.AI."
- "Check the current status of the paper digest API and show me the last 5 relevant papers ingested."
- "Fetch recent research from Hugging Face and output the results as JSON for my report."
Tips & Limitations
- Performance: When enabling ENABLE_PDF_TEXT, be aware that PyMuPDF is required, and processing speed will be slower due to additional file I/O and PDF parsing overhead.
- API Usage: Monitor your LLM API tokens, especially if you set MAX_CANDIDATES_PER_SOURCE too high, as this triggers multiple LLM calls per paper.
- Configuration: Always use a .env file within the PROJECT_DIR to keep your API keys secure and organized.
- Network: Ensure the host machine has outbound access to arXiv and Hugging Face APIs; otherwise, the fetch step will fail.
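To get a feel for how MAX_CANDIDATES_PER_SOURCE drives LLM usage, a rough upper bound can be computed as below. The default of two calls per paper (one for relevance scoring, one for summarization) is an assumption made for illustration, and `estimate_llm_calls` is a hypothetical helper, not part of the skill.

```python
def estimate_llm_calls(max_candidates_per_source, num_sources=2, calls_per_paper=2):
    """Upper bound on LLM calls for one digest run.

    Assumptions: two sources (arXiv and Hugging Face) and two calls per
    paper. The real count is likely lower if only relevant papers are
    summarized.
    """
    if min(max_candidates_per_source, num_sources, calls_per_paper) < 0:
        raise ValueError("All arguments must be non-negative.")
    return max_candidates_per_source * num_sources * calls_per_paper
```

Under these assumptions, raising the limit from 50 to 200 candidates per source moves the bound from 200 to 800 calls per run.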
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-matanle51-agentic-paper-digest": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags: AI
Flags: network-access, file-write, file-read, external-api