ClawKit Logo
ClawKitReliability Toolkit
Back to Registry
Official Verified

literature-manager

Search, download, convert, organize, and audit academic literature collections. Use when asked to find papers, build a literature library, add papers to references, download PDFs, convert papers to markdown, organize references by category, audit a reference collection, or collect code/dataset links for tools mentioned in papers.

skill-install — Terminal

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/isonaei/literature-manager
Or

Literature Manager

Manage academic literature collections: search → download → convert → organize → verify.

Dependencies

  • pdftotext (poppler-utils) — PDF text extraction
  • curl — downloading
  • python3 — JSON processing in audit
  • file (coreutils) — PDF validation
  • uvx markitdown[pdf] (optional) — fallback PDF→MD converter (note: plain uvx markitdown does NOT work for PDFs — must use uvx markitdown[pdf])

Quick Start

# Download a single paper by DOI
bash scripts/download.sh "10.1038/s41592-024-02200-1" output_dir/

# Convert PDF to markdown
bash scripts/convert.sh paper.pdf output.md

# Verify a single PDF+MD pair
bash scripts/verify.sh paper.pdf paper.md

# Full audit of a references/ folder
bash scripts/audit.sh /path/to/references/

Workflow

1. Search

Use web_fetch on Google Scholar:

https://scholar.google.com/scholar?q=QUERY&as_ylo=YEAR

Extract: title, authors, year, journal, DOI, PDF links.

For each result, identify the best open-access PDF source (see Download Strategy).

2. Download

Run scripts/download.sh <DOI_or_URL> <output_dir/> per paper. The script tries sources in order:

  1. Direct publisher PDF (Nature, eLife, Frontiers, PNAS, bioRxiv, arXiv)
  2. EuropePMC (PMC_ID → PDF)
  3. bioRxiv/arXiv preprint
  4. Sci-Hubhttps://sci-hub.box/<DOI> (use when publisher is paywalled)
# Sci-Hub download example:
curl -L "https://sci-hub.box/10.1038/nature12345" -o paper.pdf

⚠️ Legal note: Sci-Hub may violate publisher terms of service or copyright law in some jurisdictions. Use only if you understand and accept the legal implications in your context.

If all sources fail (including Sci-Hub), flag as permanent paywall. Provide the user with the DOI and ask for manual download.

3. Convert

Run scripts/convert.sh <input.pdf> <output.md>. Uses pdftotext (reliable) with uvx markitdown[pdf] as fallback.

# Correct markitdown command for PDFs:
uvx markitdown[pdf] input.pdf > output.md

# ⚠️ The following will NOT work for PDFs (missing [pdf] extra):
# uvx markitdown input.pdf

Prefer uvx markitdown[pdf] over pdftotext when full fidelity (tables, figures captions) matters.

4. Organize

Standard folder structure:

references/
├── README.md              # Human index (summaries per category)
├── index.json             # Machine index (structured metadata)
├── RESOURCES.md           # Code repos + datasets
├── resources.json         # Structured version
├── <category-1>/
│   ├── papers/            # PDFs
│   └── markdown/          # Converted text
└── <category-N>/
    ├── papers/
    └── markdown/

Categories are user-defined. Number-prefix for sort order (e.g., 01-theoretical-frameworks/).

Metadata

Author@isonaei
Stars2190
Views0
Updated2026-03-07
View Author Profile
AI Skill Finder

Not sure this is the right skill?

Describe what you want to build — we'll match you to the best skill from 16,000+ options.

Find the right skill
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-isonaei-literature-manager": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Safety NoteClawKit audits metadata but not runtime behavior. Use with caution.