paper-repro-python
This skill should be used when the user asks to "reproduce a paper", "implement paper methods in Python", "extract paper content to Markdown", or works on paper reproduction tasks. Use for TeX-first extraction, modular Python implementation, and bilingual documentation.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/celynnmoonlight/paper-repro-python
Follow this workflow end-to-end unless the user explicitly asks to skip steps.
1) Intake and scope
- Confirm input artifacts: TeX source path(s), PDF path, supplementary files, target repository, and expected outputs.
- State assumptions explicitly when information is missing.
- Keep approach adaptable to the specific paper; do not force a fixed dependency stack or rigid project template.
- Check whether the working folder already contains paper source files (.tex, .bib, style files, figures).
- Check whether the working folder contains user-preprocessed documents (.md, .json, images such as .png, .jpg, .svg).
- Source priority rule (read in order, stop when sufficient):
  - TeX sources (preferred): If usable TeX source files (.tex, .bib, style files) are present, use them as the primary source.
  - User-preprocessed documents (secondary): If TeX is absent or incomplete, read user-provided documents (.md, .json) and images (.png, .jpg, .svg) that may contain pre-extracted paper content.
  - PDF fallback (last resort): Only when both TeX and user-preprocessed documents are unavailable or insufficient, fall back to PDF extraction.
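The source priority rule above can be sketched as a small folder scan. This is a minimal illustration, not part of the skill itself: the function name `pick_source` and the constant names are hypothetical, and the presence of files is only a first approximation of "sufficient", which still needs a manual check.

```python
from pathlib import Path

# Extension groups taken from the source priority rule above.
# All names in this sketch are hypothetical.
TEX_EXTS = {".tex", ".bib"}
PREPROCESSED_EXTS = {".md", ".json", ".png", ".jpg", ".svg"}
PDF_EXTS = {".pdf"}


def pick_source(workdir):
    """Return (kind, files) for the highest-priority source found in workdir."""
    files = [p for p in Path(workdir).rglob("*") if p.is_file()]
    by_kind = [
        ("tex", TEX_EXTS),
        ("preprocessed", PREPROCESSED_EXTS),
        ("pdf", PDF_EXTS),
    ]
    for kind, exts in by_kind:
        matches = sorted(p for p in files if p.suffix.lower() in exts)
        # A .bib file alone is not a usable TeX source: require a .tex file.
        if kind == "tex" and not any(p.suffix.lower() == ".tex" for p in matches):
            continue
        if matches:
            return kind, matches
    return "none", []
```

Calling `pick_source(".")` in the working folder returns a label such as `"tex"` plus the matching files, which can then be reported back to the user along with any stated assumptions.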
2) Source extraction (TeX → preprocessed docs → PDF)
- TeX path (highest priority):
  - Parse and read the main TeX project structure first (main.tex or equivalent entry file and includes).
  - Preserve original scientific wording when converting relevant content to Markdown notes.
  - Resolve equations, theorem blocks, citations, and appendices from source files whenever possible.
  - Record unresolved include/bibliography issues explicitly; do not invent missing content.
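One way to read the project structure and record unresolved includes is sketched below. It assumes the project is compiled from the entry file's directory and only follows `\input{...}` and `\include{...}` in braced form; the function name `collect_tex_files` is hypothetical, and other inclusion mechanisms are out of scope for this sketch.

```python
import re
from pathlib import Path

# Matches \input{...} and \include{...}; \subfile, \import and the
# brace-less "\input file" form are deliberately not handled here.
INCLUDE_RE = re.compile(r"\\(?:input|include)\{([^}]+)\}")


def collect_tex_files(entry):
    """Walk includes from the entry file; return (files_in_order, unresolved)."""
    entry = Path(entry)
    root = entry.parent  # assumption: include paths are relative to this directory
    seen, unresolved = [], []

    def visit(path):
        if path in seen:
            return
        seen.append(path)
        # Strip comments so commented-out includes are ignored (\% is kept).
        text = re.sub(r"(?<!\\)%.*", "", path.read_text(encoding="utf-8"))
        for name in INCLUDE_RE.findall(text):
            name = name.strip()
            target = root / name
            if target.suffix != ".tex":
                target = target.with_name(target.name + ".tex")
            if target.is_file():
                visit(target)
            else:
                # Record the gap instead of guessing at missing content.
                unresolved.append(name)

    visit(entry)
    return seen, unresolved
```

The `unresolved` list is meant to be surfaced verbatim in the extraction notes, so that missing files are reported rather than silently skipped.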
- User-preprocessed documents path (secondary):
  - Read Markdown files (.md) that may contain paper content extracted by the user.
  - Read JSON files (.json) that may contain structured paper data (metadata, sections, references).
  - View images (.png, .jpg, .svg) that may contain paper figures, tables, or scanned pages.
  - Preserve original content; do not summarize or paraphrase.
  - Note the source of each piece of information (which file, which section).
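A minimal sketch of loading user-preprocessed documents while keeping track of where each piece came from follows. The function name `load_preprocessed` and the record layout are hypothetical. Only file-level provenance is recorded here; section-level provenance would need heading parsing on top of this.

```python
import json
from pathlib import Path


def load_preprocessed(workdir):
    """Return a list of records, each tagged with the file it came from."""
    records = []
    for path in sorted(Path(workdir).rglob("*")):
        suffix = path.suffix.lower()
        if suffix == ".md":
            # Keep the text verbatim: no summarizing or paraphrasing.
            content = path.read_text(encoding="utf-8")
            records.append({"source": str(path), "kind": "markdown", "content": content})
        elif suffix == ".json":
            try:
                data = json.loads(path.read_text(encoding="utf-8"))
            except json.JSONDecodeError as exc:
                # Report unreadable files rather than dropping them silently.
                records.append({"source": str(path), "kind": "json-error", "content": str(exc)})
                continue
            records.append({"source": str(path), "kind": "json", "content": data})
        elif suffix in {".png", ".jpg", ".svg"}:
            # Images are only listed here; viewing them is a separate step.
            records.append({"source": str(path), "kind": "image", "content": None})
    return records
```

Each record carries its `source` path, so any statement later copied into the notes can be traced back to the file it was read from.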
- PDF fallback path (lowest priority, when all else fails):
  - Extract paper content page by page into Markdown, preserving the original wording.
  - Do not summarize, paraphrase, or rewrite scientific statements.
  - Preserve structure faithfully:
    - Title, authors, affiliations, abstract, sections, subsections.
    - Equations (LaTeX-friendly when possible), theorem/lemma/proposition blocks.
    - Tables, figure captions, references, appendices, footnotes.
  - If a PDF is scanned or partially unreadable:
    - Run OCR and mark uncertain spans clearly.
    - Never silently invent missing text.
  - Include image references/placeholders when figures cannot be represented as plain text.
  - Produce one primary output file such as paper_fulltext.md.
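The page-by-page assembly step can be sketched as below. This is an illustration under stated assumptions: the per-page text is taken as already extracted by whichever PDF or OCR tool is in use (not shown), and the function name `assemble_fulltext`, the page-marker comment, and the `[UNCERTAIN]` marker are hypothetical conventions rather than anything the skill prescribes.

```python
def assemble_fulltext(pages, uncertain_marker="[UNCERTAIN]"):
    """Join per-page text into one Markdown string with page markers.

    `pages` is a list of strings, one per PDF page, in page order.
    """
    parts = []
    for number, text in enumerate(pages, start=1):
        body = text.strip()
        if not body:
            # Flag the gap explicitly instead of inventing content.
            body = f"{uncertain_marker} No text could be extracted from this page."
        parts.append(f"<!-- page {number} -->\n\n{body}")
    return "\n\n".join(parts) + "\n"
```

The returned string can be written to a file such as paper_fulltext.md; the page markers keep every passage traceable to its page without altering the extracted wording.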
3) Extraction quality checks
Add to Configuration
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-celynnmoonlight-paper-repro-python": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Safety Note: ClawKit audits metadata but not runtime behavior. Use with caution.