ClawKit Reliability Toolkit
Official · Verified Developer Tools · Safety 5/5

rag-engineer

Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications. Use when: building RAG, vector search, embeddings, semantic search, document retrieval.

Why use this skill?

Design and optimize Retrieval-Augmented Generation systems with expert guidance. Improve your LLM's accuracy through professional chunking, hybrid search, and retrieval strategies.


Install via CLI (Recommended)

clawhub install openclaw/skills/skills/mupengi-bot/rag-engineer

What This Skill Does

The RAG Engineer skill serves as the architectural backbone for Retrieval-Augmented Generation (RAG) systems. It turns raw, unstructured data into a high-fidelity knowledge retrieval system, ensuring that your LLM applications provide accurate, context-aware, and reliable responses. By focusing on the critical phases of the RAG pipeline (chunking, embedding, vector database management, and retrieval optimization), this skill mitigates the common pitfalls of hallucination and poor grounding in source material. It acts as a specialized consultant, guiding you through the complex trade-offs between performance, cost, and accuracy in your search infrastructure.
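The pipeline phases above can be sketched end-to-end in a few lines. This is a toy illustration, not the skill's actual implementation: `embed()` is a hashed bag-of-words stand-in for a real embedding model, and the "index" is a plain Python list rather than a vector database.

```python
# Toy RAG retrieval pipeline: chunk -> embed -> index -> retrieve.
# embed() is a hashed bag-of-words stand-in for a real embedding model.
import hashlib
import math

def chunk(text, max_words=40):
    """Split text into fixed-size word chunks (the naive baseline strategy)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text, dim=64):
    """Toy embedding: hash each token into a bucket of a fixed-size vector."""
    vec = [0.0] * dim
    for tok in text.lower().split():
        h = int(hashlib.md5(tok.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(query, index, top_k=2):
    """Rank stored chunks by cosine similarity to the query embedding."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, v)), c) for c, v in index]
    return [c for _, c in sorted(scored, reverse=True)[:top_k]]

docs = ["Vector databases store embeddings for semantic search.",
        "Chunking splits documents into retrievable passages."]
index = [(c, embed(c)) for d in docs for c in chunk(d)]
print(retrieve("store embeddings for semantic search", index))
```

In a production system each of these stages is where the skill's advice applies: the chunker becomes semantic-aware, the embedding comes from a trained model, and the list is replaced by a vector database index.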

Installation

To install this skill, run the following command in your terminal: clawhub install openclaw/skills/skills/mupengi-bot/rag-engineer

Use Cases

  • Building enterprise-grade knowledge bases for customer support chatbots.
  • Designing document retrieval systems for legal or medical research assistants.
  • Implementing semantic search functionality for internal corporate wikis.
  • Optimizing existing RAG pipelines that suffer from low accuracy or outdated information.
  • Creating hybrid search engines that combine keyword-based precision with vector-based semantic understanding.
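As a concrete illustration of the hybrid-search use case, one common fusion technique is reciprocal rank fusion (RRF), which merges a keyword result list and a vector result list without requiring their raw scores to be comparable. The document IDs below are made up.

```python
# Reciprocal rank fusion (RRF): merge ranked lists from keyword and
# vector search into one hybrid ranking. Document IDs are hypothetical.
def rrf(rankings, k=60):
    """Fuse several ranked ID lists; k damps the influence of top ranks."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc3", "doc1", "doc7"]   # e.g. from BM25
vector_hits = ["doc1", "doc5", "doc3"]    # e.g. from a vector index
print(rrf([keyword_hits, vector_hits]))   # → ['doc1', 'doc3', 'doc5', 'doc7']
```

Documents that appear near the top of both lists (like `doc1` and `doc3` here) outrank documents that appear in only one, which is exactly the precision/recall balance hybrid search aims for.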

Example Prompts

  1. "I am struggling with my RAG pipeline; documents are being split mid-sentence, causing retrieval errors. Can you help me implement a semantic chunking strategy that respects paragraph boundaries?"
  2. "Compare the pros and cons of using HNSW index vs. IVF-FLAT in my vector database for a collection of 500,000 technical manuals."
  3. "My system is getting high retrieval scores but the LLM isn't using the data correctly. Could you help me design a reranking strategy using Cross-Encoders to improve precision?"

Tips & Limitations

  • Quality over Quantity: Focus on clean data preprocessing. Garbage in leads to garbage out regardless of your embedding model.
  • Metadata is Key: Always use metadata filtering for your vector searches; pure semantic search often fails on ambiguous terminology.
  • Continuous Evaluation: Treat retrieval evaluation as a separate task from LLM output evaluation. Use tools like RAGAS to measure faithfulness and answer relevance.
  • Don't Over-embed: Avoid embedding everything. Strategic indexing of critical sections often yields better results than naive ingestion.
  • System Complexity: This skill provides architecture advice but requires integration with your existing vector database (like Pinecone, Weaviate, or Milvus).
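The metadata tip above can be illustrated with a pre-filter step: restrict candidates by metadata equality before any vector scoring. Field names like `dept` are hypothetical; real vector databases such as Pinecone, Weaviate, and Milvus expose equivalent filter clauses in their query APIs.

```python
# Metadata-first retrieval: filter candidates by metadata constraints
# before vector scoring, so ambiguous terms are disambiguated by hard
# filters. Record layout and field names are hypothetical.
def filtered_search(query_vec, records, where, top_k=3):
    """records: dicts with 'vector' and 'meta'; where: metadata equality filter."""
    candidates = [r for r in records
                  if all(r["meta"].get(k) == v for k, v in where.items())]
    return sorted(candidates,
                  key=lambda r: sum(a * b for a, b in zip(query_vec, r["vector"])),
                  reverse=True)[:top_k]

records = [
    {"id": 1, "vector": [1.0, 0.0], "meta": {"dept": "legal"}},
    {"id": 2, "vector": [0.9, 0.1], "meta": {"dept": "medical"}},
]
hits = filtered_search([1.0, 0.0], records, where={"dept": "legal"})
print([r["id"] for r in hits])  # only matching-department documents are scored
```

Note that record 2 is semantically very close to the query vector but is excluded outright, which is the point: the filter encodes a constraint that similarity alone cannot express.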

Metadata

Stars: 1,335
Views: 1
Updated: 2026-02-23
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-mupengi-bot-rag-engineer": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags (AI)

#rag #vector-search #embeddings #nlp #llm-architecture