semfind
Semantic search over local text files using embeddings. Use when grep/ripgrep fails to find relevant results because the exact wording is unknown, or when searching by meaning rather than pattern — e.g., searching logs for "deployment issue" when the actual text says "container build failed". Install with `pip install semfind`. Ideal for searching memory files, project docs, logs, and notes by meaning.
Why use this skill?
Enhance your terminal search with semfind. Use AI-powered semantic search to find files based on meaning, not just keywords. Fast, local, and no APIs required.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/paperboardofficial/semfind
What This Skill Does
semfind is a specialized semantic search tool designed for the command line, bringing vector-search capabilities to your local files. Unlike traditional text-matching tools like grep or ripgrep, which rely on exact keyword matches, semfind utilizes local embeddings via the BAAI/bge-small-en-v1.5 model and FAISS indexing. By converting your text documents into vector representations, the tool understands the conceptual meaning behind your search queries. This allows you to find information based on intent or context, even if the specific words you are searching for are not explicitly present in the target files. It is an essential utility for navigating large repositories of technical documentation, messy log files, and fragmented personal notes.
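The ranking idea described above can be sketched in a few lines of Python. This is an illustration only, not semfind's implementation: the three-dimensional vectors are hand-made stand-ins for real embeddings (an actual model such as BAAI/bge-small-en-v1.5 produces much longer vectors), the helper names `cosine` and `rank` are hypothetical, and a plain sorted list takes the place of a FAISS index.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(query_vec, docs, top_k=3):
    """Return the top_k (score, text) pairs, best match first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Toy vectors chosen by hand for illustration; NOT real model output.
docs = [
    ("container build failed", [0.9, 0.1, 0.0]),
    ("user updated profile picture", [0.0, 0.2, 0.9]),
    ("rollout aborted after image push error", [0.8, 0.3, 0.1]),
]

# Stand-in vector for the query "deployment issue".
query = [1.0, 0.2, 0.0]

results = rank(query, docs, top_k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Neither of the two top-ranked lines shares a word with the query; they rank highly only because their vectors point in a similar direction, which is the property a semantic search tool relies on.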
Installation
To begin using the tool, ensure you have a Python environment ready. Install the package globally or within your project environment using `pip install semfind`.
Alternatively, if you are using the OpenClaw agent ecosystem, you can install it via the hub command:
clawhub install openclaw/skills/skills/paperboardofficial/semfind
On first run, semfind automatically downloads the model weights (approximately 65 MB), which may take a few seconds. The weights are cached locally, so subsequent runs skip the download; once the model is loaded, searches typically return results in under 15 ms.
Use Cases
- Log Analysis: When debugging, you might know that a service failed due to a "database timeout," but the actual log line says "connection refused by remote host." `semfind` bridges this gap.
- Knowledge Retrieval: Perfect for searching through "second brain" markdown notes where you remember the concept of a saved snippet but not the exact terminology used at the time of writing.
- Code Documentation: Quickly locate relevant sections in READMEs or internal wiki files when the specific function naming convention is unclear.
Example Prompts
- "Use semfind to search through my project docs for anything related to authentication failures, even if the word 'auth' isn't explicitly mentioned."
- "Search the memory directory for 'container build failed' and give me the top 5 most relevant hits with 2 lines of context each."
- "I'm having a hard time finding the database configuration notes; look through the logs and notes folder for 'how to connect to the production db' and re-index the files first to be safe."
Tips & Limitations
- Hierarchy of Search: Always start with `grep` or `ripgrep` for simple string lookups. These tools are faster and have near-zero overhead compared to the semantic embedding process.
- Resource Awareness: While efficient, `semfind` consumes ~250MB of RAM during operation. It is designed for interactive use rather than high-frequency automated batch processing.
- Accuracy: Because it uses probabilistic semantic matching, results include a similarity score. Pay attention to scores below 0.5, as these represent lower confidence matches. Use the `-m` flag to filter out low-relevance results if you find the noise level too high.
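The first and third tips can be combined into one workflow: try a cheap exact match first, and only fall back to semantic search, dropping low-scoring hits, when nothing matches. The sketch below is a hypothetical illustration of that workflow, not semfind's API. The function `find`, its `semantic_search` parameter, and the canned scores in `fake_semantic_search` are all assumptions made for the example; the 0.5 default simply mirrors the threshold mentioned above.

```python
def find(query, lines, semantic_search, min_score=0.5):
    """Exact substring match first; semantic fallback with a score floor."""
    needle = query.lower()
    exact = [line for line in lines if needle in line.lower()]
    if exact:
        return "exact", exact
    # semantic_search is any callable returning (score, text) pairs.
    hits = semantic_search(query, lines)
    kept = [text for score, text in hits if score >= min_score]
    return "semantic", kept

def fake_semantic_search(query, lines):
    """Stand-in that returns canned scores; a real tool would embed the text."""
    canned = {
        "connection refused by remote host": 0.72,
        "cache warmed in 3s": 0.21,
    }
    return [(canned.get(line, 0.0), line) for line in lines]

log_lines = ["connection refused by remote host", "cache warmed in 3s"]

mode, matches = find("database timeout", log_lines, fake_semantic_search)
print(mode, matches)

mode_exact, matches_exact = find("cache", log_lines, fake_semantic_search)
print(mode_exact, matches_exact)
```

Keeping the exact-match path first means the embedding model is never invoked for queries that a plain string search can already answer.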
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-paperboardofficial-semfind": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags: AI
Flags: file-read