ollama-memory-embeddings
Configure OpenClaw memory search to use Ollama as the embeddings server (OpenAI-compatible /v1/embeddings) instead of the built-in node-llama-cpp local GGUF loading. Includes interactive model selection and optional import of an existing local embedding GGUF into Ollama.
Why use this skill?
Optimize OpenClaw memory search by using Ollama for embedding vectors. Includes auto-configuration, model imports, and drift-prevention watchdogs for stable RAG performance.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/vidarbrekke/ollama-memory-embeddings
What This Skill Does
The ollama-memory-embeddings skill is a configuration bridge between OpenClaw and Ollama. By default, OpenClaw indexes memory with a built-in node-llama-cpp GGUF loader; this skill shifts that work to Ollama's OpenAI-compatible /v1/embeddings endpoint. It handles the complete lifecycle: verifying Ollama availability, interactive model selection, registry importing, and targeted configuration injection. Key features include automatic model-name normalization, drift prevention via an included watchdog script, and migration of existing local GGUF embedding models into the Ollama ecosystem. The skill targets only the memory-search subsystem, so your chat completions remain independent of your memory embedding provider. It supports both interactive setup and non-interactive scripted deployment for large-scale or automated environment management.
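Once configured, OpenClaw's memory indexer talks to the same OpenAI-compatible endpoint you can exercise by hand. A minimal smoke test, assuming Ollama is listening on its default port (11434) and the nomic-embed-text model has been pulled:

# Request one embedding vector from Ollama's OpenAI-compatible endpoint
curl http://localhost:11434/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "nomic-embed-text", "input": "memory search smoke test"}'

A JSON response containing an embedding array confirms that the endpoint OpenClaw will use is reachable.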
Installation
To install, ensure Ollama is running on your local machine, then run the skill's installer: bash ~/.openclaw/skills/ollama-memory-embeddings/install.sh. For advanced users, non-interactive flags allow full automation: pass --non-interactive together with --model and --reindex-memory to provision your environment in one step. If you manage multiple agents, add the --install-watchdog option so configuration drift is auto-healed at your specified interval, keeping your embedding pipeline stable and performant.
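A scripted deployment might look like the sketch below. The flags are those named above; the model choice is just an example, and the flag (if any) controlling the watchdog interval is not documented here, so consult the skill's own documentation before relying on one.

# Non-interactive provisioning; --install-watchdog enables drift auto-healing
bash ~/.openclaw/skills/ollama-memory-embeddings/install.sh \
  --non-interactive \
  --model nomic-embed-text \
  --reindex-memory \
  --install-watchdog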
Use Cases
- High-Performance Memory: If you notice latency in memory retrieval when using default GGUF loading, switching to Ollama’s optimized backend often yields faster vector generation.
- Unified Infrastructure: Centralize all AI workloads—both chat and memory indexing—within a single Ollama instance rather than splitting them between node-llama-cpp and Ollama.
- Environment Parity: Use the non-interactive install flags to pin every development environment in your team to the exact same embedding model (e.g., 'nomic-embed-text'), preventing retrieval discrepancies during retrieval-augmented generation (RAG) tasks. A quick parity check follows this list.
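To verify parity across machines, check that the pinned model is actually present in each environment's Ollama instance. A minimal check, assuming the nomic-embed-text example above:

# Fails loudly if the pinned embedding model is absent from this host
ollama list | grep nomic-embed-text || echo "embedding model missing: run the installer"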
Example Prompts
- "OpenClaw, update my memory search provider to Ollama using the nomic-embed-text model and reindex all current documents."
- "Run the memory embeddings drift check and ensure the config matches my Ollama setup."
- "Switch my memory indexing to the highest quality model available in Ollama and restart the gateway."
Tips & Limitations
- Embeddings Only: This skill specifically modifies memory search. Your chat LLM model remains configured by your primary agent settings; changing the embedding provider will not change your chat model.
- Watchdog Overhead: Enabling the auto-healing watchdog adds a minimal background process. Keep your interval above 60 seconds to maintain optimal system performance.
- Import Warning: While auto-importing GGUFs is convenient, ensure your existing GGUFs are compatible with Ollama’s versioning before running the import utility, to avoid initialization failures. A manual import sketch follows this list.
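If you prefer to test-import a GGUF by hand before letting the skill reconfigure OpenClaw, Ollama's standard Modelfile flow works. The file path and model name below are hypothetical; point FROM at your actual GGUF:

# Hypothetical path and name; adjust to your local embedding GGUF
cat > Modelfile <<'EOF'
FROM ./nomic-embed-text-v1.5.Q8_0.gguf
EOF
ollama create nomic-embed-text-local -f Modelfile

If ollama create fails here, the GGUF is likely incompatible with your Ollama version, which is exactly the failure mode the warning above describes.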
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-vidarbrekke-ollama-memory-embeddings": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags: AI
Flags: file-write, file-read, external-api, code-execution