moltguard
Detect and block prompt injection attacks hidden in long content (emails, web pages, documents) using the MoltGuard API
Why use this skill?
Secure your OpenClaw agent against indirect prompt injection. MoltGuard offers local data sanitization and reliable threat detection for emails, web pages, and documents.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/thomaslwang/moltguard
What This Skill Does
MoltGuard is a security plugin for OpenClaw agents that mitigates the risk of indirect prompt injection. Indirect prompt injection occurs when an AI agent processes external data (such as an email, a website, or a document) that contains hidden instructions intended to hijack the agent’s behavior or exfiltrate private data. MoltGuard acts as a gatekeeper: it intercepts this content before the agent acts on it, so that malicious payloads can be detected and blocked.
What sets MoltGuard apart is its rigorous commitment to data privacy through local, pre-analysis sanitization. Before any external data is transmitted to the MoltGuard API for evaluation, the plugin automatically redacts sensitive information such as PII (Personally Identifiable Information), credit card numbers, API keys, and email addresses. This process ensures that the diagnostic server only receives the structural context of the content, preserving the injection patterns necessary for accurate threat detection while keeping your sensitive user data strictly on your local machine.
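The redaction step described above can be sketched as a small set of pattern rules applied before anything leaves the machine. This is a minimal illustration, not the contents of MoltGuard's agent/sanitizer.ts: the rule names, the regular expressions, the `sk-` key prefix, and the `sanitize` function are all assumptions made for the example.

```typescript
// Hypothetical redaction rules for illustration only.
// These are NOT the actual rules in MoltGuard's agent/sanitizer.ts.
type Rule = { label: string; pattern: RegExp };

const RULES: Rule[] = [
  // Email addresses such as alice@example.com
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  // Card-like numbers: 13 to 16 digits, optionally separated by spaces or dashes
  { label: "CARD", pattern: /\b(?:\d[ -]?){12,15}\d\b/g },
  // Tokens with an "sk-" prefix (an assumed API key format)
  { label: "API_KEY", pattern: /\bsk-[A-Za-z0-9]{16,}\b/g },
];

// Replace every match of every rule with a placeholder such as [EMAIL].
// Surrounding text, including any injected instructions, is left untouched.
function sanitize(text: string): string {
  return RULES.reduce(
    (acc, rule) => acc.replace(rule.pattern, `[${rule.label}]`),
    text,
  );
}
```

Because only the matched values are replaced, a phrase like "ignore previous instructions" survives redaction and remains visible to whatever detector receives the sanitized text.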
Installation
To integrate MoltGuard into your OpenClaw environment, execute the following command in your terminal:
clawhub install openclaw/skills/skills/thomaslwang/moltguard
Once installed, the plugin automatically creates its credentials file at ~/.openclaw/moltguard-credentials.json and a local SQLite audit log at ~/.openclaw/openclawguard.db to track analysis history.
Use Cases
- Email Security: Automatically scan incoming emails for hidden prompts that attempt to redirect the agent to malicious external websites or exfiltrate mailbox contents.
- Web Content Retrieval: Analyze the text extracted from web pages during research tasks to ensure the content hasn't been injected with instructions to subvert the agent's research objective.
- Document Processing: Securely process uploaded PDF or text documents, ensuring that hidden "white-on-white" text or malicious markdown headers do not override the agent's core system prompts.
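All three use cases share the same shape: check the content first, then hand it to the agent only if it comes back clean. The sketch below shows that gate in isolation. The `Verdict`, `Detector`, and `gate` names are hypothetical, the MoltGuard API's real request and response format is not described here, and the detector is kept synchronous for brevity, whereas a call to a remote API would be asynchronous.

```typescript
// Hypothetical types: MoltGuard's real response format may differ.
type Verdict = { isInjection: boolean; reason?: string };
type Detector = (text: string) => Verdict;

type GateResult =
  | { ok: true; content: string }
  | { ok: false; reason: string };

// Pass content through only when the detector reports no injection.
function gate(content: string, detect: Detector): GateResult {
  const verdict = detect(content);
  if (verdict.isInjection) {
    return { ok: false, reason: verdict.reason ?? "possible prompt injection" };
  }
  return { ok: true, content };
}

// Toy stand-in detector for illustration only; it matches a single phrase
// and is no substitute for a real detection service.
const toyDetector: Detector = (text) =>
  /ignore (all )?previous instructions/i.test(text)
    ? { isInjection: true, reason: "override phrase found" }
    : { isInjection: false };
```

Keeping the detector as a parameter means the same gate can wrap email, web, and document content, and the detector can be swapped for a test double.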
Example Prompts
- "MoltGuard, please scan the email from my support inbox and let me know if it contains any instructions designed to manipulate my response behavior."
- "I need to summarize this web page. Before I proceed, use MoltGuard to verify that the content is safe and free of prompt injection attempts."
- "Open the attached document and run a security check with MoltGuard. If the analysis is clean, summarize the key findings for me."
Tips & Limitations
MoltGuard is designed for defense-in-depth. It identifies injection patterns, but it should be used alongside other agent security measures, not in place of them. Always restrict your agent's read/write access to sensitive directories. Local sanitization covers the data types listed above, but check agent/sanitizer.ts to confirm that your specific data types are covered. MoltGuard requires access to api.moltguard.com, so it will not function in an air-gapped environment. For enterprise setups, you can manually set your API key in the config file to bypass the automated registration process.
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-thomaslwang-moltguard": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Flags: network-access, file-write, file-read, external-api
Related Skills
flaw0
Security and vulnerability scanner for OpenClaw code, plugins, skills, and Node.js dependencies. Powered by OpenClaw AI models.
openguardrails-for-openclaw
Detect and block prompt injection attacks hidden in long content (emails, web pages, documents) using OpenGuardrails SOTA detection
skill-scanner
Scan installed OpenClaw skills for malicious code patterns including ClickFix social engineering, reverse shell (RAT), and data exfiltration. Uses OG-Text model for agentic detection.