guard
Deep AI safety guardrails workflow—policy definition, input/output filtering, monitoring, escalation, and false-positive handling. Use when reducing harmful outputs, misuse, or policy violations in LLM products.
Why use this skill?
Implement robust AI safety guardrails with OpenClaw. Manage policy, threat modeling, and input/output filtering to ensure secure, compliant, and reliable LLM applications.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/clawkk/guard
What This Skill Does
The guard skill provides a rigorous, multi-stage framework for implementing AI safety and governance within LLM applications. It moves beyond simple keyword filtering by providing a structured six-stage pipeline that covers policy definition, threat modeling, control stack design, implementation, monitoring, and iteration. This skill is designed to translate abstract legal and product requirements into enforceable, reproducible technical behaviors, such as input/output filtering, automated refusals, and human-in-the-loop review. By using this skill, developers can effectively mitigate risks like jailbreak attempts, prompt injection, PII leakage, and non-compliant content generation.
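The input/output filtering, automated refusal, and human-review behaviors described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the guard skill's own implementation: the names `screen_input`, `screen_output`, `guarded_call`, the regular expressions, and the canned messages are all hypothetical, and a real deployment would derive its rules from the policy-definition stage.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Action(Enum):
    ALLOW = "allow"
    REFUSE = "refuse"   # automated refusal
    REVIEW = "review"   # route to a human reviewer


@dataclass
class Decision:
    action: Action
    reason: str


# Hypothetical rules for illustration only; real rules come from your policy.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
]
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

REFUSAL_MESSAGE = "Sorry, I can't help with that request."
REVIEW_MESSAGE = "This response is pending human review."


def screen_input(prompt: str) -> Decision:
    """Input filter: refuse prompts that match a known injection pattern."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(prompt):
            return Decision(Action.REFUSE, "possible prompt injection")
    return Decision(Action.ALLOW, "no input rule matched")


def screen_output(text: str) -> Decision:
    """Output filter: hold responses that appear to contain an email address."""
    if EMAIL_PATTERN.search(text):
        return Decision(Action.REVIEW, "possible PII (email address)")
    return Decision(Action.ALLOW, "no output rule matched")


def guarded_call(prompt: str, model: Callable[[str], str]) -> tuple[str, Decision]:
    """Wrap a model call with an input screen and an output screen."""
    decision = screen_input(prompt)
    if decision.action is not Action.ALLOW:
        return REFUSAL_MESSAGE, decision
    output = model(prompt)
    decision = screen_output(output)
    if decision.action is Action.REVIEW:
        return REVIEW_MESSAGE, decision
    return output, decision
```

Here `model` is any callable that maps a prompt string to a response string, so the wrapper can be exercised with a stub before it is wired to a real LLM client.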
Installation
To integrate this safety framework into your OpenClaw environment, execute the following command in your terminal:
clawhub install openclaw/skills/skills/clawkk/guard
Initialize your project repository with OpenClaw before installing, so that dependencies and versions are managed from the start.
Use Cases
- Consumer-Facing Chatbots: Implementing strict content moderation to prevent hate speech, sexual content, or harmful advice in public-facing interfaces.
- Enterprise Data Agents: Configuring defense-in-depth for internal bots, focusing specifically on data exfiltration, connector access restrictions, and PII masking.
- Regulated Industry Compliance: Automating required disclaimers and refusal logic for medical, financial, or legal advice bots that must adhere to strict regional compliance standards.
- Public LLM API Wrappers: Adding a safety layer to third-party model outputs to ensure that model hallucinations or policy violations are caught before reaching the end user.
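As a rough illustration of the PII masking mentioned for enterprise data agents, the sketch below replaces email addresses and US Social Security numbers with placeholders. The function name `mask_pii` and both patterns are assumptions made for this example; regular expressions alone miss many PII formats, so treat this as one layer rather than a complete control.

```python
import re

# Hypothetical patterns for illustration; production systems usually need
# broader coverage (names, phone numbers, account IDs, locale variants).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_pii(text: str) -> tuple[str, int]:
    """Return the masked text and the number of substitutions made."""
    masked, n_email = EMAIL.subn("[EMAIL]", text)
    masked, n_ssn = US_SSN.subn("[SSN]", masked)
    return masked, n_email + n_ssn


masked, count = mask_pii("Mail jane.doe@example.com, SSN 123-45-6789.")
print(masked)  # Mail [EMAIL], SSN [SSN].
print(count)   # 2
```

Returning the substitution count alongside the masked text makes it straightforward to emit a telemetry event whenever masking actually fires.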
Example Prompts
- "Analyze our current chatbot's vulnerabilities to prompt injection and suggest a list of input screening controls to implement using the guard skill."
- "Help me draft a policy scope document for a health-tech AI assistant, ensuring we cover compliance for HIPAA and standard medical disclaimer requirements."
- "Set up a dashboard monitoring strategy to track false-positive rates for our moderation filters across three different geographic regions."
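The monitoring prompt above depends on a false-positive metric. One common definition, assumed here, is the share of benign requests that a filter blocked, computed per region from human-reviewed samples. The record layout `(region, blocked, violating)` and the function name are hypothetical.

```python
from collections import defaultdict


def false_positive_rates(records):
    """Compute per-region false-positive rate: blocked benign / all benign.

    Each record is (region, blocked, violating), where `violating` is the
    human-reviewed ground truth. Regions with no benign samples are omitted.
    """
    blocked_benign = defaultdict(int)
    total_benign = defaultdict(int)
    for region, blocked, violating in records:
        if violating:
            continue  # true violations do not count toward false positives
        total_benign[region] += 1
        if blocked:
            blocked_benign[region] += 1
    return {r: blocked_benign[r] / total_benign[r] for r in total_benign}


sample = [
    ("EU", True, False), ("EU", False, False), ("EU", False, False), ("EU", False, False),
    ("US", True, False), ("US", False, False), ("US", True, True),
    ("APAC", False, False),
]
print(false_positive_rates(sample))  # {'EU': 0.25, 'US': 0.5, 'APAC': 0.0}
```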
Tips & Limitations
- Defense in Depth: Never rely on a single classifier. Always combine input screening with output monitoring and tool-calling sandboxes to ensure comprehensive safety.
- Latency Trade-offs: High-security filtering can introduce latency. Always define your latency budget early in the design phase to avoid degradation of user experience.
- Human Review: The most effective systems include a human-in-the-loop component. Use the monitoring and appeals stage to refine your policies based on actual borderline cases.
- Silent Failures: Avoid silent failures. Ensure every block or rewrite action is logged with telemetry so you can investigate and iterate on your policy triggers.
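The tips above on layering filters, budgeting latency, and logging every block can be combined in one small sketch. Everything here is an assumption for illustration: the 50 ms budget, the event fields, and the helper names `run_filter` and `run_stack` are hypothetical and not part of any OpenClaw API.

```python
import json
import logging
import time
from typing import Callable

logger = logging.getLogger("guardrails")

LATENCY_BUDGET_MS = 50.0  # hypothetical per-filter budget


def run_filter(name: str, check: Callable[[str], bool], text: str) -> dict:
    """Run one filter and return a telemetry event; log blocks and overruns."""
    start = time.perf_counter()
    allowed = check(text)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    event = {
        "filter": name,
        "allowed": allowed,
        "latency_ms": round(elapsed_ms, 3),
        "over_budget": elapsed_ms > LATENCY_BUDGET_MS,
    }
    # The raw text is deliberately not logged, since it may contain PII.
    if not allowed:
        logger.warning("guard_block %s", json.dumps(event))
    elif event["over_budget"]:
        logger.warning("guard_slow %s", json.dumps(event))
    return event


def run_stack(filters, text: str):
    """Apply filters in order, stopping at the first block.

    Returns (allowed, events) so callers can forward events to monitoring.
    """
    events = []
    for name, check in filters:
        event = run_filter(name, check, text)
        events.append(event)
        if not event["allowed"]:
            break
    return all(e["allowed"] for e in events), events


filters = [
    ("length", lambda t: len(t) < 1000),
    ("blocklist", lambda t: "forbidden" not in t.lower()),
]
allowed, events = run_stack(filters, "a Forbidden word")
print(allowed)  # False
```

Because every decision yields an event, a block can never happen silently, and the same events can feed the dashboards used to tune policy triggers.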
Metadata
Paste this into your clawhub.json to enable this plugin.
{
  "plugins": {
    "official-clawkk-guard": {
      "enabled": true,
      "auto_update": true
    }
  }
}
Tags(AI)
Related Skills
data-move
Deep data migration workflow—scope, mapping, validation, batching and ordering, dual-write and cutover, rollback, and reconciliation. Use when moving tenants, bulk backfills, or changing stores without losing trust in data correctness.
data-model
Deep data modeling workflow—grain, facts and dimensions, keys, slowly changing dimensions, normalization trade-offs, and analytics query patterns. Use when designing warehouse/analytics models or reviewing star/snowflake schemas.
prompts
Deep prompt engineering workflow—task spec, constraints, examples, evaluation sets, iteration protocol, regression testing, and safety alignment. Use when improving LLM outputs, shipping prompt changes, or building reusable prompt templates.
客诉处理
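Provides practical, ready-to-apply guidance and SOPs for customer complaint handling. Use when doing customer complaint handling work.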
cost-opt
Cloud cost review: rightsizing, reservations, waste. Use when reducing infra spend.