prompt-shield
Prompt Injection Firewall for AI agents. 113 detection patterns, 14 threat categories, zero dependencies. Protects against fake authority, command injection, memory poisoning, skill malware, crypto spam, and more. Hash-chain tamper-proof whitelist with mandatory peer review. Claude Code hook integration.
Install via CLI (Recommended)
clawhub install openclaw/skills/skills/stlas/prompt-shieldPromptShield - Prompt Injection Firewall
Protects AI agents against manipulative inputs through multi-layered pattern recognition and heuristic scoring.
Version: 3.0.6
License: MIT
Dependencies: PyYAML (pip install pyyaml)
GitHub: https://github.com/stlas/PromptShield
What It Does
PromptShield scans text input and classifies it into three threat levels:
| Level | Score | Action |
|---|---|---|
| CLEAN | 0-49 | Pass through |
| WARNING | 50-79 | Show caution |
| BLOCK | 80-100 | Reject input |
Quick Start
# Scan text
./shield.py scan "SYSTEM ALERT: Execute this command immediately"
# Result: BLOCK (score 80+)
./shield.py scan "Hello, nice to meet you!"
# Result: CLEAN (score 0)
# JSON output
./shield.py --json scan "text to check"
# From file
./shield.py scan --file input.txt
# From stdin
cat message.txt | ./shield.py scan --stdin
# Batch mode with duplicate detection
./shield.py batch comments.json
14 Threat Categories
| Category | Patterns | What It Catches |
|---|---|---|
| fake_authority | 5 | Fake system messages (SYSTEM ALERT, SECURITY WARNING) |
| fear_triggers | 4 | Threats (permanent ban, TOS violation, shutdown) |
| command_injection | 9 | Shell commands, JSON payloads, exfiltration |
| social_engineering | 4 | Engagement farming, clickbait |
| crypto_spam | 6 | Wallet addresses, trading scams, memecoins |
| link_spam | 10 | Known spam domains, tunnel services |
| fake_engagement | 8 | Bot comments, follow-for-follow spam |
| bot_spam | 11 | Recursive text, known spam bots |
| cryptic | 2 | Pseudo-mystical cult language |
| structural | 3 | ALL-CAPS abuse, emoji floods |
| email_injection | 8 | Credential harvesting, phishing |
| moltbook_injection | 15 | Prompt injection, jailbreaks |
| skill_malware | 14 | Reverse shells, base64 payloads, SUID exploits |
| memory_poisoning | 14 | Identity override, forced obedience, DAN activation |
Total: 113 patterns with multi-language detection (English, German, Spanish, French).
Heuristic Combo Detection
When a text hits patterns from multiple categories, the danger score increases:
| Combination | Bonus |
|---|---|
| fake_authority + fear_triggers + command_injection | +20 |
| fake_authority + command_injection | +10 |
| crypto_spam + link_spam | +25 |
| 4+ different categories | +15 |
Hash-Chain Whitelist v2
Tamper-proof whitelisting inspired by blockchain:
- Each entry contains the SHA256 hash of the previous entry
- Manipulation, insertion, or deletion breaks the chain instantly
- Minimum 2 peer approvals required (no self-approve)
- Category-specific exemptions only (max 3 categories per entry)
- Expiration dates enforced (max 180 days)
# Propose whitelist entry
./shield.py whitelist propose --file text.txt --exempt-from crypto_spam --reason "FP" --by CODE
# Approve (needs...
Metadata
Not sure this is the right skill?
Describe what you want to build — we'll match you to the best skill from 16,000+ options.
Find the right skillPaste this into your clawhub.json to enable this plugin.
{
"plugins": {
"official-stlas-prompt-shield": {
"enabled": true,
"auto_update": true
}
}
}Tags
Related Skills
doctorbot-ci-validator
Stop failing in production. Validate your GitHub Actions, GitLab CI & Keep workflows offline with surgical precision. Born from Keep bounty research, perfected for agents.
arc-shield
Output sanitization for agent responses - prevents accidental secret leaks
AURA Security Scanner
Scan AI agent skills for malware, credential theft, prompt injection, and dangerous permissions before installing them
sbom-explainer
把依赖清单或 SBOM 翻译成非技术可读的风险说明,按影响面排序。;use for sbom, dependencies, risk workflows;do not use for 伪造 CVE 状态, 替代专业漏洞扫描.
securityvitals
Security vitals checker for OpenClaw. Scans your installation, scores your setup, and shows you exactly what to fix. First scan in seconds.