What This Skill Does

The Sanitize skill by AgentWard is a zero-dependency utility that detects and redacts personally identifiable information (PII) and other sensitive values in text-based files. It replaces each sensitive item, such as a Social Security number, credit card number, API key, or medical license number, with a numbered placeholder (e.g., [CREDIT_CARD_1]). Files are scanned locally, and raw PII is never written to standard output during normal operation, which helps organizations meet compliance requirements and protect user privacy.
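The numbered-placeholder idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the skill's actual implementation: the three regular expressions, the redact function, and the choice to reuse one placeholder for repeated values are all hypothetical, and real detection rules are likely more thorough.

```python
import re

# Illustrative patterns only (assumptions); the skill's real rules are not
# documented here and cover more categories.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CREDIT_CARD": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(text):
    """Replace each match with a numbered placeholder such as [EMAIL_1].

    Returns the sanitized text and a placeholder -> original value map.
    A value that appears more than once reuses the same placeholder.
    """
    entity_map = {}
    for category, pattern in PATTERNS.items():
        seen = {}  # original value -> placeholder, tracked per category

        def substitute(match, category=category, seen=seen):
            value = match.group(0)
            if value not in seen:
                placeholder = f"[{category}_{len(seen) + 1}]"
                seen[value] = placeholder
                entity_map[placeholder] = value
            return seen[value]

        text = pattern.sub(substitute, text)
    return text, entity_map


sanitized, entity_map = redact("Pay with 4111 1111 1111 1111 or mail a@example.com")
print(sanitized)  # Pay with [CREDIT_CARD_1] or mail [EMAIL_1]
```

Only the sanitized string is printed. The returned map plays the role of a placeholder-to-value lookup, which is why any file holding such a mapping has to be treated as sensitive as the original input.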

Installation

To install this skill, use the following command in your terminal or via the OpenClaw management console:

clawhub install openclaw/skills/skills/agentward-ai/sanitize

Make sure Python 3.x is installed in your environment, because the skill runs directly as Python scripts. It has no external dependencies, which makes it suitable for air-gapped environments and highly secure CI/CD pipelines.

Use Cases

This tool is aimed at developers, data scientists, and security professionals who handle unstructured text files that may inadvertently contain sensitive information. Common use cases include:

Sanitizing server logs before sharing them with debugging teams.
Preparing datasets for AI model fine-tuning by stripping real customer identifiers.
Redacting patient or client notes before uploading documents to cloud storage or collaboration platforms.
Scanning configuration files for leaked API keys or secret credentials before committing to version control.

Example Prompts

"Agent, please sanitize the customer-logs.txt file located in my home directory and save the result as sanitized-logs.txt, keeping only email and phone number categories."
"Can you run a preview scan on the file client-records.md? I want to see which PII categories are detected without exposing the raw data values."
"Run a full sanitization on project-report.txt, generate the JSON report, and ensure all 15 supported PII categories are caught in the output file clean-report.txt."

Tips & Limitations

Safety First: Never read the raw input file directly. The tool is designed to work via the --output flag. Always treat your source files as toxic and interact only with the sanitized output.
Sidecar Files: When you use the --output flag, a *.entity-map.json file is created. Do not read or share this file, as it maps the placeholders back to the sensitive data.
Performance: The zero-dependency implementation keeps the tool lightweight, but extremely large text files may still require significant system memory. If performance degrades, split the input into smaller chunks and process each chunk separately.
False Positives: While the tool is highly accurate, context-dependent patterns (like specific alphanumeric sequences) may occasionally trigger false positives. Always verify the output if your use case involves sensitive medical or legal documents.
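For the Performance tip above, one way to prepare smaller chunks is to split a large file on line boundaries before sanitizing each piece. The sketch below is an assumption-laden helper, not part of the skill: the split_file function, the lines_per_chunk parameter, and the .partNNNN naming scheme are all hypothetical. It streams the input in binary mode so that the content is never printed and line endings and encoding are left untouched.

```python
from pathlib import Path


def split_file(source, lines_per_chunk=50_000):
    """Split `source` into numbered chunk files on line boundaries.

    The input is streamed line by line, so memory use does not grow with
    file size. Returns the chunk paths in order.
    """
    source = Path(source)
    chunk_paths = []
    out = None
    try:
        with source.open("rb") as src:
            for index, line in enumerate(src):
                # Start a new chunk file every `lines_per_chunk` lines.
                if index % lines_per_chunk == 0:
                    if out is not None:
                        out.close()
                    number = len(chunk_paths) + 1
                    chunk_path = source.with_name(
                        f"{source.stem}.part{number:04d}{source.suffix}"
                    )
                    chunk_paths.append(chunk_path)
                    out = chunk_path.open("wb")
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return chunk_paths
```

Each chunk still contains raw data and should be handled as carefully as the original file. Placeholder numbering may also restart in every chunk, depending on how the tool assigns numbers, so chunked output may need extra care if placeholders must stay consistent across the whole document.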
