ClawKit Logo
ClawKitReliability Toolkit
Back to Registry
Official Verified developer tools Safety 4/5

observer-effect-probe

Helps detect skills that behave differently when they sense they are being monitored — catching the class of evasion where conditional activation logic specifically targets attestation and sandbox environments.

skill-install — Terminal

Install via CLI (Recommended)

clawhub install openclaw/skills/skills/andyxinweiminicloud/observer-effect-probe
Or

What This Skill Does

The observer-effect-probe is a sophisticated diagnostic tool designed for the OpenClaw agent ecosystem. It addresses the 'observer effect' in software security: the phenomenon where malicious or evasive skills alter their behavior when they detect they are being inspected. By minimizing the observable differences between sandbox environments and production environments, this probe helps security researchers and developers identify skills that use environment detection to hide payloads or suppress malicious activity during attestation.

Installation

To integrate this probe into your environment, run the following command in your terminal: clawhub install openclaw/skills/skills/andyxinweiminicloud/observer-effect-probe Ensure you have the latest version of the ClawKit CLI installed to facilitate proper skill execution.

Use Cases

This skill is primarily intended for security auditing and agent integrity validation. Use it when you need to verify that a third-party agent skill is not employing conditional logic to bypass security checks. It is particularly effective when testing skills that interact with external APIs or system-level processes, where evasion tactics like sandbox fingerprinting are most common. Use this tool during pre-deployment security reviews to ensure your agent ecosystem remains hardened against adversarial code.

Example Prompts

  1. "Run an observer-effect-probe on the agent skill 'data-fetcher-v2' and analyze if it shifts output based on hostname variations."
  2. "Assess the target skill for sandbox fingerprinting attempts; report any observed API pattern detection or timing sensitivity."
  3. "Execute a multi-stage validation check on the suspicious plugin to see if it remains consistent across varied uptime and network access levels."

Tips & Limitations

The observer-effect-probe is a powerful tool but requires careful interpretation. It does not guarantee the total absence of evasion; instead, it raises the bar for skill developers. Always run this probe in an isolated container to prevent accidental trigger of malicious behavior. Note that highly advanced skills may still detect the probe itself if they utilize side-channel analysis of the system hardware. For best results, rotate your environment fingerprints frequently to confuse defensive routines within the target skill. Be aware that false positives may occur if a legitimate skill is poorly written and inadvertently checks for its execution environment for non-malicious reasons, such as localization or resource optimization.

Metadata

Stars4473
Views1
Updated2026-05-01
View Author Profile
AI Skill Finder

Not sure this is the right skill?

Describe what you want to build — we'll match you to the best skill from 16,000+ options.

Find the right skill
Add to Configuration

Paste this into your clawhub.json to enable this plugin.

{
  "plugins": {
    "official-andyxinweiminicloud-observer-effect-probe": {
      "enabled": true,
      "auto_update": true
    }
  }
}

Tags(AI)

#security#attestation#sandbox#evasion
Safety Score: 4/5

Flags: code-execution, network-access

Related Skills

delta-disclosure-auditor

Helps verify that skill updates publish an auditable record of what changed — catching the gap between "the registry shows the new version" and "anyone can see what the new version changed relative to the old one." v1.1 adds risk-class binding, chain-of-custody verification, and update eligibility assessment.

andyxinweiminicloud 4473

capability-composition-analyzer

Helps identify dangerous capability combinations that emerge when agent skills are composed — catching the class of risk where no individual skill is harmful but their intersection creates an exfiltration or compromise path.

andyxinweiminicloud 4473

transparency-log-auditor

Helps verify that skill signing events are recorded in an independently auditable transparency log — catching the class of trust failures where a registry operator can silently rewrite history without detection.

andyxinweiminicloud 4473

behavioral-invariant-monitor

Helps verify that AI agent skills maintain consistent behavioral invariants across repeated executions — detecting the class of threat where a skill behaves safely during initial evaluation but shifts behavior based on execution count, environmental conditions, or delayed activation triggers. v1.3 adds performance fingerprinting (computational complexity drift detection), cryptographic audit trails (hash-chained behavior logs for immutable verification), and risk-proportional monitoring (sampling-based checks to reduce overhead).

andyxinweiminicloud 4473

skill-update-delta-monitor

Helps detect security-relevant changes in AI skills after installation. Tracks deltas between the audited version and current version, flagging updates that expand permissions, add new network endpoints, or alter behavior in ways that bypass install-time security checks.

andyxinweiminicloud 4473