Guardian

One layer in a multi-layer security stack for OpenClaw agents.

Real agent security requires multiple layers: OpenClaw's built-in capability restrictions and approval gates handle what the agent can do. Guardian handles what the agent sees — intercepting malicious inputs before they reach the model.

Guardian provides signature-based pre-model scanning for prompt injection, credential exfiltration attempts, tool abuse patterns, and social engineering attacks. It is not a complete security solution on its own. Use it alongside OpenClaw's tool allowlists, approval gates, and sandboxed execution for defense-in-depth.

Guardian provides two scanning modes:

Real-time pre-scan — checks each incoming message before it reaches the model
Batch scan — periodic sweep of workspace files and conversation logs

All data stays local. Cron setup is optional via scripts/onboard.py --setup-crons.

Scan results are stored in a SQLite database (guardian.db).
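The schema of guardian.db is not described here, so the following minimal sketch only lists whatever tables exist by querying SQLite's built-in sqlite_master catalog. The helper name list_tables and the path in the usage comment are illustrative assumptions, not part of Guardian.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in a SQLite file, sorted alphabetically."""
    conn = sqlite3.connect(str(db_path))
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Example (assumed default location; adjust if db_path is set in config.json):
#   list_tables("/home/me/.openclaw/workspace/guardian.db")
```

Because sqlite3.connect creates a missing file, it is worth checking that the path exists before pointing this at a real workspace.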

Installation

cd ~/.openclaw/skills/guardian
./install.sh

Install mechanism and review

This package includes executable scripts (including install.sh) and Python modules. Review install.sh before running it in production. install.sh performs local setup and validation; the optional helper onboard.py handles cron setup and is opt-in.

Onboarding checklist

Optional: python3 scripts/onboard.py --setup-crons (scanner/report/digest crons)
python3 scripts/admin.py status (confirm running)
python3 scripts/admin.py threats (confirm signatures loaded; should show 0/blocked)
Optional: review config.json scan_paths and threshold for your environment

First-load / self-activation

After install.sh completes, it writes .guardian-activate-pending to the workspace root (~/.openclaw/workspace/.guardian-activate-pending). When OpenClaw detects this marker on next load, it triggers onboard.py automatically for the self-activation flow. The marker is removed once onboard.py has run. If you prefer manual onboarding, simply delete the marker before reloading (rm ~/.openclaw/workspace/.guardian-activate-pending).

Scan scope and privacy

Guardian scans configured workspace paths to detect threats. Depending on scan_paths, this can include other skill/config files in your OpenClaw workspace. If you handle sensitive files, set narrow scan_paths in config.json.

Pre-publish safety workflow

Before any clawhub publish, run:

python3 scripts/pre_publish_check.py

If the check exits non-zero, do not publish until issues are fixed. The check respects .clawhubignore and blocks likely secret leaks (including token-like hex strings >24 chars and audit_exports/*.json if included).
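To illustrate the kind of pattern involved, here is a minimal sketch of detecting token-like hex strings. It assumes "token-like" means a run of more than 24 hexadecimal characters; the actual rules in pre_publish_check.py may differ, and the names HEX_TOKEN and find_hex_tokens are hypothetical.

```python
import re

# Assumption: a "token-like" string is a maximal run of 25 or more hex characters.
HEX_TOKEN = re.compile(r"(?<![0-9A-Fa-f])[0-9A-Fa-f]{25,}(?![0-9A-Fa-f])")


def find_hex_tokens(text):
    """Return every hex run longer than 24 characters found in text."""
    return HEX_TOKEN.findall(text)
```

The lookarounds keep the match from starting or ending in the middle of a longer hex run, so each suspicious string is reported once and in full.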

Quick Start

# Check status
python3 scripts/admin.py status

# Report threats from the last 24 hours
python3 scripts/guardian.py --report --hours 24

# Full report
python3 scripts/admin.py report

Admin Commands

python3 scripts/admin.py status          # Current status
python3 scripts/admin.py enable          # Enable scanning
python3 scripts/admin.py disable         # Disable scanning
python3 scripts/admin.py threats         # List detected threats
python3 scripts/admin.py threats --clear # Clear threat log
python3 scripts/admin.py dismiss INJ-004 # Dismiss a signature
python3 scripts/admin.py allowlist add "safe phrase"
python3 scripts/admin.py allowlist remove "safe phrase"
python3 scripts/admin.py update-defs     # Update threat definitions

Add --json to any command for machine-readable output.
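A script can consume the --json output with the standard library alone. The sketch below is one possible wrapper; the name run_json is hypothetical, and since the shape of the JSON is not documented here, it simply returns whatever object the command prints. It also assumes the command exits with status 0 on success.

```python
import json
import subprocess


def run_json(cmd):
    """Run a command and parse its stdout as JSON.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)


# Example, run from the skill directory:
#   status = run_json(["python3", "scripts/admin.py", "status", "--json"])
```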

Python API

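from core.realtime import RealtimeGuard

guard = RealtimeGuard()

# handle_message is an illustrative wrapper name, not part of Guardian.
def handle_message(user_text):
    # Scan the text before it reaches the model; channel names the source.
    result = guard.scan_message(user_text, channel="telegram")
    if guard.should_block(result):
        return guard.format_block_response(result)
    return None  # not blocked; pass the message on to the model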

Environment variables read

GUARDIAN_WORKSPACE (optional workspace override)
OPENCLAW_WORKSPACE (optional fallback workspace override)
GUARDIAN_CONFIG (optional guardian config path)
OPENCLAW_CONFIG_PATH (optional OpenClaw config path)
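The labels "override" and "fallback" suggest a lookup order for the workspace. A minimal sketch of that order follows, assuming GUARDIAN_WORKSPACE wins over OPENCLAW_WORKSPACE and that the final default is ~/.openclaw/workspace (inferred from the marker path mentioned earlier, not confirmed). The names resolve_workspace and DEFAULT_WORKSPACE are hypothetical.

```python
import os
from pathlib import Path

# Assumed final fallback; not confirmed by the Guardian source.
DEFAULT_WORKSPACE = Path.home() / ".openclaw" / "workspace"


def resolve_workspace(env=None):
    """Pick the workspace: GUARDIAN_WORKSPACE, then OPENCLAW_WORKSPACE, then the default."""
    env = os.environ if env is None else env
    for name in ("GUARDIAN_WORKSPACE", "OPENCLAW_WORKSPACE"):
        value = env.get(name)
        if value:  # unset and empty are treated the same way
            return Path(value).expanduser()
    return DEFAULT_WORKSPACE
```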

Configuration

Edit config.json:

| Setting | Description |
|---|---|
| enabled | Master on/off switch |
| severity_threshold | Blocking threshold: low / medium / high / critical |
| scan_paths | Paths to scan (["auto"] for common folders) |
| db_path | SQLite location ("auto" = <workspace>/guardian.db) |
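As a rough illustration of how these settings fit together, the sketch below parses a config, rejects an unknown severity_threshold, and resolves db_path when it is "auto". The function name load_config and the choice to treat a missing db_path as "auto" are assumptions; Guardian's own loader may behave differently.

```python
import json
from pathlib import Path

SEVERITIES = ("low", "medium", "high", "critical")


def load_config(text, workspace):
    """Parse config.json text, validate the threshold, and resolve db_path."""
    cfg = json.loads(text)
    threshold = cfg.get("severity_threshold")
    if threshold not in SEVERITIES:
        raise ValueError(
            f"severity_threshold must be one of {SEVERITIES}, got {threshold!r}"
        )
    # "auto" resolves to <workspace>/guardian.db; a missing key is treated as "auto".
    raw_db = cfg.get("db_path", "auto")
    cfg["db_path"] = Path(workspace) / "guardian.db" if raw_db == "auto" else Path(raw_db)
    return cfg
```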

BL-048: Rate Limiting (`gateway.rateLimit`)

Per-source sliding-window rate limiting. Disabled by default (safe for existing deployments).

| Setting | Default | Description |
|---|---|---|
| gateway.rateLimit.enabled | false | Enable rate limiting |
| gateway.rateLimit.requests_per_minute | 100 | Max requests per source per minute |
| gateway.rateLimit.window_seconds | 60 | Sliding window size |
| gateway.rateLimit.burst_multiplier | 1.5 | Allows short bursts (limit × multiplier) |

Usage via Python API:

scanner = GuardianScanner(record_to_db=False)
result = scanner.check_rate_limit("webhook")  # {"allowed": True/False, ...}
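For readers unfamiliar with the technique, here is a minimal sketch of a per-source sliding-window limiter using the settings above. It is not Guardian's implementation: the class name SlidingWindowLimiter, the "remaining" key, and the interpretation that the cap within one window equals the per-window limit times burst_multiplier are all assumptions.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    def __init__(self, requests_per_minute=100, window_seconds=60, burst_multiplier=1.5):
        # Assumption: scale the per-minute limit to the window, then apply the burst factor.
        per_window = requests_per_minute * window_seconds / 60
        self.cap = int(per_window * burst_multiplier)
        self.window = window_seconds
        self.hits = defaultdict(deque)  # source -> timestamps of accepted requests

    def check(self, source, now=None):
        """Record a request from source and report whether it is allowed."""
        now = time.monotonic() if now is None else now
        q = self.hits[source]
        # Drop timestamps that have slid out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.cap:
            return {"allowed": False, "remaining": 0}
        q.append(now)
        return {"allowed": True, "remaining": self.cap - len(q)}
```

Each source keeps its own deque, so a noisy webhook does not consume the budget of other channels.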

BL-049: Tool Allowlist (`tools`)

Validates tool invocations against a permitted list. Empty list = all tools allowed (safe default). Supports * glob wildcards.

| Setting | Default | Description |
|---|---|---|
| tools.allowlist | [] | Permitted tool names. Empty = all allowed |
| tools.allowlist_mode | "warn" | "warn" (log only) or "block" (enforce) |
| tools.case_sensitive | false | Case-sensitive tool name matching |

Example:

"tools": {
  "allowlist": ["read", "write", "web_*"],
  "allowlist_mode": "block"
}

Usage:

result = scanner.check_tool("exec", channel="telegram")  # {"allowed": False, ...}
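The matching rules can be approximated with Python's fnmatch module. This is a sketch under stated assumptions rather than Guardian's code: the function name match_tool and the "violation" key are hypothetical, and fnmatchcase also interprets ? and [seq], which Guardian's * wildcard support may not.

```python
from fnmatch import fnmatchcase


def match_tool(name, allowlist, mode="warn", case_sensitive=False):
    """Check a tool name against an allowlist with * glob patterns."""
    if not allowlist:
        # Empty list means every tool is permitted.
        return {"allowed": True, "violation": False}
    candidate = name if case_sensitive else name.lower()
    for pattern in allowlist:
        pat = pattern if case_sensitive else pattern.lower()
        if fnmatchcase(candidate, pat):
            return {"allowed": True, "violation": False}
    # Not on the list: "warn" only records the violation, "block" enforces it.
    return {"allowed": mode != "block", "violation": True}
```

With the example config above, match_tool("web_search", ["read", "write", "web_*"], mode="block") is allowed, while "exec" is rejected.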

BL-050: Audit Logging (`logging`)

Writes structured JSON security events to a rotating log file. Disabled by default.

| Setting | Default | Description |
|---|---|---|
| logging.audit | false | Enable audit log |
| logging.audit_log_path | "auto" | Log file path ("auto" = <workspace>/guardian-audit.log) |
| logging.audit_max_bytes | 10485760 | Max file size before rotation (10 MB) |
| logging.audit_backup_count | 5 | Number of rotated files to keep |

Events logged: BLOCK, OVERRIDE, CONFIG_CHANGE, TOOL_VIOLATION, RATE_LIMIT.
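A rotating JSON log of this kind can be built with the standard library's RotatingFileHandler. The sketch below shows one way to do it, with defaults mirroring the table; the function names and the record fields ("ts", "event") are assumptions, and Guardian's actual log format is not specified here.

```python
import json
import logging
import time
from logging.handlers import RotatingFileHandler


def make_audit_logger(path, max_bytes=10485760, backup_count=5):
    """Create a logger that writes one JSON object per line and rotates by size."""
    logger = logging.getLogger(f"audit-sketch:{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep audit events out of the root logger
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = RotatingFileHandler(
            path, maxBytes=max_bytes, backupCount=backup_count, encoding="utf-8"
        )
        handler.setFormatter(logging.Formatter("%(message)s"))
        logger.addHandler(handler)
    return logger


def audit(logger, event, **fields):
    """Write a structured event such as BLOCK or TOOL_VIOLATION."""
    record = {"ts": time.time(), "event": event, **fields}
    logger.info(json.dumps(record, sort_keys=True))
```

Usage would look like audit(make_audit_logger("guardian-audit.log"), "BLOCK", signature="INJ-004").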

Review Guardian's hardening posture:

python3 scripts/admin.py config-review

How It Works

Guardian loads threat signatures from definitions/*.json files. Each signature has an ID, regex pattern, severity level, and category. Incoming text is matched against all active signatures. Matches above the configured severity threshold are blocked and logged to the database.

Signatures cover: prompt injection, credential patterns (API keys, tokens), data exfiltration attempts, tool abuse patterns, and social engineering tactics.
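The matching step described above can be sketched in a few lines. Everything specific here is an assumption for illustration: the signature below is invented, the dictionary keys are guesses at the JSON field names, matching is case-insensitive, and a match is treated as blocking when its severity is at or above the threshold.

```python
import re

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}

# Hypothetical signature; real ones are loaded from definitions/*.json.
SIGNATURES = [
    {
        "id": "EX-001",
        "pattern": r"ignore (all )?previous instructions",
        "severity": "high",
        "category": "prompt_injection",
    },
]


def scan(text, signatures, threshold="medium"):
    """Match text against every signature and decide whether to block."""
    hits = [s for s in signatures if re.search(s["pattern"], text, re.IGNORECASE)]
    blocked = any(
        SEVERITY_RANK[s["severity"]] >= SEVERITY_RANK[threshold] for s in hits
    )
    return {"blocked": blocked, "matches": [s["id"] for s in hits]}
```

A match below the threshold still appears in "matches", so it can be logged without blocking the message.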

Source Trust Levels

Guardian assigns every scan a trust level based on the source channel and message role. There are four levels:

0 – internal (cron jobs, workspace files, system prompts): never blocked
1 – owner (Telegram): flagged for review but never blocked
2 – semi-trusted (email, unknown sources): blocked only when the threat score reaches 70 or above
3 – external (webhooks): blocked at a lower threshold of 50 or above

The role of the message can adjust the effective trust level: system messages shift one step toward internal, while tool results shift one step toward external. This prevents false positives on internal/cron content that may legitimately reference injection-like phrases (for example, in log output or documentation).
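The trust rules above can be expressed as a small decision function. This is a sketch, not Guardian's code: the channel names in BASE_LEVEL, the role labels "system" and "tool", the clamping of the adjusted level to the range 0 to 3, and all function names are assumptions.

```python
# Illustrative channel-to-level mapping; Guardian's real channel names may differ.
BASE_LEVEL = {"cron": 0, "workspace": 0, "telegram": 1, "email": 2, "webhook": 3}

# Score at which each level blocks; levels 0 and 1 never block.
BLOCK_AT = {2: 70, 3: 50}


def effective_level(channel, role="user"):
    """Apply the role adjustment to the channel's base trust level."""
    level = BASE_LEVEL.get(channel, 2)  # unknown sources are semi-trusted
    if role == "system":
        level -= 1  # one step toward internal
    elif role == "tool":
        level += 1  # one step toward external
    return max(0, min(3, level))  # assumed clamp to the four defined levels


def should_block(channel, score, role="user"):
    """Return True when the threat score meets the threshold for the effective level."""
    threshold = BLOCK_AT.get(effective_level(channel, role))
    return threshold is not None and score >= threshold
```

Under these assumptions, an email scoring 60 is allowed as a user message but blocked as a tool result, because the role shift moves it to the external threshold of 50.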
