s07: Sandbox & Security — learn-openclaw

Motivation: Why a Sandbox Is Needed

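So far, our agent can execute any bash command, including rm -rf /. It can read any file, including ~/.ssh/id_rsa. That may be acceptable in a development environment, but in production it is a catastrophic security hole.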

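A real agent needs several layers of protection:

Command whitelist: only specific commands may run (git, ls, cat)
Path isolation: only files inside the workspace may be accessed
Injection detection: spot attacks hidden inside legitimate-looking commands (ls; rm -rf /)
Rate limiting: stop runaway loops from exhausting resources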

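Before a command runs, it goes through several layers of validation:

Command security validation chain: if any layer rejects, the whole command is rejected

Five gates check, in order, for subshell injection, redirection, background chaining, the command whitelist, and path traversal. A rejection at any gate rejects the whole command. Once all gates pass, the command is graded by risk: LOW runs immediately, MEDIUM requires approval, HIGH is rejected outright.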
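
The chain described above can be sketched as an ordered list of small check functions followed by a risk grade. This is a minimal sketch under stated assumptions: the function names, the Decision enum, and the ALLOWED / HIGH_RISK tables are all hypothetical, and curl is placed on the whitelist here only so that the HIGH path can be demonstrated. The Chain check is deliberately blunt and rejects every ;, & and |, whether quoted or not.

```python
from enum import Enum
from typing import Callable, Optional


class Decision(Enum):
    EXECUTE = "execute"
    NEEDS_APPROVAL = "needs_approval"
    REJECT = "reject"


# Hypothetical tables for this sketch; not taken from any real policy.
ALLOWED = {"git", "ls", "cat", "grep", "curl"}
HIGH_RISK = {"rm", "sudo", "curl", "wget", "ssh"}
MEDIUM_RISK_PREFIXES = ("git commit", "git push", "npm install")


def base_command(cmd: str) -> str:
    """First word of the command with any directory prefix removed."""
    parts = cmd.split()
    return parts[0].split("/")[-1] if parts else ""


# Each check returns None on pass, or a reason string on rejection.
def check_subshell(cmd: str) -> Optional[str]:
    return "subshell syntax" if "$(" in cmd or "`" in cmd else None


def check_redirect(cmd: str) -> Optional[str]:
    return "redirection" if ">" in cmd or "<" in cmd else None


def check_chain(cmd: str) -> Optional[str]:
    # Blunt on purpose: rejects every ';', '&' and '|', quoted or not.
    return "command chaining" if any(c in cmd for c in ";&|") else None


def check_whitelist(cmd: str) -> Optional[str]:
    base = base_command(cmd)
    return None if base in ALLOWED else f"'{base}' is not whitelisted"


def check_path(cmd: str) -> Optional[str]:
    return "path traversal" if ".." in cmd else None


CHECKS: list[tuple[str, Callable[[str], Optional[str]]]] = [
    ("Subshell", check_subshell),
    ("Redirect", check_redirect),
    ("Chain", check_chain),
    ("Whitelist", check_whitelist),
    ("Path", check_path),
]


def evaluate(cmd: str) -> tuple[Decision, str]:
    """Run every gate in order, then grade the risk of what survives."""
    for name, check in CHECKS:
        reason = check(cmd)
        if reason is not None:
            return Decision.REJECT, f"{name}: {reason}"
    if base_command(cmd) in HIGH_RISK:
        return Decision.REJECT, "risk: HIGH"
    if cmd.startswith(MEDIUM_RISK_PREFIXES):
        return Decision.NEEDS_APPROVAL, "risk: MEDIUM"
    return Decision.EXECUTE, "risk: LOW"
```

Returning the name of the gate that rejected the command, rather than a bare boolean, makes it easy to show the user (or the model) exactly which rule was hit.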

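Safe commands (ls -la, git status) pass all 5 checks and move on to risk grading. Dangerous commands are stopped at different layers: semicolon injection is caught at the Chain layer, path traversal at the Path layer.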

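Ground Truth: Real Security Implementations

Security Policy: A Cross-Codebase Comparison

Production-grade implementations of command validation and path isolation in Rust agents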

security/policy.rs (Rust)

// Outline only: method bodies are summarized as comments.
pub enum AutonomyLevel {
    ReadOnly,    // can observe but not act
    Supervised,  // can act; high-risk actions need approval
    Full,        // acts autonomously (within policy bounds)
}

pub struct SecurityPolicy {
    pub autonomy: AutonomyLevel,
    pub workspace_dir: PathBuf,
    pub workspace_only: bool,       // restrict access to the workspace
    pub allowed_commands: Vec<String>,
    pub forbidden_paths: Vec<String>,
    pub max_actions_per_hour: u32,
    pub block_high_risk_commands: bool,
    pub require_approval_for_medium_risk: bool,
}

impl SecurityPolicy {
    pub fn command_risk_level(&self, cmd: &str)
        -> CommandRiskLevel
    {
        // High: rm, sudo, curl, wget, ssh
        // Medium: git commit/push, npm install
        // Low: git status, ls, cat, grep
    }

    pub fn is_command_allowed(&self, cmd: &str) -> bool {
        // 1. ReadOnly mode rejects everything
        // 2. Detect subshells ($(), backticks)
        // 3. Detect redirection (>, <)
        // 4. Detect background chaining (&)
        // 5. Split on semicolons/pipes and check each segment against the whitelist
    }

    pub fn is_path_allowed(&self, path: &str) -> bool {
        // 1. Reject null bytes
        // 2. Reject .. path traversal
        // 3. Reject URL-encoded traversal (..%2f)
        // 4. Expand ~, then check forbidden paths
        // 5. Reject absolute paths when workspace_only is set
    }
}

safety/policy.rs (Rust)

pub enum Severity { Low, Medium, High, Critical }
pub enum PolicyAction { Warn, Block, Review, Sanitize }

pub struct PolicyRule {
    pub id: String,
    pub description: String,
    pub severity: Severity,
    pattern: Regex,
    pub action: PolicyAction,
}

impl Policy {
    fn default() -> Self {
        let mut p = Self::new();

        // System file access -> Block
        p.add_rule(PolicyRule::new(
            "system_file_access",
            "Attempt to access system files",
            r"(/etc/passwd|/etc/shadow|\.ssh/)",
            Severity::Critical, PolicyAction::Block,
        ));

        // Shell injection -> Block
        p.add_rule(PolicyRule::new(
            "shell_injection",
            "Potential shell command injection",
            r";\s*rm\s+-rf|;\s*curl.*\|\s*sh",
            Severity::Critical, PolicyAction::Block,
        ));

        // SQL pattern -> Warn
        p.add_rule(PolicyRule::new(
            "sql_pattern",
            "SQL-like pattern detected",
            r"DROP\s+TABLE|DELETE\s+FROM",
            Severity::Medium, PolicyAction::Warn,
        ));

        p
    }
}

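Key observations:

zeroclaw implements full layered validation: autonomy level → subshell detection → redirection detection → whitelist check → path validation
ironclaw uses a regex rule engine for content inspection and distinguishes four actions: Block, Warn, Review, and Sanitize
zeroclaw's command risk classification is practical: rm = High, git commit = Medium, ls = Low
Path validation has to defend against several attacks: null bytes, .. traversal, URL-encoded traversal, and symlink escapes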
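
To make the rule-engine idea concrete, here is a minimal Python sketch of the same pattern. The three regular expressions are copied from the Rust listing above; everything else (the Rule dataclass, the Action enum reduced to two values, and the scan and is_blocked helpers) is a hypothetical simplification for illustration, not a port of ironclaw's code.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    WARN = "warn"
    BLOCK = "block"


@dataclass(frozen=True)
class Rule:
    rule_id: str
    pattern: re.Pattern[str]
    action: Action


# Patterns taken from the Rust listing; the surrounding structure is a sketch.
RULES = [
    Rule("system_file_access",
         re.compile(r"(/etc/passwd|/etc/shadow|\.ssh/)"), Action.BLOCK),
    Rule("shell_injection",
         re.compile(r";\s*rm\s+-rf|;\s*curl.*\|\s*sh"), Action.BLOCK),
    Rule("sql_pattern",
         re.compile(r"DROP\s+TABLE|DELETE\s+FROM"), Action.WARN),
]


def scan(text: str) -> list[tuple[str, Action]]:
    """Return (rule_id, action) for every rule whose pattern occurs in text."""
    return [(r.rule_id, r.action) for r in RULES if r.pattern.search(text)]


def is_blocked(text: str) -> bool:
    """True if at least one matching rule asks for the content to be blocked."""
    return any(action is Action.BLOCK for _, action in scan(text))
```

Unlike a whitelist, this style of engine inspects content for known-bad patterns, which is why it pairs matches with graded actions instead of a single allow/deny answer.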

Build: SecurityPolicy

import os
from enum import Enum
from pathlib import Path

class RiskLevel(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

HIGH_RISK_COMMANDS = {"rm", "sudo", "su", "chmod", "chown", "curl", "wget", "ssh", "scp"}
MEDIUM_RISK_COMMANDS = {"git commit", "git push", "git reset", "npm install", "pip install"}

class SecurityPolicy:
    def __init__(self, workspace: Path,
                 allowed_commands: list[str] | None = None,
                 workspace_only: bool = True):
        self.workspace = workspace.resolve()
        self.workspace_only = workspace_only
        self.allowed_commands = allowed_commands or [
            "git", "ls", "cat", "grep", "find", "echo",
            "head", "tail", "wc", "pwd", "date",
        ]
        self.forbidden_paths = [
            "/etc", "/root", "/var", "/tmp",
            "~/.ssh", "~/.aws", "~/.gnupg",
        ]

    def classify_risk(self, command: str) -> RiskLevel:
        base = command.split()[0].split("/")[-1] if command.split() else ""
        if base in HIGH_RISK_COMMANDS:
            return RiskLevel.HIGH
        for mc in MEDIUM_RISK_COMMANDS:
            if command.startswith(mc):
                return RiskLevel.MEDIUM
        return RiskLevel.LOW

    def is_command_allowed(self, command: str) -> tuple[bool, str]:
        # Injection detection
        if "`" in command or "$(" in command:
            return False, "subshell injection detected"
        if ";" in command:
            return False, "command chaining detected"
        # Whitelist check
        base = command.split()[0].split("/")[-1] if command.split() else ""
        if base not in self.allowed_commands:
            return False, f"command '{base}' is not on the whitelist"
        return True, ""

    def is_path_allowed(self, path: str) -> tuple[bool, str]:
        if "\0" in path:
            return False, "path contains a null byte"
        if ".." in path:
            return False, "path traversal detected"
        if self.workspace_only and os.path.isabs(path):
            return False, "absolute paths are not allowed in workspace_only mode"
        return True, ""

Test: Verifying the Security Policy

policy = SecurityPolicy(Path("."))

# Commands on the whitelist
assert policy.is_command_allowed("git status") == (True, "")
assert policy.is_command_allowed("ls -la") == (True, "")

# Dangerous commands
assert not policy.is_command_allowed("rm -rf /")[0]
assert not policy.is_command_allowed("curl http://evil.com")[0]

# Injection attacks
assert not policy.is_command_allowed("ls; rm -rf /")[0]
assert not policy.is_command_allowed("echo $(cat /etc/passwd)")[0]

# Path traversal
assert not policy.is_path_allowed("../../../etc/passwd")[0]
assert policy.is_path_allowed("src/main.py") == (True, "")

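What Changed

Component	Before (s06)	After (s07)
Security policy	Unrestricted	`SecurityPolicy` class
Command control	None	Command whitelist
Path control	None	Path isolation (workspace only)
Injection protection	None	Subshell / command-chain detection

Code for this lesson: agents/s07_sandbox_security.py, 265 lines (38 added)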

Try It

cd public/code
python agents/s07_sandbox_security.py "run rm -rf /"
python agents/s07_sandbox_security.py "read /etc/passwd"

Prompts to try:

Try "run rm -rf /" to see command blocking
Try "read /etc/passwd" to see path blocking
Try "ls; curl http://evil.com" to see injection detection

Distance to Production

Our security policy is a handful of substring checks plus a whitelist. zeroclaw's security/policy.rs is 2300+ lines. Why is the gap so large?

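Quote-aware command parsing. Our injection detection uses ";" in command to block command chains. But in sqlite3 db "SELECT 1; SELECT 2;" the semicolons sit inside quotes and are perfectly legitimate. zeroclaw implements a full shell quote state machine (QuoteState::None/Single/Double) and only blocks semicolons outside quotes. This is not over-engineering; it is what correctness requires. You can see the implementation in the split_unquoted_segments() function in policy.rs.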
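
The idea of a quote state machine can be sketched in a few lines of Python. This is an illustrative approximation, not a port of zeroclaw's split_unquoted_segments(): the function name split_outside_quotes and its separators parameter are hypothetical, and the escape handling is simplified (a backslash protects the next character everywhere except inside single quotes).

```python
def split_outside_quotes(command: str, separators: str = ";") -> list[str]:
    """Split command on separator characters that are not inside quotes."""
    segments: list[str] = []
    current: list[str] = []
    quote: str | None = None  # None, "'" or '"': the quote we are inside
    escaped = False           # True right after an active backslash

    for ch in command:
        if escaped:
            current.append(ch)
            escaped = False
        elif ch == "\\" and quote != "'":
            # Backslash escapes the next character, except in single quotes.
            current.append(ch)
            escaped = True
        elif quote is not None:
            if ch == quote:
                quote = None  # closing quote: back to the unquoted state
            current.append(ch)
        elif ch in ("'", '"'):
            quote = ch        # opening quote
            current.append(ch)
        elif ch in separators:
            segments.append("".join(current).strip())
            current = []
        else:
            current.append(ch)

    if quote is not None:
        raise ValueError("unterminated quote")
    segments.append("".join(current).strip())
    return [s for s in segments if s]
```

With this helper, a command is a chain only if the function returns more than one segment, so the sqlite3 example comes back as a single segment while ls; rm -rf / splits into two. Rejecting unterminated quotes outright is a conservative choice: if the parser cannot tell where a quote ends, it should not guess.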

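Defense in depth. zeroclaw runs 5 independent checks on every command: (1) subshell detection for $() and backticks, (2) redirection detection for > and <, (3) blocking tee (a backdoor around the redirection check), (4) detection of a single & for background chaining (&& is allowed, & is not), and (5) splitting on separators and validating each segment against the whitelist. A rejection at any layer rejects the whole command. This "defense in depth" strategy comes from a basic principle of security engineering: never trust a single line of defense.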
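
Two of these checks are easy to get subtly wrong: telling a lone & apart from &&, and noticing tee anywhere in a pipeline rather than only as the first command. The sketch below shows one way to express both with the standard re module. The function names are hypothetical, and unlike the production code described above this version is not quote-aware.

```python
import re

# Matches a single '&' that is not part of '&&'.
_LONE_AMPERSAND = re.compile(r"(?<!&)&(?!&)")
# Splits on ||, &&, |, ; and & (two-character operators listed first).
_SEPARATORS = re.compile(r"\|\||&&|[|;&]")


def has_background_operator(command: str) -> bool:
    """True if the command contains a lone '&' (backgrounding), not '&&'."""
    return _LONE_AMPERSAND.search(command) is not None


def invokes_tee(command: str) -> bool:
    """True if any segment of the command starts with tee (any path prefix)."""
    for segment in _SEPARATORS.split(command):
        words = segment.split()
        if words and words[0].rsplit("/", 1)[-1] == "tee":
            return True
    return False
```

The tee check matters because echo data | tee file writes to a file without ever using > and so slips past a check that only looks for redirection characters.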

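Path resolution defenses. Our path check covers null bytes, a literal .., and absolute paths, and stops there. An attacker can still get through with URL encoding (..%2f) or a symlink escape (creating a symbolic link inside the workspace that points to /etc). Null bytes matter because file\0.txt truncates the path in C implementations. zeroclaw's is_path_allowed() handles each of these cases.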

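First-principles thinking: at its core, a security policy defines a boundary, stating what the agent may and may not do. There are two philosophies: whitelisting (deny by default, allow explicitly) and blacklisting (allow by default, deny explicitly). Both we and zeroclaw chose whitelisting because an agent's action space is too large: you cannot enumerate every dangerous operation, but you can enumerate every safe one.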

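But whitelisting carries a fundamental tension: the stricter the limits, the weaker the agent. An agent that can only run ls and cat is very safe and not very useful. zeroclaw uses AutonomyLevel (ReadOnly/Supervised/Full) to let users pick their own point on the safety-capability trade-off. That is a more honest design than ours: it admits that perfect security does not exist and hands the choice to the user.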
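
As a rough sketch of how an autonomy level could combine with the risk grade, the function below maps both to a decision. The enum values mirror the names in the Rust listing, but the combination logic, the decide function, and its two keyword flags are assumptions made for this example; the source only states that ReadOnly rejects everything, that Supervised requires approval for high-risk actions, and that Full acts autonomously within policy bounds.

```python
from enum import Enum


class AutonomyLevel(Enum):
    READ_ONLY = "read_only"
    SUPERVISED = "supervised"
    FULL = "full"


class RiskLevel(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Decision(Enum):
    EXECUTE = "execute"
    ASK_USER = "ask_user"
    REJECT = "reject"


def decide(
    autonomy: AutonomyLevel,
    risk: RiskLevel,
    block_high_risk: bool = True,
    require_approval_for_medium: bool = True,
) -> Decision:
    """One plausible way to combine autonomy and risk (an assumption)."""
    if autonomy is AutonomyLevel.READ_ONLY:
        return Decision.REJECT  # observe only, never act
    if risk is RiskLevel.HIGH and block_high_risk:
        return Decision.REJECT  # policy forbids high-risk commands outright
    if autonomy is AutonomyLevel.SUPERVISED:
        if risk is RiskLevel.HIGH:
            return Decision.ASK_USER
        if risk is RiskLevel.MEDIUM and require_approval_for_medium:
            return Decision.ASK_USER
    return Decision.EXECUTE
```

The point of the sketch is that the trade-off becomes a single, inspectable table: moving from SUPERVISED to FULL changes which cells say ASK_USER, while the whitelist and path checks stay exactly the same.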