s09: Plan Mode — learn-openclaw

Motivation: Why Plan Mode Is Needed

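So far, the agent starts executing the moment it receives a task, acting on whatever it sees. That works for simple tasks, but for complex ones ("refactor this module", "migrate the database schema") diving straight in often leads to detours and rework.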

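Plan Mode has the agent make a plan before it executes:

Analyze the task and list the steps to take
Show the plan to the user for approval
Execute according to the plan once it is approved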
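The three steps above can be sketched as a single function. This is a minimal illustration, not OpenClaw's implementation: `run_with_plan`, `make_plan`, `approve`, `run_step`, and `fake_plan` are hypothetical names, and the planner and approver are simple stand-ins so the example runs without a model or a real user.

```python
from typing import Callable


def run_with_plan(
    task: str,
    make_plan: Callable[[str], list[str]],
    approve: Callable[[list[str]], bool],
    run_step: Callable[[str], str],
) -> list[str]:
    """Plan first, ask for approval, then execute the approved steps in order."""
    steps = make_plan(task)              # 1. analyze the task into steps
    if not approve(steps):               # 2. show the plan and wait for a decision
        return []                        #    rejected: nothing is executed
    return [run_step(s) for s in steps]  # 3. execute only after approval


# Hypothetical stand-in for a model-generated plan.
def fake_plan(task: str) -> list[str]:
    return [f"inspect: {task}", f"apply: {task}"]


results = run_with_plan(
    "rename module",
    make_plan=fake_plan,
    approve=lambda steps: len(steps) <= 5,  # stand-in: auto-approve short plans
    run_step=lambda step: f"done {step}",
)
print(results)
```

Passing the approver in as a callable keeps the gate independent of the interface: a CLI could prompt on stdin, while a chat front end could wait for a button click.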

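Anthropic's Building Effective Agents describes a closely related workflow, the Evaluator-Optimizer pattern, in which a result is generated, evaluated, and refined in a loop. This lesson summarizes that cycle as Plan → Execute → Evaluate → Iterate. OpenClaw has a dedicated Plan Mode, used together with the TodoWrite tool to track progress.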

Ground Truth: The Real Plan Mode

Plan Mode: A Cross-Codebase Comparison

How different implementations handle planning and task tracking

Building Effective Agents (concept), Python

# Evaluator-Optimizer pattern
# One of the five workflow patterns in Anthropic's article.
# `llm` is a placeholder object used for illustration, not a real API.

def evaluator_optimizer(task: str):
    plan = llm.generate_plan(task)

    while True:
        result = llm.execute_plan(plan)
        evaluation = llm.evaluate(result, task)

        if evaluation.is_satisfactory:
            return result

        # Revise the plan based on the evaluation feedback
        plan = llm.revise_plan(plan, evaluation.feedback)

# Key insights:
# - Works best when there are clear evaluation criteria
# - Iterative improvement is more reliable than getting it right in one pass
# - Show the planning steps to the user transparently

src/agents/ (tool catalog), TypeScript

// OpenClaw's tool system (open-source TypeScript)
// src/agents/tool-catalog.ts defines the full tool list,
// which includes TodoWrite-style task management

// src/agents/tools/common.ts: base tool type
interface AnyAgentTool {
  name: string;
  description: string;
  execute: (...args) => Promise<unknown>;
  ownerOnly?: boolean;  // available to the owner only
}

// src/agents/pi-tools.ts:
// createOpenClawCodingTools() registers all tools,
// including file operations, bash, web search,
// memory, cron, gateway, and others

// The src/agents/tools/ directory contains:
// - browser-tool.ts          browser automation
// - memory-tool.ts           memory management
// - cron-tool.ts             scheduled tasks
// - sessions-spawn-tool.ts   child sessions
// - web-fetch.ts / web-search.ts
// - image-tool.ts            image processing

// Permissions are controlled through tool-policy.ts:
// allow/deny lists plus the ownerOnly flag

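Key observations:

Anthropic's Evaluator-Optimizer has an explicit loop: Plan → Execute → Evaluate → Revise
OpenClaw uses TodoWrite for task tracking; each todo has an id, content, and status
TodoWrite is mentioned three times in the system prompt: "repetition drives reliability"
Plan approval gives the user a chance to review before a large number of operations run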

Build: Plan Mode and TodoWrite

from dataclasses import dataclass


@dataclass
class Todo:
    id: str
    content: str
    status: str = "pending"  # pending | in_progress | completed | cancelled


class PlanManager:
    """Manages the agent's task plan."""

    # Checkbox-style markers used by summary(), keyed by status
    MARKERS = {"pending": "[ ]", "in_progress": "[>]",
               "completed": "[x]", "cancelled": "[-]"}

    def __init__(self):
        self.todos: list[Todo] = []

    def add(self, content: str) -> Todo:
        # Ids are sequential strings: "1", "2", ...
        todo = Todo(id=str(len(self.todos) + 1), content=content)
        self.todos.append(todo)
        return todo

    def update(self, todo_id: str, status: str):
        # Set the status of the first todo with a matching id; unknown ids are ignored
        for t in self.todos:
            if t.id == todo_id:
                t.status = status
                break

    def get_pending(self) -> list[Todo]:
        return [t for t in self.todos if t.status == "pending"]

    def summary(self) -> str:
        # One line per todo, e.g. "[x] 1. Read the module"
        lines = [f"{self.MARKERS.get(t.status, '[ ]')} {t.id}. {t.content}"
                 for t in self.todos]
        return "\n".join(lines)

TodoWrite as a Tool:

class TodoWriteTool(Tool):
    name = "todo_write"
    description = "Create or update the task plan. Use before starting a complex task."
    parameters = {
        "type": "object",
        "properties": {
            "todos": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "id": {"type": "string"},
                        "content": {"type": "string"},
                        "status": {"type": "string",
                                   "enum": ["pending", "in_progress",
                                            "completed", "cancelled"]},
                    },
                    "required": ["id", "content", "status"],
                },
            },
        },
        "required": ["todos"],
    }

    def __init__(self, plan_manager: PlanManager):
        self.pm = plan_manager

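    def execute(self, todos: list[dict]) -> str:
        for t in todos:
            existing = next((x for x in self.pm.todos if x.id == t["id"]), None)
            if existing:
                existing.status = t["status"]
                existing.content = t.get("content", existing.content)
            else:
                # Keep the id and status supplied by the caller so that later
                # updates can find this todo again. PlanManager.add() would
                # assign its own id and reset the status to "pending".
                self.pm.todos.append(
                    Todo(id=t["id"], content=t["content"], status=t["status"])
                )
        return f"Updated {len(todos)} todos\n{self.pm.summary()}"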
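To make the upsert-by-id behavior concrete, here is a minimal standalone sketch of how a `todo_write` call might be applied when its arguments arrive as a JSON string. The wire format, the `apply_todo_write` helper, and the dict-based store are assumptions for illustration only; they stand in for `TodoWriteTool` and `PlanManager` so the example runs on its own.

```python
import json


def apply_todo_write(store: dict[str, dict], raw_arguments: str) -> str:
    """Decode JSON tool arguments and insert or overwrite each todo by its id."""
    args = json.loads(raw_arguments)
    for item in args["todos"]:
        store[item["id"]] = {"content": item["content"], "status": item["status"]}
    return f"{len(args['todos'])} todos written, {len(store)} total"


store: dict[str, dict] = {}

# First call: the (hypothetical) model creates a two-step plan.
first = apply_todo_write(store, json.dumps({"todos": [
    {"id": "1", "content": "Read the module", "status": "in_progress"},
    {"id": "2", "content": "Write tests", "status": "pending"},
]}))

# Second call: the same id is sent again, so the entry is updated, not duplicated.
second = apply_todo_write(store, json.dumps({"todos": [
    {"id": "1", "content": "Read the module", "status": "completed"},
]}))

print(first)
print(second)
print(store["1"]["status"])
```

Because the id is the key, resending a todo is idempotent: the plan never grows unless a new id appears.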

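What Changed

Component	Before (s08)	After (s09)
Execution mode	Direct execution	`PlanManager`: plan first, then execute
Task tracking	None	`Todo` class (pending/in_progress/completed/cancelled)
Tools	No planning tool	`TodoWriteTool` creates and updates the plan

Code for this lesson: agents/s09_plan_mode.py, 303 lines (0 lines added; existing code was refactored)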

Try It

cd public/code
python agents/s09_plan_mode.py "Refactor my project: first create a plan"

Prompts to try:

"Refactor my project: first create a plan"
"Help me design a REST API; list the steps first"
Observe how the agent uses TodoWrite to track task progress

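Distance to Production

Planning as a Tool Call: OpenClaw's Insight

In OpenClaw's tool catalog, TodoWrite sits alongside read_file and bash; it is just an ordinary tool. This reveals an important design choice: planning is not a separate "mode" but a tool call. The agent does not need to switch into a "planning mode" and then back to an "execution mode"; it can call TodoWrite at any point during execution to update the plan. This is more flexible than explicit mode switching.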
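A small sketch may make the "no mode switch" point clearer. It assumes a single registry that maps tool names to functions; `TOOLS`, `dispatch`, and the stub tools are hypothetical names for illustration, not OpenClaw's actual registry. The scripted call list stands in for the sequence of tool calls a model might emit.

```python
from typing import Callable

todos: dict[str, str] = {}  # todo id -> status


def todo_write(todo_id: str, status: str) -> str:
    """Record or update the status of one todo."""
    todos[todo_id] = status
    return f"todo {todo_id} -> {status}"


def read_file(path: str) -> str:
    """Stand-in that does not touch the file system."""
    return f"(contents of {path})"


# One registry: the planning tool is registered exactly like any other tool.
TOOLS: dict[str, Callable[..., str]] = {
    "todo_write": todo_write,
    "read_file": read_file,
}


def dispatch(name: str, /, **kwargs) -> str:
    """Look the tool up by name and call it; there is no planning mode to enter."""
    return TOOLS[name](**kwargs)


# Scripted calls standing in for model output: planning and work interleave.
calls = [
    ("todo_write", {"todo_id": "1", "status": "in_progress"}),
    ("read_file", {"path": "app.py"}),
    ("todo_write", {"todo_id": "1", "status": "completed"}),
]
log = [dispatch(name, **args) for name, args in calls]
print(log)
```

The dispatcher holds no state about which phase the agent is in, which is what lets plan updates appear anywhere in the call sequence.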

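Evaluator-Optimizer: Anthropic's Iterative Pattern

The Evaluator-Optimizer pattern that Anthropic describes in Building Effective Agents is the theoretical basis for Plan Mode: Plan → Execute → Evaluate → Revise. The key is not "planning first" in itself but evaluation and iteration: after execution, check whether the result matches expectations, and if it does not, revise the plan and try again. Our implementation lacks the Evaluate and Revise steps.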
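One way the missing Evaluate and Revise steps could be added is a bounded loop. This is a sketch under stated assumptions: `plan_execute_evaluate`, `Evaluation`, and the lambda stand-ins are hypothetical, and a real agent would back `execute`, `evaluate`, and `revise` with model calls. The `max_rounds` cap is an added safeguard so the loop cannot run forever when the evaluator never passes.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Evaluation:
    ok: bool
    feedback: str = ""


def plan_execute_evaluate(
    plan: list[str],
    execute: Callable[[list[str]], str],
    evaluate: Callable[[str], Evaluation],
    revise: Callable[[list[str], str], list[str]],
    max_rounds: int = 3,
) -> tuple[str, int]:
    """Execute the plan, evaluate the result, and revise until it passes
    or max_rounds is reached. Returns the last result and the rounds used."""
    result = ""
    for round_no in range(1, max_rounds + 1):
        result = execute(plan)
        verdict = evaluate(result)
        if verdict.ok:
            return result, round_no
        plan = revise(plan, verdict.feedback)  # feed the critique back into the plan
    return result, max_rounds


# Stand-ins: the evaluator rejects any result that does not mention tests.
result, rounds = plan_execute_evaluate(
    plan=["write code"],
    execute=lambda p: " + ".join(p),
    evaluate=lambda r: Evaluation("add tests" in r, "tests are missing"),
    revise=lambda p, feedback: p + ["add tests"],
)
print(result, rounds)
```

In this run the first round fails evaluation, the plan gains an "add tests" step, and the second round passes.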

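Other Gaps

No user approval flow: a production system shows the plan and asks the user to confirm before executing
No plan persistence: the plan is lost on restart, whereas a production system stores the plan in the session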
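The persistence gap can be illustrated with a small save/load pair. The text above says production systems store the plan in the session; the JSON file used here is only an assumed stand-in for that storage, and `save_plan` and `load_plan` are hypothetical helper names.

```python
import json
import tempfile
from pathlib import Path


def save_plan(path: Path, todos: list[dict]) -> None:
    """Write the todo list to disk as JSON."""
    path.write_text(json.dumps(todos, ensure_ascii=False, indent=2), encoding="utf-8")


def load_plan(path: Path) -> list[dict]:
    """Read the todo list back; a missing file means there is no saved plan."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    plan_file = Path(tmp) / "plan.json"
    before = [{"id": "1", "content": "Migrate schema", "status": "in_progress"}]
    save_plan(plan_file, before)
    restored = load_plan(plan_file)               # what a restarted process would read
    missing = load_plan(Path(tmp) / "none.json")  # no saved plan yet

print(restored == before, missing)
```

Saving after every `todo_write` call would keep the file in step with the in-memory plan, at the cost of one small write per update.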

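First-Principles Thinking

Is "think first, then act" really better? For a simple task ("rename this variable"), planning just wastes time and tokens. Plan mode pays off on complex tasks: when mistakes are costly (refactoring a large module) or when user approval is needed (changing a configuration file). This points to an important design choice: the agent needs to judge for itself when to enable plan mode. OpenClaw mentions TodoWrite repeatedly in its system prompt (three times) but never forces its use, leaving the model to decide based on task complexity. This is more reasonable than hard-coding "every task must be planned first".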
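As a rough illustration of "mention repeatedly, never force", the sketch below assembles a system prompt that names the planning tool three times using permissive wording. The prompt text and `build_system_prompt` are invented for this example and are not OpenClaw's actual system prompt.

```python
def build_system_prompt(tool_names: list[str]) -> str:
    """Assemble a prompt that suggests planning without making it mandatory."""
    sections = [
        # Mention 1: the tool appears in the tool list.
        "You are a coding agent. Available tools: " + ", ".join(tool_names) + ".",
        # Mention 2: a suggestion to plan multi-step work.
        "For multi-step work, consider calling todo_write to record a plan first.",
        # Mention 3: a reminder to keep the plan current.
        "While working, you may call todo_write to update task status.",
        # Explicit escape hatch for simple tasks.
        "Trivial requests can be handled directly without a plan.",
    ]
    return "\n".join(sections)


prompt = build_system_prompt(["read_file", "bash", "todo_write"])
print(prompt.count("todo_write"))
```

The wording uses "consider" and "may" rather than a hard requirement, which leaves the decision about whether a task is complex enough to plan with the model.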