Traversal-as-Policy: Log-Distilled Gated Behavior Trees as Externalized, Verifiable Policies for Safe, Robust, and Efficient Agents

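📄 Chinese Summary (translated)

The long-horizon policies of autonomous large language model (LLM) agents remain implicit, so safety during task execution is often retrofitted after the fact. This work proposes Traversal-as-Policy, a method that distills OpenHands execution logs collected in a sandbox into a single executable Gated Behavior Tree (GBT). Whenever a task is within coverage, traversal of the tree, rather than unconstrained generation, serves as the control policy. Each node encodes a state-conditioned action macro that is mined from successful trajectories and merge-checked. Macros associated with unsafe trajectories attach deterministic pre-execution gates over structured tool context and bounded history; these gates are updated on the basis of experience so that previously rejected unsafe contexts cannot be re-admitted. At runtime the method provides a lightweight control mechanism that improves agent safety and efficiency.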

📄 English Summary

Traversal-as-Policy: Log-Distilled Gated Behavior Trees as Externalized, Verifiable Policies for Safe, Robust, and Efficient Agents

Autonomous large language model (LLM) agents often fail because their long-horizon policies remain implicit in model weights and transcripts, so safety measures end up being applied retroactively. The proposed method, Traversal-as-Policy, distills sandboxed OpenHands execution logs into a single executable Gated Behavior Tree (GBT). Whenever a task is within coverage, tree traversal, rather than unconstrained generation, is treated as the control policy. Each node encodes a state-conditioned action macro mined and merge-checked from successful trajectories. Macros linked to unsafe traces attach deterministic pre-execution gates over structured tool context and bounded history. These gates are updated under experience-grounded monotonicity, so previously rejected unsafe contexts cannot be re-admitted. At runtime the approach provides a lightweight control mechanism that improves the safety and efficiency of agents.
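The gating idea in the summary can be illustrated with a small Python sketch. This is not the paper's implementation. The names `Gate`, `MacroNode`, `traverse`, `HISTORY_BOUND`, and the demo tree are all hypothetical, and the gate is reduced to a grow-only set of rejected contexts, which is only one simple way to obtain the "cannot be re-admitted" property.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple

# Hypothetical context key: (tool name, arguments, bounded tool history).
Context = Tuple[str, str, Tuple[str, ...]]

HISTORY_BOUND = 2  # assumed bound on how much history a gate may inspect


@dataclass
class Gate:
    """Deterministic pre-execution check whose reject set only grows."""
    rejected: Set[Context] = field(default_factory=set)

    def reject(self, ctx: Context) -> None:
        # Monotone update: contexts are added, never removed.
        self.rejected.add(ctx)

    def admits(self, ctx: Context) -> bool:
        return ctx not in self.rejected


@dataclass
class MacroNode:
    """A state-conditioned action macro, optionally guarded by a gate."""
    name: str
    precondition: Callable[[dict], bool]
    tool: str
    args: str
    gate: Optional[Gate] = None
    children: List["MacroNode"] = field(default_factory=list)


def traverse(node: MacroNode, state: dict,
             history: List[str], trace: List[str]) -> None:
    """Depth-first traversal; the tree, not free generation, picks actions."""
    if not node.precondition(state):
        return
    ctx = (node.tool, node.args, tuple(history[-HISTORY_BOUND:]))
    if node.gate is not None and not node.gate.admits(ctx):
        trace.append(f"BLOCKED:{node.name}")  # gate fires before execution
        return
    trace.append(node.name)       # stand-in for actually running the macro
    history.append(node.tool)
    for child in node.children:
        traverse(child, state, history, trace)


def build_demo_tree() -> Tuple[MacroNode, Gate]:
    """Build a toy tree in which one macro is tied to an unsafe trace."""
    gate = Gate()
    gate.reject(("shell", "rm -rf build", ("git",)))
    always = lambda state: True
    root = MacroNode("open_repo", always, "git", "clone", children=[
        MacroNode("clean", always, "shell", "rm -rf build", gate=gate),
        MacroNode("run_tests", always, "shell", "pytest"),
    ])
    return root, gate


root, gate = build_demo_tree()
trace: List[str] = []
traverse(root, {}, [], trace)
print(trace)  # ['open_repo', 'BLOCKED:clean', 'run_tests']
```

Because `Gate` exposes no way to remove a context, later updates can only tighten what a macro is allowed to do, which mirrors the monotonicity claim at a toy scale.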
