AI Planning Framework for LLM-Based Web Agents

Source: AI Planning Framework for LLM-Based Web Agents

Published: March 16, 2026

📄 Summary

Building autonomous agents for web-based tasks is a core challenge in artificial intelligence. Large Language Model (LLM) agents can interpret complex user requests, but they often operate as black boxes, which makes it hard to diagnose why they fail or to understand how they plan. This research formally treats web tasks as sequential decision-making processes and introduces a taxonomy that maps modern agent architectures to traditional planning paradigms: Step-by-Step agents correspond to Breadth-First Search (BFS), Tree Search agents to Best-First Tree Search, and Full-Plan-in-Advance agents to Depth-First Search (DFS). The framework enables principled diagnosis of system failures, such as context drift and incoherent task decomposition.
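The three classical search strategies named in the taxonomy differ mainly in how they order the frontier of unexplored states. The sketch below illustrates that difference on a toy web task. It is not the paper's agent code: the page graph `GRAPH`, the heuristic table `HEURISTIC`, and the `search` function are all hypothetical names invented for illustration.

```python
import heapq
from collections import deque

# Hypothetical toy task: pages are states, links are actions.
GRAPH = {
    "home": ["search", "account"],
    "search": ["results"],
    "account": ["orders"],
    "results": ["product"],
    "orders": [],
    "product": ["checkout"],
    "checkout": [],
}

# Hypothetical heuristic: estimated number of clicks left to reach "checkout".
HEURISTIC = {
    "home": 4, "search": 3, "account": 5, "results": 2,
    "orders": 6, "product": 1, "checkout": 0,
}


def search(start, goal, strategy):
    """Return (path, expansion_order) for 'bfs', 'dfs', or 'best_first'."""
    if strategy not in ("bfs", "dfs", "best_first"):
        raise ValueError(f"unknown strategy: {strategy}")

    counter = 0  # tie-breaker so the heap never compares paths directly
    if strategy == "best_first":
        frontier = [(HEURISTIC[start], counter, [start])]
    else:
        frontier = deque([[start]])

    visited = set()
    expanded = []
    while frontier:
        if strategy == "bfs":
            path = frontier.popleft()              # FIFO: oldest path first
        elif strategy == "dfs":
            path = frontier.pop()                  # LIFO: newest path first
        else:
            _, _, path = heapq.heappop(frontier)   # lowest heuristic first

        state = path[-1]
        if state in visited:
            continue
        visited.add(state)
        expanded.append(state)
        if state == goal:
            return path, expanded

        for nxt in GRAPH[state]:
            if nxt in visited:
                continue
            new_path = path + [nxt]
            if strategy == "best_first":
                counter += 1
                heapq.heappush(frontier, (HEURISTIC[nxt], counter, new_path))
            else:
                frontier.append(new_path)
    return None, expanded


if __name__ == "__main__":
    for name in ("bfs", "dfs", "best_first"):
        path, order = search("home", "checkout", name)
        print(name, "path:", path, "expanded:", order)
```

On this toy graph all three strategies find the same path, but they expand states in different orders: BFS explores level by level, DFS commits to one branch until it dead-ends, and best-first follows the heuristic and never visits the `account` branch.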
