GroundedPlanBench: Spatially grounded long-horizon task planning for robot manipulation
📄 Summary
The study introduces GroundedPlanBench, a new benchmark framework for long-horizon task planning in robot manipulation. Existing vision-language models (VLMs) typically split decision-making into two steps when generating robot action plans: first producing a natural-language plan, then translating it into executable actions. In practice, this separation often causes the resulting decisions to fail. GroundedPlanBench incorporates spatial information into planning, which strengthens VLMs' task planning in complex environments, lets them handle long-horizon tasks more effectively, and improves the efficiency and accuracy of robots operating in dynamic scenes. The framework offers an important reference standard for future research.
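To make the contrast concrete, the sketch below shows one way a language-only plan might differ from a spatially grounded one. It is a minimal illustration under stated assumptions, not GroundedPlanBench's actual data format or API: the `PlanStep` class, its fields, the normalized image coordinates, and both helper functions are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# All names below are hypothetical illustrations, not GroundedPlanBench's API.


@dataclass
class PlanStep:
    action: str                                  # e.g. "grasp" or "place"
    target: str                                  # object described in language
    point: Optional[Tuple[float, float]] = None  # assumed normalized image (x, y)


def ungrounded_steps(plan: List[PlanStep]) -> List[int]:
    """Return the indices of steps that carry no spatial location."""
    return [i for i, step in enumerate(plan) if step.point is None]


def is_executable(plan: List[PlanStep]) -> bool:
    """Treat a plan as executable only if every step has a point
    that lies inside the normalized image bounds [0, 1] x [0, 1]."""
    for step in plan:
        if step.point is None:
            return False
        x, y = step.point
        if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
            return False
    return True


# Two-step style: a language plan only; grounding is left to a later stage.
language_only = [
    PlanStep("grasp", "red mug"),
    PlanStep("place", "top shelf"),
]

# Grounded style: each step carries its own image location.
grounded = [
    PlanStep("grasp", "red mug", (0.42, 0.61)),
    PlanStep("place", "top shelf", (0.80, 0.15)),
]

print(ungrounded_steps(language_only))  # [0, 1]
print(is_executable(grounded))          # True
```

In this toy representation, a language-only plan cannot be executed until a second stage fills in every `point`, which is where the two-step approach described above can break down.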