AsgardBench: A Benchmark for Visually Grounded Interactive Planning

📄 English Summary

AsgardBench: A benchmark for visually grounded interactive planning

AsgardBench is a benchmark for visually grounded interactive planning. It evaluates how well a robot makes decisions and adapts its plan in complex environments. In a kitchen-cleaning task, for example, the robot must observe its surroundings, determine whether the target item actually needs cleaning, and adjust its actions when something unexpected comes up, such as the item already being clean or the sink being occupied by other items. The benchmark gives researchers a standardized platform for testing and comparing robotic systems in dynamic environments, with the goal of advancing embodied AI.
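The kitchen-cleaning example can be illustrated with a minimal Python sketch of observation-conditioned planning. This is not AsgardBench's API: the `Observation` fields, the `plan_next_actions` function, and the action strings are all hypothetical names chosen for illustration, and a real agent would derive these facts from images rather than receive them as booleans.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    # Hypothetical, simplified view of what the robot perceives.
    target_is_dirty: bool
    sink_items: list = field(default_factory=list)


def plan_next_actions(obs, target="mug"):
    """Return a list of action strings conditioned on the current observation."""
    # If the target is already clean, no cleaning plan is needed.
    if not obs.target_is_dirty:
        return []
    # Clear anything that occupies the sink before washing.
    actions = [f"remove {item} from sink" for item in obs.sink_items]
    actions += [
        f"put {target} in sink",
        "turn on faucet",
        "turn off faucet",
        f"pick up {target}",
    ]
    return actions


if __name__ == "__main__":
    print(plan_next_actions(Observation(target_is_dirty=False)))
    print(plan_next_actions(Observation(target_is_dirty=True, sink_items=["plate"])))
```

The point of the sketch is that the plan is recomputed from what the robot sees instead of being fixed in advance; in an interactive setting, the function would be called again after each new observation.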
