📄 English Summary
Contextual Graph Representations for Task-Driven 3D Perception and Planning
Recent advances in computer vision enable the fully automatic extraction of object-centric relational representations from visual-inertial data. These state representations, referred to as 3D scene graphs, provide a hierarchical decomposition of real-world scenes with a dense multiplex graph structure. While 3D scene graphs are claimed to facilitate efficient task planning for robotic systems, they often contain numerous objects and relations, of which only a small subset is relevant to any specific task. This enlarges the state space that task planners must search, hindering deployment in resource-constrained environments. This work evaluates the suitability of existing embodied AI environments for investigating the intersection of robotic task planning and 3D scene graphs.
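The summary's core tension, a dense multiplex scene graph versus the small task-relevant subset a planner actually needs, can be sketched with a toy data structure. This is a minimal illustration under assumed names (`Node`, `SceneGraph`, `task_relevant_subgraph`), not the representation used by any particular scene-graph system:

```python
from dataclasses import dataclass, field

# Illustrative sketch: a layered ("multiplex") scene graph with a
# room -> object hierarchy, and task-driven pruning of that graph.
# All class and method names here are hypothetical.

@dataclass
class Node:
    node_id: str
    layer: str                    # e.g. "room" or "object"
    attributes: dict = field(default_factory=dict)

@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)   # node_id -> Node
    edges: list = field(default_factory=list)   # (src, dst, relation)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, src: str, dst: str, relation: str) -> None:
        self.edges.append((src, dst, relation))

    def task_relevant_subgraph(self, relevant_ids: set) -> "SceneGraph":
        """Keep only the nodes a specific task needs, shrinking the
        state space a planner must search over."""
        sub = SceneGraph()
        for nid in relevant_ids:
            if nid in self.nodes:
                sub.add_node(self.nodes[nid])
        for src, dst, rel in self.edges:
            if src in sub.nodes and dst in sub.nodes:
                sub.add_edge(src, dst, rel)
        return sub

# Toy graph: one room containing two objects.
g = SceneGraph()
g.add_node(Node("kitchen", "room"))
g.add_node(Node("cup", "object"))
g.add_node(Node("fridge", "object"))
g.add_edge("kitchen", "cup", "contains")
g.add_edge("kitchen", "fridge", "contains")

# A "fetch the cup" task needs only the room and the cup.
sub = g.task_relevant_subgraph({"kitchen", "cup"})
print(len(sub.nodes), len(sub.edges))  # 2 1
```

Even in this toy case, pruning halves the object set and drops an irrelevant relation; on real scene graphs with hundreds of nodes, the same idea motivates the resource-constrained planning question the work investigates.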