📄 Summary
BeSafe-Bench: Unveiling Behavioral Safety Risks of Situated Agents in Functional Environments
The rapid evolution of Large Multimodal Models (LMMs) has enabled agents to perform complex digital and physical tasks. However, deploying them as autonomous decision-makers introduces significant unintentional behavioral safety risks. The lack of a comprehensive safety benchmark remains a major bottleneck, as existing evaluations rely on low-fidelity environments, simulated APIs, or narrowly scoped tasks. To address this gap, BeSafe-Bench (BSB) is proposed as a benchmark for exposing the behavioral safety risks of situated agents in functional environments, covering four representative domains: Web, Mobile, Embodied VLM, and Embodied VLA. Building on these functional environments, a diverse instruction space is constructed by augmenting the tasks with nine kinds of complexity.