📄 English Summary
MERIT Feedback Elicits Better Bargaining in LLM Negotiators
Bargaining is often perceived as an arena of logic rather than of art or intuition. However, Large Language Models (LLMs) struggle in this domain due to limited strategic depth and difficulty adapting to complex human factors. Current benchmarks rarely capture these limitations. To address this gap, a utility-feedback-centric framework is proposed. Key contributions include AgoraBench, a new benchmark encompassing nine challenging settings (e.g., deception, monopoly) that supports diverse strategy modeling, and human-aligned, economically grounded metrics derived from utility theory. These metrics are operationalized through agent utility, negotiation power, and acquisition ratio, which implicitly measure how well negotiation outcomes align with human preferences.
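To make the three metric names concrete, here is a minimal illustrative sketch of how utility-based negotiation metrics of this kind are commonly computed in a simple price-bargaining setting. All function names and formulas below are assumptions for illustration; the actual AgoraBench definitions may differ.

```python
# Hypothetical utility-theoretic negotiation metrics (illustrative only;
# the actual AgoraBench formulations may differ).

def agent_utility(deal_price: float, reservation_price: float, is_buyer: bool) -> float:
    """Surplus an agent captures relative to its reservation (walk-away) price."""
    if is_buyer:
        return reservation_price - deal_price  # buyer gains by paying less
    return deal_price - reservation_price      # seller gains by selling higher

def negotiation_power(my_utility: float, opponent_utility: float) -> float:
    """Share of the total surplus captured by the agent (0.5 = even split)."""
    total = my_utility + opponent_utility
    return my_utility / total if total else 0.5

def acquisition_ratio(deals_closed: int, negotiations_attempted: int) -> float:
    """Fraction of attempted negotiations that end in an agreement."""
    return deals_closed / negotiations_attempted if negotiations_attempted else 0.0

# Example: buyer walks away above 100, seller below 60, they settle at 80.
buyer_u = agent_utility(80, 100, is_buyer=True)    # 20
seller_u = agent_utility(80, 60, is_buyer=False)   # 20
power = negotiation_power(buyer_u, seller_u)       # 0.5, an even split
```

In this sketch, a negotiation power above 0.5 would indicate the agent captured more than half of the available surplus, which is one plausible way such a metric could be read.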