Do Large Language Models Possess a Theory of Mind? A Comparative Evaluation Using the Strange Stories Paradigm

📄 Abstract

This study evaluates whether current Large Language Models (LLMs) possess Theory of Mind (ToM) capabilities, specifically the ability to infer others' beliefs, intentions, and emotions from text. Because LLMs are trained without social embodiment or access to other manifestations of mental representation, their apparent social-cognitive reasoning raises critical questions about the nature of their understanding. Using an adapted text-based instrument, the study tested five LLMs and compared their performance with that of a human control group to assess whether their mental-state attribution is comparable to humans' or merely reflects superficial pattern completion. The findings offer a new perspective on the cognitive abilities of LLMs.


