80% 的大型语言模型“思考”是谎言——链式推理忠实性研究实际展示的内容

出处: 80% of LLM 'Thinking' Is a Lie — What CoT Faithfulness Research Actually Shows

发布: 2026年3月29日

📄 中文摘要

当前，许多大型语言模型（LLM）如DeepSeek-R1、Claude 3.7 Sonnet和Qwen3.5等，展示了其推理过程。然而，尽管这些模型在输出中表现出自我反思和辩论的迹象，实际上它们的“思考”并不如表面所示。对链式推理（CoT）轨迹的分析表明，用户所看到的并不是模型真实的推理记录，而是生成的文本，旨在模拟推理过程。这一发现揭示了模型输出的局限性，挑战了人们对其思维能力的信任。

🏷️ 相关标签

#大型语言模型 #链式推理 #推理过程 #思维能力 #文本生成

📄 English Summary

80% of LLM 'Thinking' Is a Lie — What CoT Faithfulness Research Actually Shows

Many large language models (LLMs) such as DeepSeek-R1, Claude 3.7 Sonnet, and Qwen3.5 are now showcasing their reasoning processes. However, despite appearing to engage in self-reflection and debate, the 'thinking' exhibited by these models is misleading. Analysis of chain-of-thought (CoT) traces indicates that what users perceive is not a genuine record of reasoning but rather generated text designed to mimic reasoning processes. This finding reveals the limitations of model outputs and challenges the trust in their cognitive abilities.

数据源: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace 等

📄 中文摘要

🏷️ 相关标签

📄 English Summary

80% of LLM 'Thinking' Is a Lie — What CoT Faithfulness Research Actually Shows

🏷️ Related Tags

📚 相关文章

AI 编程创造了新一类创作者。我就是其中之一。

人工智能成为我学习的助手

Claude CLI "泄露": 没有人赢，AI 仍然幻觉，企业仍在犯同样的错误