📄 Summary (translated from Chinese)
Large language models (LLMs) have moved beyond simple text continuation: they can write code, analyze data, plan actions, and hold meaningful conversations. In many tasks their reasoning appears surprisingly structured, yet this is rarely stated directly: the structure of reasoning in LLMs is not part of the model architecture but a byproduct of large-scale training and in-context learning. Between the neural core and the agent/tool layer sits what can be interpreted as an intermediate reasoning layer: implicit, informal, and emerging from transformer behavior. Although this layer works well, it is uncontrolled and offers no guarantee of reproducibility. It is worth acknowledging that this layer exists.
📄 English Summary
Breaking the Black Box: Why LLMs May Need an Explicit Reasoning Layer
Large language models (LLMs) have evolved beyond simple text continuation: they can now write code, analyze data, plan actions, and engage in meaningful conversations. Their reasoning often appears surprisingly structured, yet an important detail is seldom mentioned: the structure of reasoning in LLMs is not an inherent part of the model's architecture but emerges as a byproduct of large-scale training and in-context learning. Between the neural core and the agent/tool layer sits an intermediate reasoning layer that is implicit, unformalized, and arises from transformer behavior. While this layer functions effectively, it is neither controlled nor guaranteed to be reproducible. Recognizing that this layer exists is an essential first step.
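To make the argument concrete, here is a minimal, purely hypothetical sketch of what an explicit reasoning layer between the neural core and the agent/tool layer might look like. All names (`ReasoningStep`, `ReasoningTrace`, the `checker` callback) are illustrative assumptions, not an API from the source: the point is only that each intermediate step becomes an inspectable, externally verifiable record rather than an implicit byproduct of transformer behavior.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: an explicit, inspectable reasoning layer sitting
# between the neural core (raw model calls) and the agent/tool layer.
# Every step is recorded, so the chain of reasoning can be audited and
# replayed instead of remaining implicit in the model's activations.

@dataclass
class ReasoningStep:
    claim: str          # intermediate conclusion proposed by the model
    support: list[str]  # evidence or prior claims this step depends on
    verified: bool = False

@dataclass
class ReasoningTrace:
    steps: list[ReasoningStep] = field(default_factory=list)

    def add(self, claim: str, support: list[str]) -> ReasoningStep:
        step = ReasoningStep(claim, support)
        self.steps.append(step)
        return step

    def verify(self, checker) -> bool:
        # Apply an external checker to each step rather than trusting
        # the model's implicit chain of thought.
        for step in self.steps:
            step.verified = checker(step)
        return all(s.verified for s in self.steps)

# The agent/tool layer would act only on traces that pass verification.
trace = ReasoningTrace()
trace.add("The input is valid JSON", ["schema check passed"])
trace.add("Field 'id' is unique", ["The input is valid JSON", "db lookup"])
ok = trace.verify(lambda step: len(step.support) > 0)
print(ok)  # True: every step cites at least one piece of support
```

The design choice this sketch highlights is separation of concerns: the neural core proposes steps, the reasoning layer records and validates them, and only then does the agent layer act, which is exactly the controllability and reproducibility the summary argues today's implicit layer lacks.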
Powered by Cloudflare Workers + Payload CMS + Claude 3.5
Data sources: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace, and others