Context Structure Reshapes the Representational Geometry of Language Models

📄 Abstract

Large language models (LLMs) organize the representations of input sequences into straighter neural trajectories in their deeper layers, a phenomenon hypothesized to aid next-token prediction via linear extrapolation. Language models can also adapt to a variety of tasks and learn new structure in context, and recent work shows that this in-context learning (ICL) can be reflected in representational changes. This study investigates how context structure affects the representational geometry of language models. Analyzing changes in the model's internal representations under different context settings shows that how a context is organized, for example the order in which information is presented and whether related information is clustered or contrasted, can significantly change the linearity and separability of neural trajectories. Specifically, when the context provides explicit structural guidance, the model tends to form more compact, more easily distinguishable representation clusters, which may improve its ability to generalize to new inputs.

📄 Summary

Large Language Models (LLMs) organize input-sequence representations into straighter neural trajectories in their deep layers, a phenomenon hypothesized to facilitate next-token prediction via linear extrapolation. Language models also adapt to diverse tasks and learn new structure in context, and recent work shows that this in-context learning (ICL) is reflected in representational changes. This research explores how context structure influences the representational geometry of language models. Analyzing the model's internal representations under various contextual settings shows that the organization of the context, such as the order in which information is presented and whether related information is clustered or contrasted, can significantly alter the linearity and separability of neural trajectories.

Specifically, when the context provides clear structural guidance, models tend to form more compact and more easily distinguishable representation clusters, potentially enhancing generalization to new inputs. Certain types of context structure are also found to reduce the effective dimensionality of the representation space, allowing the model to process complex information more efficiently. These findings offer a new perspective on the internal workings of LLMs and a basis for designing more effective and robust in-context learning strategies: by quantitatively tracking changes in representational geometry, prompt-engineering techniques can be refined to better steer the model's learning and reasoning.
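To make the geometric claims concrete, below is a minimal sketch of how trajectory straightness can be measured per layer. It assumes a HuggingFace-style causal LM (the model name `gpt2` is a placeholder) and uses one common curvature metric, the mean turning angle between successive hidden-state difference vectors; this illustrates the general technique, not necessarily the paper's exact procedure.

```python
# Sketch: per-layer curvature of a prompt's neural trajectory.
# Assumptions: a HuggingFace causal LM (model name is a placeholder) and a
# common straightness metric, the mean angle between consecutive
# difference vectors of the token-wise hidden states.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # placeholder; any causal LM exposing hidden states works

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True)
model.eval()

def layer_curvatures(text: str) -> list[float]:
    """Mean turning angle (radians) of the hidden-state trajectory, per layer.
    Lower curvature = straighter trajectory."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**ids).hidden_states  # tuple of (layers+1) x [1, T, D]
    curvatures = []
    for h in hidden:
        v = h[0].diff(dim=0)                           # differences, [T-1, D]
        v = v / (v.norm(dim=-1, keepdim=True) + 1e-8)  # unit vectors
        cos = (v[:-1] * v[1:]).sum(-1).clamp(-1.0, 1.0)
        curvatures.append(torch.arccos(cos).mean().item())
    return curvatures

# Illustrative use: a repetitive, structured context.
print(layer_curvatures("cat dog bird; cat dog bird; cat dog bird"))
```

Comparing the per-layer curvature profile of a structured context against a shuffled one is the kind of analysis the summary describes: a straighter (lower-curvature) trajectory in deep layers supports the linear-extrapolation hypothesis.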

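The compactness, separability, and dimensionality claims are also quantifiable. The sketch below, assuming hidden states collected as above, uses two off-the-shelf measures: the participation ratio of the covariance eigenspectrum for effective dimensionality, and the silhouette score for cluster separability. Both metric choices are assumptions made for illustration, not confirmed details of the study.

```python
# Sketch: effective dimensionality (participation ratio) and cluster
# separability (silhouette score) of token representations. Metric choices
# are illustrative; the paper's exact measures may differ.
import numpy as np
from sklearn.metrics import silhouette_score

def participation_ratio(X: np.ndarray) -> float:
    """Effective dimensionality of representations X (tokens x dims):
    (sum of covariance eigenvalues)^2 / sum of squared eigenvalues."""
    lam = np.linalg.eigvalsh(np.cov(X, rowvar=False))
    lam = np.clip(lam, 0.0, None)  # guard against tiny negative eigenvalues
    return lam.sum() ** 2 / (np.square(lam).sum() + 1e-12)

def separability(X: np.ndarray, labels: np.ndarray) -> float:
    """Silhouette score in [-1, 1]; higher = tighter, better-separated clusters."""
    return silhouette_score(X, labels)

# Synthetic data standing in for one layer's hidden states:
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))
X[:100] += 3.0                    # two artificial "context-induced" clusters
labels = np.repeat([0, 1], 100)
print(participation_ratio(X), separability(X, labels))
```

A lower participation ratio together with a higher silhouette score would correspond to the reported effect of structured contexts: lower-dimensional, more compact, better-separated representations.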