Latent Semantic Manifolds in Large Language Models

📄 Summary

Large Language Models (LLMs) perform their internal computation in continuous vector spaces but must ultimately emit discrete tokens, and the geometric consequences of this fundamental mismatch are not well understood. The paper develops a mathematical framework that interprets an LLM's hidden states as points on a latent semantic manifold: a Riemannian submanifold equipped with the Fisher information metric, on which tokens correspond to Voronoi regions partitioning the manifold. It defines the expressibility gap, a geometric measure of the semantic distortion induced by vocabulary discretization, and proves two theorems: a rate-distortion lower bound on the distortion achievable by any finite vocabulary, and a linear volume scaling law for the expressibility gap derived via the coarea formula. Empirical results are reported that support the theoretical framework.
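The token-as-Voronoi-cell picture can be illustrated with a toy quantization sketch. This is a minimal illustration, not the paper's method: it assumes a plain Euclidean metric rather than the Fisher information metric, and all names and data below are hypothetical. Each continuous hidden state is snapped to its nearest token embedding, and the average snapping error plays the role of the expressibility gap.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a vocabulary of V token embeddings in d dimensions.
V, d = 50, 8
token_embeddings = rng.normal(size=(V, d))

# Continuous hidden states "produced by a model" (here: random points).
hidden_states = rng.normal(size=(1000, d))

def quantize(h, E):
    """Map each hidden state to the index of its nearest token embedding.

    Under the Euclidean metric, the set of states mapped to token i is
    exactly the Voronoi cell of E[i] -- the discrete token stands in for
    that whole region of the continuous space.
    """
    # Squared distances between every state and every embedding.
    d2 = ((h[:, None, :] - E[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

ids = quantize(hidden_states, token_embeddings)

# Mean distortion introduced by discretization: a crude Euclidean
# stand-in for the paper's expressibility gap (which is defined with
# respect to the Fisher geometry of the manifold).
gap = np.linalg.norm(hidden_states - token_embeddings[ids], axis=1).mean()
print(f"mean quantization distortion: {gap:.3f}")
```

Enlarging the vocabulary shrinks the Voronoi cells and hence this distortion, which is the intuition behind a vocabulary-size lower bound on achievable distortion.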


Data sources: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace, and others