Retrieval Heads are Dynamic

Source: Retrieval Heads are Dynamic

Published: February 13, 2026

📄 English Summary

Recent studies have identified "retrieval heads" in large language models (LLMs): attention heads responsible for extracting information from the input context. Previous work relied primarily on static statistics aggregated across datasets, which identify the heads that perform retrieval on average. That perspective overlooks the fine-grained temporal dynamics of autoregressive generation. Through extensive analysis, the paper establishes three claims: (1) Dynamism: the set of retrieval heads changes from one timestep to the next; (2) Irreplaceability: the retrieval heads active at a given timestep are specific to that timestep and cannot be effectively replaced by static retrieval heads; and (3) Correlation: the model's hidden state is closely correlated with these dynamic changes in retrieval heads.
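The contrast between static and dynamic head selection can be illustrated with a toy sketch. The summary does not say how the paper scores heads, so this is an assumption, not the authors' method: each head is scored by the attention weight it places on a known target position in the context, static heads are the top-k by score averaged over all steps, and dynamic heads are the top-k at each step. The function names (`retrieval_scores`, `static_heads`, `dynamic_heads`) and the attention values are hypothetical.

```python
def retrieval_scores(attn, targets):
    """attn[t][h][s] is the attention weight of head h at decoding step t on
    context position s; targets[t] is the context position that step t should
    retrieve from. Returns scores[t][h], the weight on the target position.
    (Assumed scoring rule, chosen for illustration only.)"""
    return [[head[targets[t]] for head in step] for t, step in enumerate(attn)]


def top_k(values, k):
    """Indices of the k largest values; ties go to the lower index."""
    order = sorted(range(len(values)), key=lambda h: (-values[h], h))
    return set(order[:k])


def dynamic_heads(scores, k):
    """Top-k heads chosen separately at every decoding step."""
    return [top_k(row, k) for row in scores]


def static_heads(scores, k):
    """Top-k heads by score averaged over all decoding steps."""
    n_steps, n_heads = len(scores), len(scores[0])
    mean = [sum(row[h] for row in scores) / n_steps for h in range(n_heads)]
    return top_k(mean, k)


# Made-up attention weights: 3 decoding steps, 3 heads, 2 context positions.
attn = [
    [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]],  # step 0, target position 0
    [[0.9, 0.1], [0.1, 0.9], [0.4, 0.6]],  # step 1, target position 1
    [[0.2, 0.8], [0.3, 0.7], [0.6, 0.4]],  # step 2, target position 0
]
targets = [0, 1, 0]

scores = retrieval_scores(attn, targets)
dynamic = dynamic_heads(scores, k=1)
static = static_heads(scores, k=1)
# Number of steps at which the static choice matches the per-step choice.
agreement = sum(1 for step_heads in dynamic if step_heads == static)

print("dynamic:", dynamic)
print("static:", static)
print("steps where static matches dynamic:", agreement, "of", len(dynamic))
```

In this toy data the head with the best average score is the strongest retriever at only one of the three steps, which is the kind of mismatch that averaging over a dataset hides.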
