Indic-TunedLens: Interpreting Multilingual Models in Indian Languages

Source: Indic-TunedLens: Interpreting Multilingual Models in Indian Languages

Published: February 18, 2026

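📄 Summary (translated from Chinese)

Multilingual large language models (LLMs) are increasingly deployed in linguistically diverse regions such as India, yet most interpretability tools remain English-centric. Prior work has shown that LLMs often operate in an English-centric representation space, which makes cross-lingual interpretability a pressing problem. Indic-TunedLens is a new interpretability framework built specifically for Indian languages, based on learning shared affine transformations. Whereas the standard Logit Lens decodes intermediate activations directly, Indic-TunedLens adjusts the hidden states for each target language so that they align with the target output distribution, which yields a more faithful decoding of the model's representations. The framework was evaluated on 10 Indian languages and reported to be effective.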

🏷️ Related Tags

#MultilingualModels #IndianLanguages #InterpretabilityTools #CrossLingualInterpretability #AffineTransformation

📄 English Summary

Indic-TunedLens: Interpreting Multilingual Models in Indian Languages

Multilingual large language models (LLMs) are increasingly used in linguistically diverse regions such as India, yet most interpretability tools remain focused on English. Prior studies indicate that LLMs often operate within English-centric representation spaces, which makes cross-lingual interpretability an urgent need. Indic-TunedLens is a novel interpretability framework designed specifically for Indian languages that learns shared affine transformations. Unlike the standard Logit Lens, which decodes intermediate activations directly, Indic-TunedLens adjusts hidden states for each target language, aligning them with the target output distribution so that model representations are decoded more faithfully. The framework was evaluated across 10 Indian languages and is reported to be effective.
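
The contrast between direct decoding and decoding through a learned affine map can be illustrated with a small NumPy sketch. This is a toy under stated assumptions, not the paper's implementation: the dimensions, the random unembedding matrix `W_U`, the random hidden states, the helper `fit_translator`, and the KL-based objective are all hypothetical choices made for illustration, and layer normalization is omitted. The summary does not say how the affine transformations are parameterized or trained.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def logit_lens(h, W_U):
    """Decode an intermediate hidden state directly with the unembedding."""
    return softmax(h @ W_U)

def tuned_lens(h, A, b, W_U):
    """Apply an affine map to the hidden state first, then decode."""
    return softmax((h @ A + b) @ W_U)

def kl(p, q):
    """KL(p || q) for two strictly positive distributions."""
    return float(np.sum(p * (np.log(p) - np.log(q))))

def fit_translator(h, p_target, W_U, steps=200, lr=1e-3):
    """Hypothetical training loop: gradient descent on KL(p_target || lens)."""
    d = h.shape[0]
    A, b = np.eye(d), np.zeros(d)          # start from the identity map
    for _ in range(steps):
        q = tuned_lens(h, A, b, W_U)
        g = (q - p_target) @ W_U.T         # gradient w.r.t. the mapped state
        A -= lr * np.outer(h, g)
        b -= lr * g
    return A, b

# Assumed toy setup: random values stand in for a real model.
rng = np.random.default_rng(0)
d_model, vocab = 8, 20
W_U = rng.normal(size=(d_model, vocab))    # stand-in unembedding matrix
h_mid = rng.normal(size=d_model)           # stand-in intermediate hidden state
h_final = rng.normal(size=d_model)         # stand-in final-layer hidden state

p_final = softmax(h_final @ W_U)           # the model's output distribution
p_logit = logit_lens(h_mid, W_U)
p_identity = tuned_lens(h_mid, np.eye(d_model), np.zeros(d_model), W_U)

A, b = fit_translator(h_mid, p_final, W_U)
kl_before = kl(p_final, p_logit)
kl_after = kl(p_final, tuned_lens(h_mid, A, b, W_U))
print(f"KL before: {kl_before:.4f}, KL after: {kl_after:.4f}")
```

With the identity map and zero bias, the affine lens reduces to the Logit Lens, so any improvement in the match to the output distribution comes from the learned transformation. In a per-language setting one could fit such a map on hidden states collected from text in each target language, though whether Indic-TunedLens does exactly this is not stated in the summary.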

