Panini: Continual Learning in Token Space via Structured Memory

📄 English Summary

Panini: Continual Learning in Token Space via Structured Memory

Language models are increasingly applied to reason over content they were not originally trained on, including new documents, evolving knowledge, and user-specific data. A prevalent method is retrieval-augmented generation (RAG), which stores documents externally as chunks and retrieves only relevant subsets at inference time for large language models (LLMs) to reason over. However, this approach leads to inefficient use of test-time compute, as LLMs repeatedly reason over the same documents. Additionally, chunk retrieval can introduce irrelevant context, increasing the risk of unsupported generation. To address these issues, a human-like non-parametric continual learning framework is proposed, where the base model remains fixed and learning occurs by integrating each new experience into an external semantic memory. This framework aims to enhance the model's adaptability to new information and improve reasoning efficiency.
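The paper itself is only summarized here, so its actual mechanism is not specified. As a minimal illustrative sketch of the contrast the summary draws, the toy classes below (`ChunkStore`, `SemanticMemory`, and the bag-of-words similarity are all assumptions for illustration, not the paper's method) show RAG-style chunk retrieval, which re-reads raw text on every query, versus a write-once external memory that consolidates each new experience at ingestion time:

```python
from collections import Counter
import math

def embed(text):
    # Toy "embedding": lowercase bag-of-words counts (illustrative only).
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ChunkStore:
    """RAG-style store: raw document chunks, re-read on every query."""
    def __init__(self):
        self.chunks = []

    def add_document(self, text, chunk_size=8):
        # Split the document into fixed-size word chunks.
        words = text.split()
        for i in range(0, len(words), chunk_size):
            self.chunks.append(" ".join(words[i:i + chunk_size]))

    def retrieve(self, query, k=2):
        # Rank all chunks by similarity to the query; the LLM would then
        # re-reason over these raw chunks at every inference call.
        q = embed(query)
        ranked = sorted(self.chunks,
                        key=lambda c: cosine(q, embed(c)), reverse=True)
        return ranked[:k]

class SemanticMemory:
    """Continual-learning-style store: distilled facts written once at
    ingestion time, so later queries hit consolidated knowledge instead
    of raw text."""
    def __init__(self):
        self.facts = {}

    def integrate(self, key, fact):
        # Each new experience is consolidated under a stable key.
        self.facts[key] = fact

    def lookup(self, key):
        return self.facts.get(key)
```

For example, after `store.add_document(...)`, `store.retrieve("structured memory", k=1)` returns the single most similar raw chunk, while `SemanticMemory.lookup` returns the already-distilled fact with no re-reasoning over source text.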

Powered by Cloudflare Workers + Payload CMS + Claude 3.5

Data sources: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace, etc.