📄 Chinese Summary
Long-context LLMs and Retrieval-Augmented Generation (RAG) systems process information passively: state tracking, contradiction resolution, and evidence aggregation are deferred to query time, which becomes fragile under ultra-long streams with frequent updates. The Unified Memory Agent (UMA) is proposed, an end-to-end reinforcement learning framework that unifies memory operations and question answering in a single policy. UMA maintains a dual memory representation: a compact core summary for global context and a structured Memory Bank supporting explicit CRUD (create, update, delete, reorganize) operations, enabling proactive consolidation of information during streaming. To evaluate long-horizon memory behavior, Ledger-QA is introduced.
📄 English Summary
Learning to Remember: End-to-End Training of Memory Agents for Long-Context Reasoning
Long-context LLMs and Retrieval-Augmented Generation (RAG) systems exhibit passive information processing, deferring state tracking, contradiction resolution, and evidence aggregation to query time, which becomes fragile under ultra-long streams with frequent updates. The Unified Memory Agent (UMA) is proposed as an end-to-end reinforcement learning framework that integrates memory operations and question answering within a single policy. UMA maintains a dual memory representation: a compact core summary for global context and a structured Memory Bank that supports explicit CRUD (create, update, delete, reorganize) operations over key-value entries, enabling proactive consolidation during streaming. To evaluate long-horizon memory behavior, Ledger-QA is introduced.
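The dual memory representation described above can be sketched in code. This is a minimal hypothetical illustration, assuming the Memory Bank is a key-value store exposing the four named CRUD operations; the class, method names, and `reorganize` behavior here are assumptions for illustration, not the paper's actual interface.

```python
# Hypothetical sketch of UMA's dual memory: a compact core summary plus a
# structured key-value Memory Bank with explicit CRUD operations.
from dataclasses import dataclass, field

@dataclass
class DualMemory:
    core_summary: str = ""  # compact global context
    bank: dict = field(default_factory=dict)  # structured Memory Bank

    # CRUD operations the policy could emit while consuming the stream
    def create(self, key: str, value: str) -> None:
        self.bank[key] = value

    def update(self, key: str, value: str) -> None:
        if key in self.bank:  # overwrite stale facts at write time
            self.bank[key] = value

    def delete(self, key: str) -> None:
        self.bank.pop(key, None)

    def reorganize(self) -> None:
        # placeholder for merging/re-keying entries; here just sort keys
        self.bank = dict(sorted(self.bank.items()))

# Proactive consolidation during streaming: contradictions are resolved
# when written, so query time reads consolidated state, not raw history.
mem = DualMemory(core_summary="ledger of account balances")
mem.create("alice_balance", "100")
mem.update("alice_balance", "250")  # later update supersedes the old value
mem.reorganize()
print(mem.bank["alice_balance"])  # -> 250
```

The contrast with RAG is that a retriever would keep both the "100" and "250" statements and have to reconcile them at query time, whereas the agent consolidates state as it streams.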