📄 Chinese Summary
Long-context LLMs and Retrieval-Augmented Generation (RAG) systems process information passively: state tracking, contradiction resolution, and evidence aggregation are deferred to query time, which becomes fragile under ultra-long streams with frequent updates. The Unified Memory Agent (UMA) is proposed, an end-to-end reinforcement learning framework that unifies memory operations and question answering in a single policy. UMA maintains a dual memory representation: a compact core summary for global context and a structured Memory Bank supporting explicit CRUD (create, update, delete, reorganize) operations, enabling proactive consolidation of information during streaming. To evaluate long-horizon memory behavior, Ledger-QA is introduced.
📄 English Summary
Learning to Remember: End-to-End Training of Memory Agents for Long-Context Reasoning
Long-context LLMs and Retrieval-Augmented Generation (RAG) systems exhibit passive information processing, deferring state tracking, contradiction resolution, and evidence aggregation to query time, which becomes fragile under ultra-long streams with frequent updates. The Unified Memory Agent (UMA) is proposed as an end-to-end reinforcement learning framework that integrates memory operations and question answering within a single policy. UMA maintains a dual memory representation: a compact core summary for global context and a structured Memory Bank that supports explicit CRUD (create, update, delete, reorganize) operations over key-value entries, enabling proactive consolidation during streaming. To evaluate long-horizon memory behavior, Ledger-QA is introduced.
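The dual memory representation described above can be sketched in code. This is a minimal hypothetical illustration, assuming the Memory Bank is a key-value store exposing the four named CRUD operations; the class, method names, and `reorganize` behavior here are assumptions for illustration, not the paper's actual interface.

```python
# Hypothetical sketch of UMA's dual memory: a compact core summary plus a
# structured key-value Memory Bank with explicit CRUD operations.
from dataclasses import dataclass, field

@dataclass
class DualMemory:
    core_summary: str = ""  # compact global context
    bank: dict = field(default_factory=dict)  # structured Memory Bank

    # CRUD operations the policy could emit while consuming the stream
    def create(self, key: str, value: str) -> None:
        self.bank[key] = value

    def update(self, key: str, value: str) -> None:
        if key in self.bank:  # overwrite stale facts at write time
            self.bank[key] = value

    def delete(self, key: str) -> None:
        self.bank.pop(key, None)

    def reorganize(self) -> None:
        # placeholder for merging/re-keying entries; here just sort keys
        self.bank = dict(sorted(self.bank.items()))

# Proactive consolidation during streaming: contradictions are resolved
# when written, so query time reads consolidated state, not raw history.
mem = DualMemory(core_summary="ledger of account balances")
mem.create("alice_balance", "100")
mem.update("alice_balance", "250")  # later update supersedes the old value
mem.reorganize()
print(mem.bank["alice_balance"])  # -> 250
```

The contrast with RAG is that a retriever would keep both the "100" and "250" statements and have to reconcile them at query time, whereas the agent consolidates state as it streams.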