Beyond RAG: Building a Recursive Language Model to Process 1 Million Tokens

📄 Summary

When you face a million tokens of text and a model context window of only 128K, the common solutions are RAG (chunking, embedding, and retrieving relevant pieces) or a long-context model (hoping the window is large enough). Both have fundamental trade-offs: RAG loses global context by retrieving only fragments, while long-context models suffer quality degradation as input length grows, the well-known "lost in the middle" problem. A recent arXiv paper proposes a third approach: Recursive Language Models (RLMs). The core idea is to let the LLM program its own access to the document. The author built a working prototype to show how this method can be implemented.
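The recursive mechanic can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the paper's implementation (in an actual RLM the model itself writes code against the stored context); `call_model` here is a hypothetical stand-in that simply greps for the query term, so the recursion can run deterministically:

```python
def call_model(query: str, context: str) -> str:
    # Hypothetical stand-in for a real LLM API call: "answers" by
    # returning only the lines of the context that mention the query.
    return "\n".join(line for line in context.splitlines() if query in line)

def recursive_lm(query: str, context: str, window: int = 1000) -> str:
    """Answer directly if the context fits in the window; otherwise
    split it, recurse on each half, and answer over the merged results."""
    if len(context) <= window:
        return call_model(query, context)
    # Split near the middle, on a newline boundary so lines stay intact.
    mid = context.rfind("\n", 0, len(context) // 2)
    if mid == -1:
        mid = len(context) // 2
    left = recursive_lm(query, context[:mid], window)
    right = recursive_lm(query, context[mid:], window)
    # The merged partial answers form a much smaller context to answer over.
    return call_model(query, left + "\n" + right)
```

The point of the sketch is the shape of the computation, not the splitting heuristic: each recursive call only ever sees a window-sized slice, so the total input length is no longer bounded by the model's context window.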


Data sources: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace, and others