Mercury 2 and the End of the Autoregressive Monopoly: What Diffusion LLMs Mean for Production Agent Stacks
📄 Summary
Mercury 2, launched by Inception Labs on February 25, 2026, employs a diffusion architecture that departs from the traditional autoregressive generation process. Instead of producing text one token at a time, Mercury 2 refines entire passages in parallel, iteratively improving a draft over successive denoising steps. The approach mirrors the techniques behind Stable Diffusion and Midjourney in image generation, now applied to language and reasoning. With throughput exceeding 1,000 tokens per second, roughly five times that of the fastest autoregressive models, Mercury 2 represents a significant advance in production-grade language models.
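The speed claim follows from the decoding loop itself: an autoregressive model needs one forward pass per token, while a diffusion model commits many positions per denoising step. The toy sketch below illustrates only that step-count difference; Mercury 2's actual architecture and sampler are not public in detail, so the "model" here is a stand-in that already knows the target text, and the masked-refinement scheme is an assumption based on how published diffusion LLMs work.

```python
# Toy contrast between autoregressive decoding (one token per step) and a
# masked-diffusion-style decoder (blocks of positions filled in parallel).
# NOT Mercury 2's real algorithm; a conceptual sketch only.
import math

TARGET = ["the", "agent", "calls", "the", "tool", "and", "parses", "output"]
MASK = "<mask>"

def autoregressive_decode(target):
    """One token per forward pass: step count equals sequence length."""
    seq, steps = [], 0
    for tok in target:
        seq.append(tok)          # each token must wait on all previous ones
        steps += 1
    return seq, steps

def diffusion_decode(target, num_steps=3):
    """Start fully masked; each step commits a block of positions in
    parallel, so step count stays far below sequence length."""
    seq = [MASK] * len(target)
    per_step = math.ceil(len(target) / num_steps)
    steps = 0
    for step in range(num_steps):
        lo = step * per_step
        for i in range(lo, min(lo + per_step, len(target))):
            seq[i] = target[i]   # a real model would sample these jointly
        steps += 1
        if MASK not in seq:
            break
    return seq, steps

ar_seq, ar_steps = autoregressive_decode(TARGET)
df_seq, df_steps = diffusion_decode(TARGET)
print(f"autoregressive: {ar_steps} steps")  # 8 steps for 8 tokens
print(f"diffusion:      {df_steps} steps")  # 3 steps for 8 tokens
```

In a real diffusion LLM each "step" is a full forward pass that re-scores every position, so wall-clock speed depends on keeping the step count small while preserving quality; the roughly 5x throughput figure cited above reflects that trade-off, not free parallelism.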
Powered by Cloudflare Workers + Payload CMS + Claude 3.5
Data sources: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace, and others