📄 English Summary
Think First, Diffuse Fast: Improving Diffusion Language Model Reasoning via Autoregressive Plan Conditioning
Diffusion large language models (dLLMs) generate text through iterative denoising but consistently underperform on multi-step reasoning tasks. This study hypothesizes that the gap stems from a coordination problem: autoregressive (AR) models build coherence token by token, while diffusion models must coordinate all positions simultaneously. The proposed method, plan conditioning, prepends a short (~100-token) natural-language plan generated by an AR model to the diffusion model's prompt. The plan acts as a frozen scaffold, providing globally visible context that every token position can attend to from the first denoising step. On GSM8K, plan conditioning lifts LLaDA-8B-Instruct from 75.6% to 87.2% accuracy (+11.6 percentage points), matching a same-size AR model (LLaMA 3.1 8B, 87.7%).
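The two-stage pipeline described above can be sketched in a few lines. This is a minimal illustration, not the paper's actual implementation: the `ar_model`/`diffusion_model` objects, their `generate` interface, and the prompt wording are all assumptions for the sake of the example.

```python
# Minimal sketch of plan conditioning (hypothetical model interface; the
# paper's actual prompts and APIs are not specified in this summary).

def make_plan_conditioned_prompt(plan: str, question: str) -> str:
    """Prepend the AR-generated plan as a frozen scaffold before the question.

    Because the plan sits in the prompt, every token position in the
    diffusion model can attend to it from the first denoising step.
    """
    return (
        "Plan (follow these steps):\n"
        f"{plan.strip()}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


def plan_conditioned_generate(ar_model, diffusion_model, question: str,
                              plan_tokens: int = 100) -> str:
    """Two-stage pipeline: AR model drafts a short plan, dLLM does the rest."""
    # Stage 1: a short (~100-token) natural-language plan from the AR model.
    plan = ar_model.generate(
        f"Outline a brief step-by-step plan for solving: {question}",
        max_new_tokens=plan_tokens,
    )
    # Stage 2: the diffusion model denoises conditioned on the fixed plan.
    prompt = make_plan_conditioned_prompt(plan, question)
    return diffusion_model.generate(prompt)
```

The key design point is that the plan is frozen: the diffusion model never revises it, it only conditions on it, so the global structure of the answer is settled before any denoising begins.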