📄 English Summary

Beyond Masks: Efficient, Flexible Diffusion Language Models via Deletion-Insertion Processes

A novel Deletion-Insertion Diffusion language model (DID) is proposed, rigorously formulating token deletion and insertion as discrete diffusion processes, replacing the masking and unmasking processes in current Masked Diffusion Language Models (MDLMs). DID significantly improves training and inference efficiency by eliminating two major sources of computational overhead in MDLMs: the computations associated with non-informative <MASK> tokens and the <PAD> tokens introduced in variable-length settings. Furthermore, DID offers greater generation flexibility, enhancing its applicability in various language modeling tasks.
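To make the efficiency claim concrete, here is a minimal sketch contrasting a mask-style forward corruption step (which preserves sequence length and leaves non-informative `<MASK>` placeholders) with a deletion-style step (which shortens the sequence instead). The function names and the uniform per-token corruption probability are illustrative assumptions, not the paper's actual formulation:

```python
import random

def mask_corrupt(tokens, p, mask="<MASK>"):
    # MDLM-style forward step (illustrative): replace tokens with <MASK>.
    # Length is unchanged, so the model still computes over placeholder positions.
    return [mask if random.random() < p else t for t in tokens]

def delete_corrupt(tokens, p):
    # DID-style forward step (illustrative): drop tokens outright.
    # The corrupted sequence shrinks, carrying no <MASK> or <PAD> placeholders.
    return [t for t in tokens if random.random() >= p]

random.seed(0)
seq = ["the", "cat", "sat", "on", "the", "mat"]
masked = mask_corrupt(seq, 0.5)
deleted = delete_corrupt(seq, 0.5)
assert len(masked) == len(seq)   # masking preserves length
assert len(deleted) <= len(seq)  # deletion shortens the input
```

The shorter deletion-corrupted sequence is what lets DID avoid attention and loss computation over `<MASK>` positions, and it sidesteps `<PAD>` tokens entirely since variable lengths arise naturally.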
