OTPrune: Distribution-Aligned Visual Token Pruning via Optimal Transport

📄 English Summary

OTPrune: Distribution-Aligned Visual Token Pruning via Optimal Transport

OTPrune is a training-free framework that formulates visual token pruning as distribution alignment via optimal transport (OT). By minimizing the 2-Wasserstein distance between the full and pruned token distributions, OTPrune effectively reduces inference costs while preserving local diversity and global representativeness. Additionally, a tractable submodular objective is derived to enable efficient optimization, and theoretical proofs of its effectiveness are provided. This framework offers a novel approach for multi-modal large language models (MLLMs) in visual-language reasoning, particularly in addressing the issue of redundant visual tokens.
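The tractable submodular surrogate mentioned above can be illustrated with a standard construction. The sketch below is not the paper's exact objective: it greedily maximizes a facility-location function F(S) = Σᵢ maxⱼ∈S sim(i, j), a common submodular proxy for keeping a pruned token set distributionally close to the full set, for which greedy selection carries the classic (1 − 1/e) approximation guarantee. The function name and similarity choice (negative squared Euclidean distance) are illustrative assumptions.

```python
import numpy as np

def greedy_prune(tokens: np.ndarray, k: int) -> list[int]:
    """Pick k token indices by greedily maximizing a facility-location
    objective F(S) = sum_i max_{j in S} sim(i, j) -- an illustrative
    submodular surrogate for distribution-aligned pruning, not the
    paper's exact formulation."""
    n = tokens.shape[0]
    sq = (tokens ** 2).sum(axis=1)
    # Similarity = negative squared Euclidean distance between tokens.
    sim = -(sq[:, None] + sq[None, :] - 2.0 * tokens @ tokens.T)
    best = np.full(n, -np.inf)  # coverage of each token by the chosen set
    selected: list[int] = []
    for _ in range(k):
        # Marginal coverage if each candidate were added to the set.
        gains = np.maximum(best[:, None], sim).sum(axis=0)
        gains[selected] = -np.inf  # never pick the same token twice
        j = int(np.argmax(gains))
        selected.append(j)
        best = np.maximum(best, sim[:, j])
    return selected

# Example: keep 8 of 64 hypothetical 16-dim visual tokens.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(64, 16))
kept = greedy_prune(tokens, 8)
```

The returned indices select rows of the token matrix (`tokens[kept]`), so the pruned sequence can be fed to the LLM in place of the full one.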


Data sources: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace, and others