To Deceive is to Teach? Forging Perceptual Robustness via Adversarial Reinforcement Learning


📄 English Summary

To Deceive is to Teach? Forging Perceptual Robustness via Adversarial Reinforcement Learning

Multimodal Large Language Models (MLLMs) exhibit perceptual fragility when faced with visually complex scenes, primarily due to their reliance on finite training datasets that are costly to scale and limit model robustness. AOT-SFT, a large-scale adversarial dataset, is introduced to bootstrap MLLM robustness. Building on this, AOT (Adversarial Opponent Training) is proposed as a self-play framework that enhances MLLM robustness by generating its own training data. This method facilitates a co-evolution between an image-editing Attacker and a Defender MLLM, where the Attacker creates a diverse and dynamic curriculum of image manipulations, compelling the model to continuously adapt and improve.
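The Attacker/Defender co-evolution described above can be sketched as a toy self-play loop. Everything here is an illustrative assumption, not the paper's actual method: the "image" is a list of pixel intensities, the Attacker's edit is a simple darkening of fixed strength, the Defender is a one-threshold brightness classifier, and the escalation/adaptation rules are made up to show the curriculum dynamic.

```python
def attack(image, strength):
    # Attacker's edit: darken every pixel by `strength` -- the hardest direction
    # for a brightness-based Defender (a stand-in for learned image edits).
    return [max(0.0, p - strength) for p in image]

def classify(image, threshold):
    # Toy Defender: predicts "bright" (1) when the mean pixel exceeds threshold.
    mean = sum(image) / len(image)
    return (1 if mean > threshold else 0), mean

def aot_self_play(rounds=12):
    clean = [0.8] * 16            # toy image whose ground-truth label is "bright"
    threshold, strength = 0.5, 0.1
    history = []
    for _ in range(rounds):
        adv = attack(clean, strength)
        pred, adv_mean = classify(adv, threshold)
        win = (pred == 1)          # Defender wins if it still perceives "bright"
        history.append(win)
        if win:
            # Attacker escalates its curriculum (capped edit budget).
            strength = min(0.7, strength * 1.5)
        else:
            # Defender "trains" on the hard example: the boundary moves
            # just below the adversarial mean so the same edit no longer fools it.
            threshold = adv_mean - 0.05
    return history, threshold, strength
```

The point of the sketch is the dynamic, not the models: whenever the Defender succeeds, the Attacker produces harder manipulations; whenever it fails, the Defender adapts on exactly those failures, so neither side trains on a static dataset.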
