To Deceive is to Teach? Forging Perceptual Robustness via Adversarial Reinforcement Learning


📄 English Summary

To Deceive is to Teach? Forging Perceptual Robustness via Adversarial Reinforcement Learning

Multimodal Large Language Models (MLLMs) exhibit perceptual fragility when faced with visually complex scenes, primarily due to their reliance on finite training datasets that are costly to scale and limit model robustness. AOT-SFT, a large-scale adversarial dataset, is introduced to bootstrap MLLM robustness. Building on this, AOT (Adversarial Opponent Training) is proposed as a self-play framework that enhances MLLM robustness by generating its own training data. This method facilitates a co-evolution between an image-editing Attacker and a Defender MLLM, where the Attacker creates a diverse and dynamic curriculum of image manipulations, compelling the model to continuously adapt and improve.
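The Attacker/Defender co-evolution described above can be sketched as a toy self-play loop. Everything here is an illustrative assumption, not the paper's actual method: the "image" is a list of pixel intensities, the Attacker's edit is a simple darkening of fixed strength, the Defender is a one-threshold brightness classifier, and the escalation/adaptation rules are made up to show the curriculum dynamic.

```python
def attack(image, strength):
    # Attacker's edit: darken every pixel by `strength` -- the hardest direction
    # for a brightness-based Defender (a stand-in for learned image edits).
    return [max(0.0, p - strength) for p in image]

def classify(image, threshold):
    # Toy Defender: predicts "bright" (1) when the mean pixel exceeds threshold.
    mean = sum(image) / len(image)
    return (1 if mean > threshold else 0), mean

def aot_self_play(rounds=12):
    clean = [0.8] * 16            # toy image whose ground-truth label is "bright"
    threshold, strength = 0.5, 0.1
    history = []
    for _ in range(rounds):
        adv = attack(clean, strength)
        pred, adv_mean = classify(adv, threshold)
        win = (pred == 1)          # Defender wins if it still perceives "bright"
        history.append(win)
        if win:
            # Attacker escalates its curriculum (capped edit budget).
            strength = min(0.7, strength * 1.5)
        else:
            # Defender "trains" on the hard example: the boundary moves
            # just below the adversarial mean so the same edit no longer fools it.
            threshold = adv_mean - 0.05
    return history, threshold, strength
```

The point of the sketch is the dynamic, not the models: whenever the Defender succeeds, the Attacker produces harder manipulations; whenever it fails, the Defender adapts on exactly those failures, so neither side trains on a static dataset.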
