Roots Beneath the Cut: Uncovering the Risk of Concept Revival in Pruning-Based Unlearning for Diffusion Models

📄 Summary

Pruning-based unlearning has emerged as a fast, training-free, and data-independent method for removing unwanted concepts from diffusion models. Its efficiency and robustness make it an attractive alternative to traditional fine-tuning or editing-based unlearning. However, a hidden danger lurks within this promising paradigm: the locations of the weights zeroed out during pruning can act as side-channel signals that leak critical information about the erased concepts. To verify this vulnerability, the authors design a novel attack framework that revives erased concepts from pruned diffusion models in a fully data-free and training-free manner. Experimental results confirm that this risk is real.

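To make the side channel concrete, here is a minimal NumPy sketch of the leakage mechanism described above. It is an illustration under an assumed threat model, not the paper's actual attack framework: the function names are invented, and it assumes the attacker holds the public base checkpoint, so zeroed positions in the pruned model pinpoint exactly which weights were erased and can simply be copied back.

```python
import numpy as np

def extract_side_channel_mask(w_pruned, w_base, atol=0.0):
    """Locate weights zeroed by pruning-based unlearning.

    Assumption (not from the paper): the attacker also holds the public
    base checkpoint `w_base`, so any position that is zero in the pruned
    model but nonzero in the base was almost certainly pruned. The zero
    pattern itself is the side-channel signal.
    """
    return (np.abs(w_pruned) <= atol) & (np.abs(w_base) > atol)

def revive_concept(w_pruned, w_base):
    """Toy data-free, training-free 'revival': copy the base model's
    values back into the pruned positions, undoing the unlearning."""
    mask = extract_side_channel_mask(w_pruned, w_base)
    return np.where(mask, w_base, w_pruned)

# --- toy demonstration on a random 4x4 weight matrix ---
rng = np.random.default_rng(0)
w_base = rng.normal(size=(4, 4))            # publicly released base weights
pruned_idx = np.abs(w_base) > 1.0           # pretend these weights encode the concept
w_pruned = np.where(pruned_idx, 0.0, w_base)  # unlearning zeroes them out

mask = extract_side_channel_mask(w_pruned, w_base)
w_revived = revive_concept(w_pruned, w_base)
```

The point of the sketch is that the pruned model alone already discloses *where* the erased knowledge lived; any extra reference (here, an assumed public base checkpoint) turns that location information into full recovery.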

