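📄 Chinese Summary (translated)
Modern language models can readily extract sensitive information from unstructured text, so redaction, the selective removal of such information, is especially important for data security. However, existing redaction benchmarks typically focus on predefined data categories such as personally identifiable information (PII), or evaluate specific techniques such as masking. To address this limitation, RedacBench was proposed as a comprehensive benchmark for evaluating policy-conditioned redaction across domains and strategies. The benchmark consists of 514 human-written texts covering individual, corporate, and government sources, paired with 187 security policies. RedacBench evaluates a model's ability to selectively remove information that violates a policy.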
📄 English Summary
RedacBench: Can AI Erase Your Secrets?
Modern language models can easily extract sensitive information from unstructured text, which makes redaction (the selective removal of such information) crucial for data security. However, existing redaction benchmarks often focus on predefined categories of data, such as personally identifiable information (PII), or evaluate specific techniques such as masking. To address this limitation, RedacBench is introduced as a comprehensive benchmark for evaluating policy-conditioned redaction across domains and strategies. It is constructed from 514 human-authored texts spanning individual, corporate, and government sources, paired with 187 security policies, and it measures a model's ability to selectively remove information that violates those policies.
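To make "policy-conditioned redaction" concrete, here is a minimal toy sketch of how a text, a policy, and a redacted output might be scored. Everything in it is an assumption for illustration: the `Example` class, the `must_remove` / `must_keep` fields, the `score` function, and the "security" and "utility" labels are hypothetical and are not taken from RedacBench, whose actual data format and metrics are not described in this summary. Exact substring matching is also a deliberate simplification, since a capable model may still infer removed information from context.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One hypothetical item: a text, a policy, and hand-labelled strings."""
    text: str
    policy: str
    must_remove: list[str]  # strings the policy forbids disclosing
    must_keep: list[str]    # strings that are safe and should survive


def fraction(hits: int, total: int) -> float:
    """Return hits / total, treating an empty label list as fully satisfied."""
    return hits / total if total else 1.0


def score(example: Example, redacted: str) -> dict[str, float]:
    """Score a redacted text by exact substring checks (a crude proxy)."""
    removed = sum(s not in redacted for s in example.must_remove)
    kept = sum(s in redacted for s in example.must_keep)
    return {
        "security": fraction(removed, len(example.must_remove)),
        "utility": fraction(kept, len(example.must_keep)),
    }


# Hypothetical item: the policy targets employee names, not schedules.
example = Example(
    text="Alice Kim will present the Q3 roadmap in Room 4 on Friday.",
    policy="Do not disclose the names of employees.",
    must_remove=["Alice Kim"],
    must_keep=["Q3 roadmap", "Friday"],
)

redacted = "[REDACTED] will present the Q3 roadmap in Room 4 on Friday."
result = score(example, redacted)
print(result)
```

The point of the two numbers is the trade-off the summary hints at: returning an empty string would remove everything the policy forbids but keep nothing useful, while returning the text unchanged would do the opposite.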