Uncertainty Quantification for Named Entity Recognition via Full-Sequence and Subsequence Conformal Prediction

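📄 Abstract (translated from Chinese)

Named entity recognition (NER) plays a foundational role in many natural language processing (NLP) pipelines. Existing NER models, however, usually output a single predicted label sequence and do not quantify the uncertainty of that prediction, which leaves downstream applications highly exposed to cascading errors. To address this, the framework aims to let sequence-labeling NER models produce uncertainty-aware prediction sets. With full-sequence conformal prediction, the model returns a prediction set for the entire output sequence such that, at a given confidence level, the true label sequence falls inside the set. To handle the difficulty of identifying entity boundaries and types, the framework also incorporates subsequence conformal prediction, which lets the model attach a separate confidence to each potential entity or subsequence and so capture local uncertainty at a finer grain. Concretely, for each input sentence, a pre-trained NER model first generates an initial label sequence; conformal calibration is then used to compute a nonconformity score for each candidate label sequence or subsequence. After sorting and thresholding these scores, the framework builds a small prediction set whose coverage of the true label sequence meets the preset confidence level. This dual conformal mechanism improves the reliability of NER models and supplies uncertainty information that downstream decisions depend on: in medical text analysis or financial risk assessment, for example, it states how confident the model is about a specific entity, which helps avoid serious consequences from confidently wrong predictions. The method is theoretically rigorous and, in practice, balances prediction accuracy against the need for uncertainty quantification.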

📄 English Summary

Uncertainty Quantification for Named Entity Recognition via Full-Sequence and Subsequence Conformal Prediction

Named Entity Recognition (NER) is a fundamental component of many natural language processing (NLP) pipelines. Current NER models, however, typically output a single predicted label sequence with no accompanying measure of uncertainty, which leaves downstream applications vulnerable to cascading errors. To address this limitation, a general framework is introduced for adapting sequence-labeling NER models to produce uncertainty-aware prediction sets. With full-sequence conformal prediction, the model returns a prediction set (rather than a single label sequence) for the entire output, constructed so that the true label sequence lies within the set with at least a user-specified probability. To handle the complexities of entity boundary and type identification, the framework also integrates subsequence conformal prediction, which lets the model attach a separate confidence to each potential entity or subsequence and thereby capture local uncertainty at a finer granularity. Specifically, for each input sentence, a pre-trained NER model first generates an initial label sequence. Conformal calibration is then used to compute a nonconformity score for each candidate label sequence or subsequence. These scores are sorted and thresholded to construct a small prediction set whose coverage of the true label sequence meets the predefined confidence level. This dual conformal prediction mechanism improves the reliability of NER models and provides uncertainty information that downstream decision-making needs. In medical text analysis or financial risk assessment, for instance, it indicates how confident the model is about a specific entity, which helps avoid severe consequences from high-confidence erroneous predictions. The method is theoretically sound and, in practical applications, balances prediction accuracy against the need for uncertainty quantification.
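
The calibrate, sort, and threshold steps described above can be illustrated with a minimal split-conformal sketch in Python. This is not the paper's implementation: the nonconformity score (one minus the model's probability for a label sequence), the function names `conformal_threshold` and `prediction_set`, and all numbers are assumptions made up for illustration. It also assumes the NER model can supply a list of candidate label sequences with probabilities, for example from beam search.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Return the split-conformal threshold for miscoverage level alpha.

    cal_scores: nonconformity scores of the TRUE label sequences on a
    held-out calibration set (assumed here to be 1 - model probability).
    """
    n = len(cal_scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: keep every candidate.
        return float("inf")
    return sorted(cal_scores)[rank - 1]


def prediction_set(candidates, threshold):
    """Keep each candidate label sequence whose score 1 - p is <= threshold."""
    return [seq for seq, prob in candidates if 1.0 - prob <= threshold]


# Hypothetical calibration data: probability the model gave the true sequence.
cal_true_probs = [0.91, 0.85, 0.40, 0.77, 0.95, 0.62, 0.88, 0.70, 0.55]
cal_scores = [1.0 - p for p in cal_true_probs]
threshold = conformal_threshold(cal_scores, alpha=0.1)

# Hypothetical candidates for one test sentence: (label sequence, probability).
candidates = [
    (("B-PER", "I-PER", "O", "B-ORG"), 0.50),
    (("B-PER", "I-PER", "O", "B-LOC"), 0.42),
    (("B-PER", "O", "O", "B-ORG"), 0.05),
]
pred_set = prediction_set(candidates, threshold)
for seq in pred_set:
    print(" ".join(seq))
```

Under the usual exchangeability assumption between calibration and test sentences, a set built this way contains the true label sequence with probability at least 1 − alpha, provided the true sequence is among the scored candidates. A subsequence variant could apply the same calibration to scores computed per entity span rather than per sentence; how the paper defines those scores is not stated in this summary.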
