Noise Reduction in BERT NER Models for Clinical Entity Extraction

📄 English Summary

Noise reduction in BERT NER models for clinical entity extraction

Precision is critical when extracting clinical entities from notes and reports. A BERT model pre-trained on clinical data and fine-tuned for Named Entity Recognition (NER) achieved good recall but fell short of the high precision that clinical applications require. To address this, a Noise Removal model was developed to refine the NER model's output. The NER model assigns each token an entity tag and a probability score; the Noise Removal model analyzes these probability sequences and classifies each prediction as weak or strong. The approach aims to raise the precision of clinical entity extraction so that it better meets the needs of clinical applications.
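The summary does not say how the Noise Removal model is built, so the following is only a minimal sketch of the general idea in Python. It assumes BIO-style tags, and it stands in for the learned Noise Removal model with a simple rule over each entity's probability sequence. The names `Span`, `group_spans`, `classify_span` and the two thresholds are hypothetical and chosen for illustration, not taken from the original work.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Span:
    """One predicted entity: its label, token range, and per-token probabilities."""
    label: str
    start: int
    end: int  # exclusive token index
    probs: list = field(default_factory=list)


def group_spans(tags, probs):
    """Group BIO-tagged tokens into entity spans, keeping each probability sequence."""
    spans, current = [], None
    for i, (tag, p) in enumerate(zip(tags, probs)):
        label = tag[2:] if tag[:2] in ("B-", "I-") else None
        continues = (
            tag.startswith("I-") and current is not None and current.label == label
        )
        if continues:
            current.end = i + 1
            current.probs.append(p)
            continue
        if current is not None:
            spans.append(current)
            current = None
        if label is not None:
            # A "B-" tag, or a stray "I-" tag, opens a new span.
            current = Span(label, i, i + 1, [p])
    if current is not None:
        spans.append(current)
    return spans


def classify_span(span, mean_floor=0.85, min_floor=0.60):
    """Hypothetical stand-in for the Noise Removal model.

    A span is "strong" only if its average token probability is high and no
    single token is very uncertain. Both thresholds are illustrative assumptions.
    """
    is_strong = mean(span.probs) >= mean_floor and min(span.probs) >= min_floor
    return "strong" if is_strong else "weak"


# Made-up NER output for a five-token sentence.
tags = ["O", "B-DRUG", "I-DRUG", "O", "B-DOSE"]
probs = [0.99, 0.97, 0.93, 0.98, 0.55]

spans = group_spans(tags, probs)
verdicts = [(s.label, s.start, s.end, classify_span(s)) for s in spans]
print(verdicts)
```

One plausible way to use such a classifier is to keep only the spans judged strong, which would trade some recall for precision; whether the original system filters in this way is not stated in the summary.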
