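An AI-based end-to-end framework for rare disease phenotype extraction: using large language models to extract information from clinical notes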

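📄 Summary (translated from Chinese)

Phenotype extraction is critical to diagnosing rare diseases, but manually extracting structured phenotypes from clinical notes is time-consuming and hard to scale. Existing AI methods typically optimize only a single component of phenotype extraction and do not implement the complete clinical workflow: extracting features from clinical text, standardizing them to Human Phenotype Ontology (HPO) terms, and prioritizing the diagnostically informative HPO terms. RARE-PHENIX is an end-to-end AI framework that integrates large language model-based phenotype extraction, ontology-based standardization to HPO terms, and supervised ranking of diagnostically informative phenotypes. It was developed to improve the efficiency and accuracy of phenotype extraction for rare diseases.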

📄 English Summary

An artificial intelligence framework for end-to-end rare disease phenotyping from clinical notes using large language models

Phenotyping is essential for the diagnosis of rare diseases, yet the manual curation of structured phenotypes from clinical notes is labor-intensive and challenging to scale. Existing artificial intelligence approaches typically focus on optimizing individual components of phenotyping, failing to operationalize the complete clinical workflow that involves extracting features from clinical text, standardizing them to Human Phenotype Ontology (HPO) terms, and prioritizing diagnostically informative HPO terms. RARE-PHENIX is developed as an end-to-end AI framework for rare disease phenotyping, integrating large language model-based phenotype extraction, ontology-grounded standardization to HPO terms, and supervised ranking of diagnostically informative phenotypes. The development of RARE-PHENIX aims to enhance the efficiency and accuracy of phenotype extraction for rare diseases.
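
The three-stage workflow described above (extract, standardize to HPO, rank) can be illustrated with a toy sketch. This is not the RARE-PHENIX implementation: the summary gives no code, so every name here (`LEXICON`, `TOY_WEIGHTS`, `extract_mentions`, `normalize`, `rank`, `phenotype`) is hypothetical. The LLM extractor is replaced by a simple phrase splitter, ontology grounding by a four-entry lookup, and the supervised ranker by hand-set weights, purely to show how the stages chain together.

```python
import re

# Illustrative lexicon: surface phrase -> (HPO ID, HPO label).
# A tiny hand-picked subset; a real system would load the full ontology.
# Mapping "developmental delay" to the "global" term is a simplification.
LEXICON = {
    "seizure": ("HP:0001250", "Seizure"),
    "microcephaly": ("HP:0000252", "Microcephaly"),
    "developmental delay": ("HP:0001263", "Global developmental delay"),
    "fever": ("HP:0001945", "Fever"),
}

# Hypothetical informativeness weights standing in for a trained ranker.
TOY_WEIGHTS = {
    "HP:0000252": 0.9,
    "HP:0001263": 0.7,
    "HP:0001250": 0.6,
    "HP:0001945": 0.1,
}


def extract_mentions(note: str) -> list[str]:
    """Stand-in for LLM extraction: split the note into candidate phrases."""
    parts = re.split(r"[,.;]", note.lower())
    return [p.strip() for p in parts if p.strip()]


def normalize(mention: str):
    """Map a mention to an HPO term via the longest lexicon phrase it contains.

    Returns None when nothing matches. Negation ("no fever") is NOT handled.
    """
    hits = [phrase for phrase in LEXICON if phrase in mention]
    return LEXICON[max(hits, key=len)] if hits else None


def rank(terms):
    """Deduplicate terms, then sort by toy weight, highest first."""
    unique = list(dict.fromkeys(terms))
    return sorted(unique, key=lambda t: TOY_WEIGHTS.get(t[0], 0.0), reverse=True)


def phenotype(note: str):
    """Run the three stages in order: extract, standardize, rank."""
    mapped = [normalize(m) for m in extract_mentions(note)]
    return rank([t for t in mapped if t is not None])


if __name__ == "__main__":
    note = "Recurrent seizures, microcephaly, developmental delay. Mild fever today; no rash."
    for hpo_id, label in phenotype(note):
        print(hpo_id, label)
```

In this sketch the nonspecific finding (fever) lands at the bottom of the list, which mirrors the stated goal of prioritizing diagnostically informative terms; in a real system that ordering would come from a model trained on labeled cases rather than from fixed weights.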
