TwinWeaver: An LLM-Based Foundation Model Framework for Pan-Cancer Digital Twins

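📄 Summary (translated from Chinese)

TwinWeaver is an open-source framework that addresses the challenge of predicting clinical events and patient trajectories in precision oncology, particularly when working with sparse, multi-modal clinical time series data. Its core innovation is serializing longitudinal patient histories into text, which allows large language models (LLMs) to be used for unified event prediction and prognostic analysis. With this text-based processing, even complex, heterogeneous clinical data can be understood and processed effectively by an LLM. Building on TwinWeaver, the research team constructed the Genie Digital Twin (GDT) system, which uses data from 93,054 patients across 20 cancer types. GDT demonstrates TwinWeaver's ability to integrate large-scale, multi-center clinical data and provides a foundation for pan-cancer predictive models.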

📄 English Summary

TwinWeaver: An LLM-Based Foundation Model Framework for Pan-Cancer Digital Twins

TwinWeaver is an open-source framework for forecasting clinical events and patient trajectories in precision oncology, where clinical time series data are typically sparse and multi-modal. Its core idea is to serialize each patient's longitudinal history into text, so that a large language model (LLM) can handle event prediction and prognostic forecasting in a unified way. Once serialized, even complex, heterogeneous clinical data share a single representation that the LLM can process directly. Using TwinWeaver, the research team built the Genie Digital Twin (GDT) system from data on 93,054 patients across 20 cancer types. GDT demonstrates that the framework can integrate large-scale, multi-center clinical data, and it provides a basis for pan-cancer predictive models.

Converting heterogeneous clinical data into a unified textual representation is meant to address the limitations of traditional machine learning approaches in handling multi-modal and time-series data, and it lets LLMs learn patterns of disease progression, treatment response, and prognostic factors. TwinWeaver could support more accurate and personalized cancer prediction models and help clinicians make better-informed treatment decisions. Because the framework is open source, the community can extend and apply it, which may encourage wider adoption of digital twin technology in oncology.
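The serialization idea described above can be illustrated with a minimal sketch. This is not TwinWeaver's actual API or text format, which the summary does not describe; the `ClinicalEvent` class, the `serialize_history` and `build_prompt` functions, the "Day N:" layout, and the sample patient events are all hypothetical and made up for illustration. The sketch only shows how sparse events from different modalities might be merged into one chronological text that an LLM could read.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ClinicalEvent:
    """One observation in a patient's history (hypothetical schema)."""
    day: date
    modality: str  # e.g. "diagnosis", "lab", "medication"
    name: str
    value: str = ""


def serialize_history(events, reference_day):
    """Render events as chronological text, grouped by day offset.

    Days with no events are simply absent, so sparse histories
    stay compact instead of being padded with missing values.
    """
    lines = []
    current_offset = None
    for ev in sorted(events, key=lambda e: (e.day, e.modality, e.name)):
        offset = (ev.day - reference_day).days
        if offset != current_offset:
            lines.append(f"Day {offset}:")
            current_offset = offset
        detail = f"{ev.name} = {ev.value}" if ev.value else ev.name
        lines.append(f"  [{ev.modality}] {detail}")
    return "\n".join(lines)


def build_prompt(events, reference_day, question):
    """Combine the serialized history with a forecasting question."""
    history = serialize_history(events, reference_day)
    return f"Patient history:\n{history}\n\nTask: {question}"


# Made-up example data, not taken from any real patient or dataset.
events = [
    ClinicalEvent(date(2020, 1, 10), "lab", "hemoglobin", "11.2 g/dL"),
    ClinicalEvent(date(2020, 1, 1), "diagnosis", "non-small cell lung cancer"),
    ClinicalEvent(date(2020, 1, 10), "medication", "carboplatin", "started"),
]
text = serialize_history(events, date(2020, 1, 1))
prompt = build_prompt(
    events, date(2020, 1, 1), "Forecast hemoglobin over the next 4 weeks."
)
print(prompt)
```

In this sketch a lab value, a diagnosis, and a medication record end up in the same text stream, ordered by time, which is the property the summary attributes to the framework. A real system would also need to decide how to encode units, missing values, and the prediction target.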
