BioBridge: A Framework Connecting Proteins and Language to Enhance Biological Reasoning

Source: BioBridge: Bridging Proteins and Language for Enhanced Biological Reasoning with LLMs

Published: February 23, 2026

🏷️ Related Tags

#ProteinLanguageModels #BiologicalReasoning #ContinualPretraining #DomainAdaptation #CrossModalAlignment

📄 English Summary

BioBridge: Bridging Proteins and Language for Enhanced Biological Reasoning with LLMs

Existing protein language models (PLMs) often adapt poorly to multiple tasks and generalize weakly across diverse biological contexts. General-purpose large language models (LLMs), in contrast, cannot interpret protein sequences and lack domain-specific knowledge, which limits their capacity for biosemantic reasoning. To address both gaps, the authors propose BioBridge, a domain-adaptive continual pretraining framework for protein understanding. The framework uses Domain-Incremental Continual Pre-training (DICP) to inject protein domain knowledge and a general reasoning corpus into an LLM at the same time, which mitigates catastrophic forgetting. A cross-modal alignment mechanism further improves the model's performance.
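The summary does not say how DICP combines the two corpora. One common way to keep general ability while adding domain knowledge is to interleave general-text examples with domain examples during continual pretraining. The sketch below illustrates only that generic data-mixing idea and is not the BioBridge implementation; the function name `mixed_stream`, the `general_fraction` parameter, and its default value are all assumptions made for illustration.

```python
import random
from typing import Iterator, Sequence, Tuple


def mixed_stream(
    protein_docs: Sequence[str],
    general_docs: Sequence[str],
    general_fraction: float = 0.3,  # assumed value, not taken from the paper
    num_samples: int = 10,
    seed: int = 0,
) -> Iterator[Tuple[str, str]]:
    """Yield (source, text) pairs for a mixed pretraining stream.

    Each sample comes from the general corpus with probability
    `general_fraction` and from the protein corpus otherwise.
    """
    if not 0.0 <= general_fraction <= 1.0:
        raise ValueError("general_fraction must be in [0, 1]")
    if not protein_docs or not general_docs:
        raise ValueError("both corpora must be non-empty")

    rng = random.Random(seed)  # local RNG so the stream is reproducible
    for _ in range(num_samples):
        if rng.random() < general_fraction:
            yield "general", rng.choice(general_docs)
        else:
            yield "protein", rng.choice(protein_docs)


if __name__ == "__main__":
    # Toy corpora; real training data would be tokenized documents.
    proteins = ["MKTAYIAKQR", "GSHMLEDPVA"]
    general = ["Enzymes lower the activation energy of a reaction."]
    for source, text in mixed_stream(proteins, general, num_samples=5):
        print(source, text)
```

Raising `general_fraction` trades domain exposure for retention of general reasoning ability; the right balance is an empirical question that the summary does not address.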
