📄 English Summary
Founder effects shape the evolutionary dynamics of multimodality in open LLM families
Large language model (LLM) families are evolving rapidly, yet the emergence and propagation speed of multimodal capabilities remain unclear. Utilizing the ModelBiome AI Ecosystem dataset, which encompasses Hugging Face model metadata and lineage fields (over 1.8 million model entries), this study quantifies multimodality over time and along recorded parent-child relationships. Cross-modal tasks are prevalent in the broader ecosystem well before they become common within major open LLM families. Within these families, multimodality remains rare through 2023 and most of 2024, then sharply increases between 2024 and 2025, predominantly driven by image-text vision-language tasks. The first vision-language model (VLM) variants typically emerge during this period across major families.
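The kind of quantification the summary describes, the share of multimodal task tags among model entries per creation year, can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the column names (`created_at`, `pipeline_tag`) and the set of tags counted as multimodal are assumptions, and the toy data is synthetic rather than drawn from ModelBiome.

```python
# Hedged sketch: yearly share of multimodal (image-text) tasks computed
# from model metadata. Column names and tag set are illustrative
# assumptions, not the paper's actual schema.
import pandas as pd

# Illustrative set of Hugging Face-style cross-modal task tags
MULTIMODAL_TAGS = {"image-text-to-text", "image-to-text",
                   "visual-question-answering"}

def multimodal_share_by_year(df: pd.DataFrame) -> pd.Series:
    """Fraction of model entries per creation year whose task tag is multimodal."""
    df = df.copy()
    df["year"] = pd.to_datetime(df["created_at"]).dt.year
    df["is_multimodal"] = df["pipeline_tag"].isin(MULTIMODAL_TAGS)
    return df.groupby("year")["is_multimodal"].mean()

# Tiny synthetic example (not real ModelBiome data)
toy = pd.DataFrame({
    "created_at": ["2023-05-01", "2023-08-01", "2024-06-01", "2024-09-01"],
    "pipeline_tag": ["text-generation", "text-generation",
                     "image-text-to-text", "text-generation"],
})
share = multimodal_share_by_year(toy)
```

On the toy data this yields a zero multimodal share for 2023 and a nonzero share for 2024, mirroring the trend the summary reports; the same grouping could be applied along recorded parent-child fields instead of years.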