📄 English Summary

On the scaling relationship between cloze probabilities and language model next-token prediction

Recent research indicates that larger language models predict eye-movement and reading-time data better. While even the best models under-allocate probability mass to human responses, larger models provide better estimates both of which next words humans produce in cloze tasks and of how likely each is to be produced. This improvement reflects their reduced sensitivity to lexical co-occurrence statistics and their closer semantic alignment with human cloze responses. The findings support the view that the greater memorization capacity of larger models lets them guess semantically appropriate words more accurately, at the cost of reduced sensitivity to the low-level information relevant to word recognition.
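The "under-allocation" comparison described above can be sketched in a few lines: given raw human cloze responses and a model's next-word distribution, compute how much probability mass the model places on the words humans actually produced. This is a minimal illustration, not the paper's evaluation code; the function names and all response/probability data below are hypothetical.

```python
# Minimal sketch: how much next-word probability mass does a model
# assign to the words humans actually produced in a cloze task?
# All names and numbers here are hypothetical illustrations.

from collections import Counter


def cloze_distribution(responses):
    """Convert raw human cloze responses into relative frequencies."""
    counts = Counter(responses)
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}


def mass_on_human_responses(model_probs, human_dist):
    """Total probability the model assigns to words humans produced."""
    return sum(model_probs.get(word, 0.0) for word in human_dist)


# Hypothetical context: "The children went outside to ___"
human_responses = ["play", "play", "play", "run", "swim"]
human_dist = cloze_distribution(human_responses)  # play 0.6, run 0.2, swim 0.2

# Hypothetical next-word probabilities from two models of different sizes
small_model = {"play": 0.20, "eat": 0.15, "run": 0.05, "the": 0.10}
large_model = {"play": 0.45, "run": 0.10, "swim": 0.05, "eat": 0.02}

print(f"small: {mass_on_human_responses(small_model, human_dist):.2f}")  # small: 0.25
print(f"large: {mass_on_human_responses(large_model, human_dist):.2f}")  # large: 0.60
```

In this toy setup the larger model covers more of the human response distribution (0.60 vs. 0.25), mirroring the paper's finding that larger models assign more mass to human cloze responses, even though neither model covers it fully.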
