Deep Neural Nets: 33 years ago and 33 years from now

Source: http://karpathy.github.io/2022/03/14/lecun1989/

Summary: This article revisits the seminal 1989 paper by LeCun et al. on handwritten zip code recognition, one of the earliest applications of backpropagation to a real-world problem. The author reproduces the paper's experiments in PyTorch and uses them to gauge 33 years of progress in deep learning: the original model took about 3 days to train in 1989 but now trains in roughly 90 seconds on an M1 MacBook, and adopting modern techniques such as the Adam optimizer, data augmentation, Dropout, and ReLU activations reduces the test error rate by 60%. The article then extrapolates 33 years forward to 2055: datasets and models may grow by roughly a factor of 10 million, today's models should be quickly trainable on personal devices, and neural networks may no longer need to be trained from scratch at all, with most tasks instead accomplished by interacting with massive foundation models. This reflects deep learning's trajectory from its early days toward its possible future.

Keywords: deep learning history, neural network optimization, foundation models, computer vision, model scaling


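The following is only a rough, illustrative PyTorch sketch of the kind of modernization the summary describes, not the author's actual reproduction code: a small LeNet-style convnet in which ReLU replaces the saturating nonlinearities of the era, Dropout is added, the Adam optimizer replaces plain SGD, and a toy pixel-shift augmentation stands in for proper data augmentation. Everything here (the ModernizedLeNet and augment names, the 28x28 input size, layer widths, dropout rate, and learning rate) is an assumption chosen for brevity rather than taken from the original paper or the reproduction.

```python
# Illustrative sketch only: a small LeNet-style convnet with the "modern"
# substitutions mentioned in the summary above -- ReLU, Dropout, Adam, and
# light data augmentation via random pixel shifts.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModernizedLeNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 12, kernel_size=5, stride=2)   # 28x28 -> 12x12
        self.conv2 = nn.Conv2d(12, 24, kernel_size=5, stride=2)  # 12x12 -> 4x4
        self.drop = nn.Dropout(p=0.25)
        self.fc1 = nn.Linear(24 * 4 * 4, 64)
        self.fc2 = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.relu(self.conv1(x))        # ReLU in place of the tanh-style units of the era
        x = F.relu(self.conv2(x))
        x = x.flatten(1)
        x = self.drop(F.relu(self.fc1(x)))
        return self.fc2(x)               # logits; pair with cross-entropy loss


def augment(images: torch.Tensor, max_shift: int = 1) -> torch.Tensor:
    """Toy data augmentation: random +/-1 pixel translations of the batch."""
    dx, dy = (int(torch.randint(-max_shift, max_shift + 1, (1,))) for _ in range(2))
    return torch.roll(images, shifts=(dy, dx), dims=(2, 3))


# Minimal training step on stand-in data, just to show the pieces fitting together.
model = ModernizedLeNet()
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)  # Adam in place of plain SGD
images = torch.randn(32, 1, 28, 28)          # stand-in for a batch of digit images
labels = torch.randint(0, 10, (32,))

logits = model(augment(images))
loss = F.cross_entropy(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```

Note that torch.roll wraps pixels around the image border, which is acceptable for a toy demo; a real augmentation pipeline would typically pad and crop (or use torchvision transforms) instead.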