📄 English Summary
Analyzing Neural Network Information Flow Using Differential Geometry
Leveraging graph theory, this paper offers a novel perspective on the neural network (NN) data flow problem: identifying the connections most crucial to the full model's performance. Understanding NN data flow provides a powerful tool for symbolic NN analysis, including robustness analysis and model repair. Diverging from standard information-theoretic approaches, this work employs concepts from differential geometry to characterize and quantify information transmission paths and their strengths within neural networks.

By modeling the network as a directed graph, where nodes represent neurons and edges represent connection weights, differential-geometric notions such as Riemannian manifolds, geodesics, and curvature can be used to describe how information propagates. Specifically, neuron activations can be viewed as coordinates on a manifold whose metric tensor is determined by the connection weights. The intensity and direction of information flow can then be measured by the rate of information change along geodesics, while the local curvature reflects the nonlinearity and complexity of information transmission.

This approach moves beyond simple connection-strength evaluation, allowing a deeper exploration of information interactions between layers and modules and identifying critical information bottlenecks or redundant paths. For instance, computing the divergence and curl of the information flow reveals regions where information converges or disperses, exposing potential functional modules within the network. The framework also offers a new theoretical foundation for NN pruning, compression, and structural optimization: connections or neurons that contribute little to the information flow can be identified and removed, reducing network complexity while preserving model performance.
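The claim that connection weights "define the metric tensor" can be realized in more than one way; one common construction (an assumption here, not spelled out in the summary) is to pull the Euclidean metric of a layer's output space back through the layer map, giving G(x) = J(x)ᵀJ(x), where J is the layer's Jacobian. A minimal NumPy sketch for a single hypothetical tanh layer:

```python
import numpy as np

def tanh_layer_jacobian(W, x):
    """Jacobian of the layer map y = tanh(W @ x) with respect to x."""
    pre = W @ x
    return np.diag(1.0 - np.tanh(pre) ** 2) @ W

def pullback_metric(W, x):
    """One way to let weights define a metric tensor: pull the Euclidean
    metric of the output space back through the layer, G(x) = J(x).T @ J(x).
    Input perturbations are then measured by how strongly the layer
    propagates them."""
    J = tanh_layer_jacobian(W, x)
    return J.T @ J

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))   # hypothetical 4-in, 3-out layer
x = rng.normal(size=4)        # a sample input: a point on the manifold
G = pullback_metric(W, x)     # 4x4, symmetric positive semi-definite
```

Along a path of inputs, summing sqrt(dxᵀ G dx) approximates path length under this metric; the geodesics mentioned above are the paths minimizing that length.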
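On the directed-graph view of the network, a natural discrete analogue of divergence is the net flow out of each node: strongly negative values mark sinks where information converges, positive values mark sources where it disperses. The per-edge flow magnitudes below are hypothetical stand-ins for whatever flow measure the geometric analysis produces:

```python
import numpy as np

def node_divergence(n_nodes, edges, flow):
    """Discrete divergence of an edge flow on a directed graph:
    div[i] = total flow leaving node i minus total flow entering it."""
    div = np.zeros(n_nodes)
    for (u, v), f in zip(edges, flow):
        div[u] += f   # flow leaves u
        div[v] -= f   # flow enters v
    return div

# Toy 4-neuron graph: nodes 0 and 1 both feed node 2, which feeds node 3.
edges = [(0, 2), (1, 2), (2, 3)]
flow = [1.0, 1.0, 0.5]        # hypothetical per-edge flow magnitudes
div = node_divergence(4, edges, flow)
# Node 2 receives 2.0 but emits only 0.5, so div[2] = -1.5:
# an information-convergence point (a candidate bottleneck).
```

Summed over all nodes, the divergence of any edge flow is zero, since every unit of flow leaves exactly one node and enters exactly one other.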
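In its simplest reading, the pruning idea reduces to scoring each connection's contribution to information flow and removing the lowest-scoring ones. The score used below (|weight|) is a placeholder assumption; the geometric framework would substitute its own flow-contribution score:

```python
import numpy as np

def prune_edges(weights, scores, keep_fraction=0.5):
    """Zero out connections whose score falls in the bottom
    (1 - keep_fraction) of all scores: a stand-in for 'remove the
    connections that contribute least to information flow'."""
    threshold = np.quantile(scores, 1.0 - keep_fraction)
    mask = scores >= threshold
    return weights * mask, mask

W = np.array([[0.9, -0.1],
              [0.05, 1.2]])
scores = np.abs(W)            # placeholder score: |weight|
W_pruned, mask = prune_edges(W, scores, keep_fraction=0.5)
# The two weakest connections (-0.1 and 0.05) are zeroed out.
```

Keeping the mask separate from the weights lets the same sparsity pattern be reapplied after any fine-tuning step that follows pruning.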
Ultimately, this differential geometry-based analysis method opens new avenues for understanding NN internal mechanisms, enhancing model interpretability, and designing more efficient neural network architectures.