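📄 Chinese Summary (translated)
Deep neural networks face compute constraints at deployment, especially on edge devices. Existing methods typically rely on static pruning or dynamic early-exit strategies, but these are limited in the granularity of compute allocation and do not fully exploit the categorical structure inside the model. This paper proposes a novel adaptive test-time compute allocation framework that adjusts the amount of computation dynamically through learned heuristics. The core idea is to use categorical structure information from the model's intermediate layers, such as how separable different classes are in feature space, to guide the allocation of compute. Specifically, we design a lightweight policy network that takes intermediate features and categorical structure metrics as input and predicts whether subsequent layers need to be executed. Trained with reinforcement learning or end to end, the policy network learns decision rules that minimize compute cost while preserving performance. Experiments show that the method achieves significant gains in computational efficiency across multiple benchmark datasets and models while remaining on par with the full …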
📄 English Summary
Adaptive Test-Time Compute Allocation via Learned Heuristics over Categorical Structure
Deep neural networks often face tight computational constraints at deployment, particularly on edge devices. Existing approaches, such as static pruning and dynamic early exit, are limited in the granularity of their compute allocation and do not fully exploit the categorical structure embedded in the model. This paper introduces an adaptive test-time compute allocation framework that dynamically adjusts computational effort by learning heuristic policies. The core idea is to use categorical structure information from the model's intermediate layers, such as the separability of different classes in feature space, to guide resource allocation. Specifically, the authors design a lightweight policy network that takes intermediate features and categorical structure metrics as input and predicts whether subsequent layers need to be executed. Trained with reinforcement learning or end to end, the policy network learns decision rules that minimize computational cost while maintaining performance. Experiments show that the method substantially improves computational efficiency across multiple benchmark datasets and models while preserving accuracy comparable to the full model, and that it outperforms traditional static pruning and dynamic early exit methods. The framework offers a new approach to deploying deep learning models in resource-constrained environments.
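The gating idea in the summary can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the class prototypes, the two structure metrics (top-1 probability and top-1 minus top-2 margin), the logistic gate with hand-picked weights, and all function names are hypothetical. The gate here also sees only the structure metrics, whereas the summary describes a policy network that additionally takes the intermediate features and is trained rather than hand-set.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def structure_metrics(feature: Vector, prototypes: Sequence[Vector]) -> Tuple[float, float]:
    """Hypothetical separability metrics for one intermediate feature.

    Scores each class by negative squared distance to its prototype and
    returns (top-1 probability, top-1 minus top-2 margin).
    """
    scores = [-sum((f - p) ** 2 for f, p in zip(feature, proto)) for proto in prototypes]
    probs = sorted(softmax(scores), reverse=True)
    return probs[0], probs[0] - probs[1]


def continue_probability(metrics: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic gate: estimated probability that later layers are still needed."""
    z = bias + sum(w * m for w, m in zip(weights, metrics))
    return 1.0 / (1.0 + math.exp(-z))


def adaptive_forward(
    x: Vector,
    layers: Sequence[Callable[[Vector], Vector]],
    prototypes_per_layer: Sequence[Sequence[Vector]],
    weights: Sequence[float],
    bias: float,
    threshold: float = 0.5,
) -> Tuple[Vector, int]:
    """Run layers in order and stop once the gate predicts no further compute is needed.

    Returns the last computed feature and the number of layers executed.
    """
    h = x
    executed = 0
    for layer, prototypes in zip(layers, prototypes_per_layer):
        h = layer(h)
        executed += 1
        metrics = structure_metrics(h, prototypes)
        if continue_probability(metrics, weights, bias) < threshold:
            break
    return h, executed


# Toy setup (all values are illustrative assumptions): three layers that
# double the feature, and two class prototypes per layer on the axes.
double = lambda v: [2.0 * a for a in v]
layers = [double, double, double]
prototypes_per_layer = [[[2.0 ** k, 0.0], [0.0, 2.0 ** k]] for k in (1, 2, 3)]
gate_weights, gate_bias = [-4.0, -4.0], 4.0  # high confidence lowers the continue probability

_, easy_layers = adaptive_forward([1.0, 0.0], layers, prototypes_per_layer, gate_weights, gate_bias)
_, hard_layers = adaptive_forward([0.5, 0.5], layers, prototypes_per_layer, gate_weights, gate_bias)
```

In this toy setup the input that sits on a class prototype exits after the first layer, while the input that is equidistant from both prototypes runs all three layers. In a trained system the gate parameters would be learned against a cost-versus-accuracy objective instead of being fixed by hand.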