S3T-Former: A Purely Spike-Driven State-Space Topology Transformer for Skeleton Action Recognition

📄 English Summary

S3T-Former: A Purely Spike-Driven State-Space Topology Transformer for Skeleton Action Recognition

Skeleton-based action recognition is vital for multimedia applications, yet it relies heavily on power-hungry Artificial Neural Networks (ANNs), which limits deployment on resource-constrained edge devices. Spiking Neural Networks (SNNs) offer an energy-efficient alternative. However, existing spiking models for skeleton data often sacrifice the intrinsic sparsity of SNNs by resorting to dense matrix aggregations, heavy multimodal fusion modules, or non-sparse frequency-domain transforms. They also suffer markedly from the short-term amnesia of spiking neurons. The proposed Spiking State-Space Topology Transformer (S3T-Former) is presented as the first purely spike-driven Transformer designed to address these challenges, with the aim of improving both the efficiency and the accuracy of skeleton action recognition.
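To make the "short-term amnesia" of spiking neurons more concrete, here is a minimal sketch of a generic leaky integrate-and-fire (LIF) neuron in plain Python. It is not the neuron model used in S3T-Former, which the summary does not specify. The function name `lif_run` and the parameters `tau`, `v_threshold`, and `v_reset` are illustrative assumptions. The sketch only shows how the leak and the hard reset quickly erase the trace of past inputs, which is presumably the effect the summary refers to.

```python
def lif_run(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one generic LIF neuron over a sequence of input currents.

    Hypothetical helper for illustration only; parameter values are assumptions.
    Returns the binary spike train and the membrane potential after each step.
    """
    v = v_reset
    spikes, potentials = [], []
    for x in inputs:
        # Leaky integration: the potential moves toward the input
        # and decays toward v_reset with time constant tau.
        v = v + (x - (v - v_reset)) / tau
        # Fire a binary spike when the threshold is reached.
        spike = 1 if v >= v_threshold else 0
        if spike:
            # Hard reset: everything accumulated so far is discarded.
            v = v_reset
        spikes.append(spike)
        potentials.append(v)
    return spikes, potentials


if __name__ == "__main__":
    # A single sub-threshold input followed by silence:
    # the stored potential halves at every step when tau = 2.
    print(lif_run([1.0, 0.0, 0.0, 0.0]))
    # A strong input triggers a spike, after which the state is wiped.
    print(lif_run([3.0, 0.0]))
```

With `tau=2.0`, a sub-threshold input leaves a trace that halves at every time step, and any spike resets the state entirely, so the output is sparse (mostly zeros) but the neuron retains little about earlier frames of a motion sequence.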
