Relational graph-driven differential denoising and diffusion attention fusion for multimodal conversational emotion recognition

📄 Abstract

In real-world scenarios, audio and video signals are often degraded by environmental noise and limited acquisition conditions, so the extracted features carry excessive noise. In addition, data quality and information-carrying capacity are unevenly distributed across modalities. Together, these issues cause information distortion and weight bias that impair overall recognition performance. Most existing methods overlook the impact of noisy modalities and rely on implicit weighting to model modality importance, failing to account explicitly for the dominant contribution of the textual modality to emotion understanding. To address these challenges, a relation-aware denoising and diffusion attention fusion method is proposed to improve the accuracy of multimodal conversational emotion recognition.
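The abstract does not spell out the fusion mechanism, but the idea of explicitly privileging the textual modality (rather than relying on implicit weighting) can be sketched as a text-anchored attention over modality features. The following is a minimal illustrative sketch, not the paper's actual method; all function and variable names are hypothetical, and a simple dot-product attention stands in for the proposed diffusion attention:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def text_anchored_fusion(text, audio, video):
    """Fuse modality features with the text features as the attention query,
    so the textual signal explicitly dominates the fused representation.

    Each input has shape (seq_len, dim): one feature vector per utterance.
    Returns the fused features (seq_len, dim) and the per-utterance
    modality weights (3, seq_len).
    """
    d = text.shape[-1]
    modalities = np.stack([text, audio, video], axis=0)       # (3, seq_len, dim)
    # Scaled dot-product of the text query against each modality's features.
    scores = np.einsum('sd,msd->ms', text, modalities) / np.sqrt(d)
    weights = softmax(scores, axis=0)                         # sums to 1 over modalities
    # Weighted sum over modalities gives the fused representation.
    fused = np.einsum('ms,msd->sd', weights, modalities)
    return fused, weights
```

Because the text features serve as the query, the text modality always scores highly against itself, giving it a built-in advantage, while noisy audio or video utterances that diverge from the textual signal receive lower weight. The graph-based differential denoising step described in the title would precede such a fusion and is not modeled here.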

