Detecting Hate and Inflammatory Content in Bengali Memes: A New Multimodal Dataset and Co-Attention Framework

Source: Detecting Hate and Inflammatory Content in Bengali Memes: A New Multimodal Dataset and Co-Attention Framework

Published: February 27, 2026

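📄 Summary

Internet memes have become a dominant form of expression on social media, particularly in Bengali-speaking communities. Although memes are usually humorous, they can also be used to spread offensive, harmful, and inflammatory content targeting individuals and groups. Such content is hard to detect because it is often satirical, subtle, and culturally specific. The problem is more acute for low-resource languages such as Bengali, since existing research concentrates on high-resource languages. To address this gap, the paper introduces Bn-HIB (Bangla Hate Inflammatory Benign), a dataset of 3,247 manually annotated Bengali memes, each labeled as Benign, Hate, or Inflammatory. Bn-HIB is intended as a foundation for further research on this task.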

🏷️ Related Tags

#Bengali #Memes #HateContent #InflammatoryContent #Dataset


