📄 English Summary
Detecting Hate and Inflammatory Content in Bengali Memes: A New Multimodal Dataset and Co-Attention Framework
Internet memes have become a prominent form of expression on social media, particularly within the Bengali-speaking community. Although often humorous, memes can also be used to spread offensive, harmful, and inflammatory content targeting individuals and groups. Detecting such content is difficult because it tends to be satirical, subtle, and culturally specific. The problem is more acute for low-resource languages such as Bengali, since existing research focuses predominantly on high-resource languages. To address this gap, the paper introduces Bn-HIB (Bangla Hate Inflammatory Benign), a new dataset of 3,247 manually annotated Bengali memes, each categorized as Benign, Hate, or Inflammatory. Bn-HIB provides a foundation for further research in this area.