Using Machine Learning to Enhance the Detection of Obfuscated Abusive Words in Swahili: A Focus on Child Safety
📄 English Summary
Using Machine Learning to Enhance the Detection of Obfuscated Abusive Words in Swahili: A Focus on Child Safety
The rise of digital technology has significantly increased the risk of cyberbullying and online abuse, particularly among children, making stronger detection and prevention measures necessary. This study focuses on detecting obfuscated abusive language in Swahili, a low-resource language whose limited linguistic resources and technological support make the task harder. Swahili was chosen because it is widely used in Africa, with over 16 million native speakers and more than 100 million total speakers across East Africa and parts of the Middle East. Several machine learning models, including Support Vector Machines (SVM), Logistic Regression, and Decision Trees, were used to improve the detection of obfuscated abusive language.
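As a rough illustration of how the three model families named in the summary could be applied to obfuscated text, here is a minimal sketch using scikit-learn. Everything in it is an assumption rather than the study's method: the summary does not describe features, preprocessing, or data, so the character n-gram TF-IDF features, the hyperparameters, and the ten hand-written toy sentences (with mild insults such as "mjinga") are purely illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Toy, hand-made examples (NOT the study's data). 1 = abusive, 0 = benign.
# Digits, symbols, and dots imitate simple obfuscation of mild insults.
texts = [
    "wewe ni mjinga", "wewe ni mj1nga", "mpumbavu wewe",
    "mpumb@vu wewe", "m.j.i.n.g.a kabisa",
    "habari ya asubuhi", "asante sana rafiki", "karibu nyumbani",
    "nakupenda mama", "twende shuleni",
]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]


def char_ngrams():
    # Assumption: character n-grams are used because they tend to survive
    # small spelling changes better than whole-word features.
    return TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))


def build_models():
    # One pipeline per model family mentioned in the summary.
    return {
        "svm": make_pipeline(char_ngrams(), LinearSVC()),
        "logistic_regression": make_pipeline(
            char_ngrams(), LogisticRegression(max_iter=1000)
        ),
        "decision_tree": make_pipeline(
            char_ngrams(), DecisionTreeClassifier(random_state=0)
        ),
    }


models = build_models()
for model in models.values():
    model.fit(texts, labels)

# Score an unseen obfuscated variant with each fitted model.
sample = "wewe ni mj!nga"
predictions = {name: int(m.predict([sample])[0]) for name, m in models.items()}
print(predictions)
```

With so little data the predictions carry no real meaning. The sketch only shows how the same feature extractor can feed each classifier, so that the models can be compared on equal terms.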