Building a Fairer Anti-Spam System: How We Handle Links, Warnings, and New Chats

📄 English Summary

We recently made three significant changes to ModerAI, our Telegram anti-spam bot, that fundamentally change how it handles edge cases. Most anti-spam bots make binary decisions: spam or not spam. This creates two failure modes: false positives, where legitimate users are banned merely for having a link in their bio, and false negatives, where spammers learn the rules and circumvent them. To address both, we implemented contextual bio link analysis: the system examines the context of each link in a user's bio, determines the link's category, and uses that category to decide more accurately whether to ban the user.
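To make the idea of contextual bio link analysis concrete, here is a minimal Python sketch. It is not ModerAI's actual implementation: the category names, the host and keyword lists, the three-way `Action` outcome, and the functions `categorize` and `decide` are all assumptions made for illustration. It only shows the general shape of categorizing a link by its surrounding text and then mapping the category to something other than a yes/no verdict.

```python
import re
from enum import Enum
from urllib.parse import urlparse


class LinkCategory(Enum):
    PERSONAL = "personal"        # e.g. a portfolio or code-hosting profile
    PROMOTIONAL = "promotional"  # surrounded by typical spam wording
    UNKNOWN = "unknown"          # not enough context to tell


class Action(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BAN = "ban"


# Hypothetical lists, for illustration only.
PERSONAL_HOSTS = {"github.com", "linkedin.com"}
PROMO_WORDS = {"earn", "profit", "casino", "giveaway"}

URL_RE = re.compile(r"https?://\S+")


def extract_urls(bio: str) -> list[str]:
    """Find URLs in a bio and drop trailing punctuation."""
    return [u.rstrip(".,!?)") for u in URL_RE.findall(bio)]


def categorize(url: str, bio: str) -> LinkCategory:
    """Categorize one link using the text around it, not the link alone."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Context = the bio text with the URL itself removed.
    context = set(re.findall(r"[a-z]+", bio.replace(url, " ").lower()))
    if context & PROMO_WORDS:
        return LinkCategory.PROMOTIONAL
    if host in PERSONAL_HOSTS:
        return LinkCategory.PERSONAL
    return LinkCategory.UNKNOWN


def decide(bio: str) -> Action:
    """Map link categories to an action (the mapping is an assumption)."""
    categories = {categorize(u, bio) for u in extract_urls(bio)}
    if LinkCategory.PROMOTIONAL in categories:
        return Action.BAN
    if LinkCategory.UNKNOWN in categories:
        return Action.WARN
    return Action.ALLOW  # no links, or only personal ones
```

With these assumed lists, `decide("Backend dev. Code at https://github.com/alice")` allows the user, while a bio that pairs a link with wording such as "earn" or "profit" leads to a ban. A link with no recognizable context falls into a middle tier instead of being forced into one of the two binary outcomes.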
