Multi-Sourced, Multi-Agent Evidence Retrieval for Fact-Checking


📄 English Summary

Multi-Sourced, Multi-Agent Evidence Retrieval for Fact-Checking

The spread of misinformation on the Internet poses a significant threat to both societies and individuals, necessitating robust and scalable fact-checking methods that rely on retrieving accurate and trustworthy evidence. Previous approaches have relied on semantic and social-contextual patterns learned from training data, which limits their generalization to new data distributions. Recently proposed Retrieval-Augmented Generation (RAG) methods leverage the reasoning capabilities of large language models (LLMs) by grounding them in retrieved evidence documents. However, these methods rely primarily on textual similarity for evidence retrieval and struggle to capture multi-hop semantic relations within rich document contents. These limitations result in suboptimal retrieval performance for complex evidence.
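To make the limitation concrete, below is a minimal, hypothetical sketch of the similarity-based retrieval the summary critiques: the claim and each candidate document are embedded, and the top-scoring documents by cosine similarity are returned. A toy bag-of-words embedding stands in for a dense neural encoder; the corpus, claim, and function names are illustrative, not from the paper. A single-hop claim that shares vocabulary with its evidence is retrieved easily, while a multi-hop claim whose bridging document shares few terms with the claim would score poorly under the same metric.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; real RAG systems use dense neural encoders.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(claim, corpus, k=2):
    # Rank all documents by similarity to the claim and return the top k.
    q = embed(claim)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

corpus = [
    "the mayor of riverton opened the new bridge in 2021",
    "riverton is the capital of the western province",
    "the western province reported record rainfall last year",
]

# Single-hop claim: high lexical overlap with its evidence document.
claim = "the mayor of riverton opened the new bridge"
top = retrieve(claim, corpus, k=1)
```

Verifying a multi-hop claim such as "the mayor of a provincial capital opened a bridge" would require chaining the first two documents, but neither scores highly against the claim in isolation, which is the retrieval gap the summary describes.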
