Ontology-Guided Neuro-Symbolic Inference: Grounding Language Models with Mathematical Domain Knowledge
📄 Summary
Language models exhibit fundamental limitations, such as hallucination, brittleness, and a lack of formal grounding, that are particularly problematic in high-stakes specialist fields requiring verifiable reasoning. This research investigates whether formal domain ontologies can enhance the reliability of language models through retrieval-augmented generation. Using mathematics as a proof of concept, a neuro-symbolic pipeline is implemented that leverages the OpenMath ontology with hybrid retrieval and cross-encoder reranking to inject relevant definitions into model prompts. Evaluation on the MATH benchmark with three open-source models reveals that ontology-guided context improves performance when retrieval quality is high, while irrelevant context actively degrades it.
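The pipeline described above can be sketched in miniature. Everything below is a toy stand-in, not the actual implementation: the ontology entries are invented examples, word overlap stands in for a sparse retriever such as BM25, and a character-trigram cosine stands in for both the dense embedding model and the cross-encoder reranker. The sketch only illustrates the shape of the flow: score definitions with two retrieval legs, blend, rerank, and prepend the survivors to the model prompt.

```python
import math
from collections import Counter

# Toy ontology fragment: concept -> definition. Invented entries; the actual
# pipeline draws its definitions from the OpenMath ontology.
ONTOLOGY = {
    "derivative": "The derivative of a function measures its instantaneous rate of change.",
    "integral": "The integral of a function measures accumulated area under its curve.",
    "prime number": "A prime number is a natural number greater than 1 whose only divisors are 1 and itself.",
}

def sparse_score(query: str, doc: str) -> float:
    """Word-overlap count, a stand-in for BM25 in the sparse retrieval leg."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return float(sum((q & d).values()))

def dense_score(query: str, doc: str) -> float:
    """Character-trigram cosine similarity, a stand-in for embedding retrieval."""
    def grams(s: str) -> Counter:
        s = s.lower()
        return Counter(s[i : i + 3] for i in range(len(s) - 2))
    a, b = grams(query), grams(doc)
    dot = sum(a[g] * b[g] for g in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def hybrid_retrieve(query: str, k: int = 2, alpha: float = 0.5):
    """Blend sparse and dense scores, keep the top-k candidate definitions."""
    scored = [
        (alpha * sparse_score(query, text) + (1 - alpha) * dense_score(query, text), name, text)
        for name, text in ONTOLOGY.items()
    ]
    return sorted(scored, reverse=True)[:k]

def rerank(query: str, candidates):
    """Re-score (query, definition) pairs jointly; stand-in for a cross-encoder."""
    return sorted(candidates, key=lambda c: dense_score(query, c[2]), reverse=True)

def build_prompt(question: str) -> str:
    """Inject the reranked definitions ahead of the question."""
    top = rerank(question, hybrid_retrieve(question))
    defs = "\n".join(f"- {name}: {text}" for _, name, text in top)
    return f"Relevant definitions:\n{defs}\n\nQuestion: {question}\nAnswer:"

prompt = build_prompt("What is the derivative of x^2?")
print(prompt)
```

In a real system the two retrieval legs would run against an indexed ontology and the reranker would be a learned cross-encoder scoring each (query, definition) pair jointly; the summary's finding that irrelevant context degrades performance is why the reranking stage matters.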