📄 中文摘要
定理证明是程序验证的核心技术,其中验证条件(VCs)的自动化证明仍然是主要瓶颈。在真实世界的程序验证场景中,经常遇到现有自动化定理证明器(ATPs)无法证明的困难VCs,这导致了大量的关键性手动证明工作,严重阻碍了实际应用。尽管神经定理证明(NTP)在数学竞赛等领域取得了显著成功,但在程序验证这一特定且复杂的领域中,其潜力尚未得到充分挖掘。当前NTP方法主要集中在相对简单的数学问题或形式化证明的早期阶段,缺乏针对程序验证中复杂VCs的有效策略。这些VCs通常包含复杂的量化、归纳推理、数据结构不变量以及特定领域的理论知识,需要深入的逻辑推理和高效的搜索策略。现有的NTP模型往往难以处理这些复杂性,其证明能力受限于训练数据的规模和多样性,以及模型架构对复杂逻辑关系的捕捉能力。因此,开发能够有效处理真实世界程序验证中困难VCs的NTP方法,并建立相应的基准测试数据集,对于推动程序验证的自动化和智能化至关重要。这不仅需要NTP模型具备更强的逻辑推理能力和领域知识整合能力,还需要探索新的训练范式和证明搜索策略,以克服现有ATPs的局限性,减少对人工证明的依赖,从而加速软件开发和保障软件质量。
📄 English Summary
Neural Theorem Proving for Verification Conditions: A Real-World Benchmark
Theorem proving is fundamental to program verification, where the automated proof of Verification Conditions (VCs) remains a primary bottleneck. Real-world program verification frequently encounters hard VCs that existing Automated Theorem Provers (ATPs) cannot prove, leading to a critical need for extensive manual proofs that burden practical application. While Neural Theorem Proving (NTP) has achieved significant success in mathematical competitions and other domains, its potential in the specific and complex field of program verification has not been fully explored. Current NTP methods primarily focus on relatively simple mathematical problems or early stages of formal proofs, lacking effective strategies for the complex VCs encountered in program verification. These VCs often involve intricate quantification, inductive reasoning, data structure invariants, and domain-specific theoretical knowledge, requiring deep logical inference and efficient search strategies. Existing NTP models often struggle with these complexities, with their proof capabilities limited by the scale and diversity of training data, as well as the model architecture's ability to capture complex logical relationships. Therefore, developing NTP methods capable of effectively handling difficult VCs in real-world program verification, and establishing corresponding benchmark datasets, is crucial for advancing the automation and intelligence of program verification. This necessitates not only stronger logical reasoning capabilities and domain knowledge integration in NTP models but also the exploration of new training paradigms and proof search strategies to overcome the limitations of existing ATPs, reduce reliance on manual proofs, and thereby accelerate software development and ensure software quality.