📄 English Summary
MathMixup: Boosting LLM Mathematical Reasoning with Difficulty-Controllable Data Synthesis and Curriculum Learning
The advancement of Large Language Models (LLMs) on mathematical reasoning tasks depends heavily on high-quality training data with clearly defined, well-graded difficulty levels. However, existing data synthesis methods often suffer from limited diversity and imprecise control over problem difficulty, which makes them a poor fit for efficient training paradigms such as curriculum learning. To address these challenges, we propose MathMixup, a data synthesis framework designed to generate highly diverse mathematical reasoning training data under fine-grained difficulty control. MathMixup introduces a set of configurable difficulty parameters that let researchers and developers tailor data complexity to specific needs; these parameters span several dimensions, including the structural complexity of a problem, the number of mathematical concepts involved, the depth of reasoning required, and the numerical ranges used. The framework couples either generative adversarial networks (GANs) or rule-based template systems with symbolic reasoning engines, ensuring that generated problems are not only syntactically correct but also semantically challenging. By dynamically adjusting the difficulty parameters, MathMixup produces a data stream that transitions smoothly from simple to complex, which is essential for curriculum learning: the model first acquires basic knowledge and reasoning patterns from relatively simple problems and then progresses gradually to harder ones, avoiding the local optima and training instability that overly difficult problems can cause in the early phases of training.
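To make the idea of difficulty parameters concrete, here is a minimal sketch of template-based, difficulty-parameterized problem synthesis. The parameter names (`num_concepts`, `reasoning_depth`, `value_range`) and the arithmetic-chain template are illustrative assumptions, not the paper's actual API; a real system would draw on richer templates and a full symbolic engine.

```python
# Hypothetical sketch of MathMixup-style difficulty-parameterized synthesis.
# Parameter names and the template are illustrative assumptions.
import random
from dataclasses import dataclass

@dataclass
class DifficultyConfig:
    num_concepts: int      # how many distinct operations/concepts to combine
    reasoning_depth: int   # number of chained reasoning steps
    value_range: int       # magnitude of the numbers involved

def synthesize_problem(cfg: DifficultyConfig, rng: random.Random):
    """Compose a chain of arithmetic steps whose length and operand size
    are set by the difficulty parameters; return (question, answer)."""
    ops = [("plus", lambda a, b: a + b),
           ("minus", lambda a, b: a - b),
           ("times", lambda a, b: a * b)][: max(1, cfg.num_concepts)]
    value = rng.randint(1, cfg.value_range)
    steps = [f"Start with {value}."]
    for _ in range(cfg.reasoning_depth):
        name, fn = rng.choice(ops)
        operand = rng.randint(1, cfg.value_range)
        value = fn(value, operand)
        steps.append(f"Then compute the result {name} {operand}.")
    question = " ".join(steps) + " What is the final value?"
    return question, value  # the computed value serves as ground truth

rng = random.Random(0)
easy = synthesize_problem(DifficultyConfig(1, 2, 10), rng)
hard = synthesize_problem(DifficultyConfig(3, 6, 1000), rng)
```

Because each problem is generated alongside its answer, correctness is guaranteed by construction, and sweeping the config from low to high values yields the smooth easy-to-hard data stream described above.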
Experimental results show that LLMs trained on MathMixup-generated data achieve significant improvements over baselines trained only on static or randomly generated data across a range of mathematical reasoning benchmarks, with particularly strong gains in generalization and robustness on high-difficulty problems. This offers a new avenue for building more efficient and more robust mathematical reasoning LLMs.
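The curriculum-learning pacing described above can be sketched as a simple scheduler over difficulty-scored problems. The staged pool-growing schedule below is an illustrative assumption (one common pacing strategy), not the paper's specific algorithm.

```python
# Minimal curriculum scheduler: sort problems by a difficulty score and
# yield batches stage by stage, expanding the visible pool each stage so
# training sees an easy-to-hard progression. The pacing rule is assumed.
def curriculum_batches(problems, scores, num_stages=3, batch_size=2):
    ordered = [p for p, _ in sorted(zip(problems, scores), key=lambda t: t[1])]
    stage_size = max(1, len(ordered) // num_stages)
    for stage in range(num_stages):
        pool = ordered[: (stage + 1) * stage_size]  # grow the visible pool
        for i in range(0, len(pool), batch_size):
            yield stage, pool[i : i + batch_size]
```

Early stages revisit only the easiest problems, which is the mechanism the abstract credits with avoiding early-training instability; later stages mix in progressively harder material.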