Mathematical Reasoning in Large Language Models: Benchmarks, Architectures, Evaluation, and Open Challenges
This article reviews recent advances in mathematical reasoning in Large Language Models (LLMs). It analyzes datasets, architectures, and evaluation methods across approximately 120 studies, highlighting challenges and opportunities for improving the robustness and reliability of LLMs' reasoning capabilities. The findings are intended to guide future research in this field.