This study introduces MathEval, a comprehensive benchmarking framework designed to systematically evaluate the mathematical reasoning capabilities of large language models (LLMs). Addressing key ...
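The snippet does not describe MathEval's actual interface, but as a rough illustration of what benchmarking mathematical reasoning involves, the sketch below scores a model's answers against reference solutions by normalized exact match. Everything here is an assumption for illustration: query_model is a hypothetical stand-in for whatever LLM interface such a framework would wrap, and the toy problems are not drawn from MathEval.

```python
# Illustrative sketch only: not MathEval's API. Shows the generic shape of
# scoring an LLM's math answers by normalized exact match.
from fractions import Fraction

def query_model(prompt: str) -> str:
    # Placeholder for a real LLM call; returns canned answers for this sketch.
    return {"What is 12 * 34?": "408", "Simplify 6/8.": "3/4"}.get(prompt, "")

def normalize(answer: str) -> str:
    # Reduce simple numeric or fraction answers to a canonical form.
    try:
        return str(Fraction(answer.strip()))
    except (ValueError, ZeroDivisionError):
        return answer.strip().lower()

def evaluate(problems: list[tuple[str, str]]) -> float:
    # Exact-match accuracy after normalization, the simplest common scoring rule.
    correct = sum(
        normalize(query_model(question)) == normalize(gold)
        for question, gold in problems
    )
    return correct / len(problems)

if __name__ == "__main__":
    benchmark = [("What is 12 * 34?", "408"), ("Simplify 6/8.", "3/4")]
    print(f"accuracy = {evaluate(benchmark):.2f}")
```

Real frameworks add to this skeleton in two places: a robust answer extractor (models rarely emit a bare number) and per-category breakdowns, but the accuracy loop is the core idea.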
A new study digs into why modern AI models stumble over multi-digit multiplication and what kind of training finally makes ...
Researchers from the University of Edinburgh and NVIDIA have introduced a new method that helps large language models reason ...
Very small language models (SLMs) can ...
Joel David Hamkins, a leading mathematician and logic professor at the University of Notre Dame, has fired a withering salvo ...
Microsoft has announced Phi-4 — a new AI model with 14 billion parameters — designed for complex reasoning tasks, including mathematics. Phi-4 excels in areas such as STEM question-answering and ...
Microsoft has introduced a new set of small language models called Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning, which are described as "marking a new era for efficient AI." These ...
One of the world's leading mathematicians, Joel David Hamkins, has slammed AI models used for solving mathematics and called ...
In 2025, large language models moved beyond benchmarks to efficiency, reliability, and integration, reshaping how AI is ...