Large Language Models Math Reasoning

MathEval: a comprehensive benchmark for evaluating large language models on mathematical reasoning capabilities

This study introduces MathEval, a comprehensive benchmarking framework designed to systematically evaluate the mathematical reasoning capabilities of large language models (LLMs). Addressing key ...

16d

New memory structure helps AI models think longer and faster without using more power

Researchers from the University of Edinburgh and NVIDIA have introduced a new method that helps large language models reason ...

Dagens.com on MSN

Even the best AI models can’t reliably do simple math

A new study digs into why modern AI models stumble over multi-digit multiplication and what kind of training finally makes ...

VentureBeat

How test-time scaling unlocks hidden reasoning abilities in small language models (and allows them to outperform LLMs)

Very small language models (SLMs) can ...

Computerworld

Microsoft introduces Phi-4, an AI model for advanced reasoning tasks

Microsoft has announced Phi-4 — a new AI model with 14 billion parameters — designed for complex reasoning tasks, including mathematics. Phi-4 excels in areas such as STEM question-answering and ...

ExtremeTech

Microsoft's Phi-4-Reasoning Models Bring AI Math and Logic Skills to Smaller Devices

Microsoft has introduced a new set of small language models called Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning, which are described as "marking a new era for efficient AI." These ...

One of the world's biggest mathematicians, Joel David Hamkins, says AI models are basically zero help for mathematics as they produce…

Joel David Hamkins, a leading mathematician and logic professor at the University of Notre Dame, has fired a withering salvo ...

13d
