
NVIDIA Brings Reasoning Models to Consumers Ranging from 1.5B to 32B Parameters

Today, NVIDIA unveiled OpenReasoning-Nemotron, a quartet of distilled reasoning models with 1.5B, 7B, 14B, and 32B parameters, all derived from the 671B-parameter DeepSeek R1 0528. By compressing that massive teacher into four leaner Qwen-2.5-based students, NVIDIA is making advanced reasoning experiments accessible even on standard gaming rigs, without hefty GPU bills or cloud usage. The key is not some elaborate trick but raw data. Using the NeMo Skills pipeline, NVIDIA generated five million math, science, and code solutions, and then fine-tuned each student model on them purely with supervised learning. Already, the 32B model scores 89.2 on AIME24 and 73.8 on the HMMT February contest, while even the 1.5B variant manages a solid 55.5 and 31.5 on the same two benchmarks.

Nima Arkani-Hamed, Gopal Prasad Professor, School of Natural Sciences, Institute for Advanced Study

Beyond Space-Time and Quantum Mechanics.

Nima Arkani-Hamed.

(June 28, 2025)


A tribute to Jim Simons in celebration of the importance of basic science and mathematics.

Leaders in mathematics, science and philanthropy gathered on June 27, 2025, to remember the incredible contributions of Jim Simons and to inspire continued philanthropic support of basic research.

This Rope-Powered Robot Dog Built by a US Student Walks With Stunning Realism Thanks to a Brilliant Mathematical Design

IN A NUTSHELL
🐕 CARA is a robot dog created by a Purdue University student using innovative capstan drive technology.
🔧 The robot incorporates custom 3D-printed parts and high-strength materials like carbon fiber for durability and efficiency.
🤖 Advanced coding techniques such as Inverse Kinematics allow CARA to move with natural grace and agility.
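To give a sense of what inverse kinematics means for a legged robot, here is a minimal textbook sketch for a planar two-link leg. It is not CARA's actual code: the function names, the link lengths, and the two-joint geometry are all assumptions made for illustration. Given a desired foot position, it solves for the hip and knee angles with the law of cosines, and a forward-kinematics helper checks the answer.

```python
import math

def two_link_ik(x, y, l1, l2):
    """Return (hip, knee) angles in radians that place the foot at (x, y).

    Hypothetical planar leg: upper link of length l1, lower link of length l2,
    hip joint at the origin. Only one of the two mirror-image solutions
    (knee angle in [0, pi]) is returned.
    """
    d2 = x * x + y * y
    # Law of cosines gives the knee angle from the hip-to-foot distance.
    cos_knee = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if not -1.0 <= cos_knee <= 1.0:
        raise ValueError("target is out of reach for these link lengths")
    knee = math.acos(cos_knee)
    # Hip angle: direction to the foot minus the offset caused by the bent knee.
    hip = math.atan2(y, x) - math.atan2(l2 * math.sin(knee),
                                        l1 + l2 * math.cos(knee))
    return hip, knee

def forward(hip, knee, l1, l2):
    """Forward kinematics: foot position reached by the given joint angles."""
    x = l1 * math.cos(hip) + l2 * math.cos(hip + knee)
    y = l1 * math.sin(hip) + l2 * math.sin(hip + knee)
    return x, y

# Assumed link lengths in metres and an arbitrary reachable foot target.
hip, knee = two_link_ik(0.10, -0.20, 0.15, 0.15)
foot = forward(hip, knee, 0.15, 0.15)
```

A real quadruped would add a third joint per leg and a gait planner that feeds a stream of foot targets into a solver of this kind.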

Approach improves how new skills are taught to large language models

Researchers have developed a technique that significantly improves the performance of large language models without increasing the computational power necessary to fine-tune the models. The researchers demonstrated that their technique improves the performance of these models over previous techniques in tasks including commonsense reasoning, arithmetic reasoning, instruction following, code generation, and visual recognition.

Large language models are artificial intelligence systems that are pretrained on huge data sets. After pretraining, these models predict which words should follow each other in order to respond to user queries. However, because pretraining is not targeted at any particular task, there is ample room for improvement when user queries focus on specific topics, such as when a user asks the model to answer a math question or to write computer code.

“In order to improve a model’s ability to perform more specific tasks, you need to fine-tune the model,” says Tianfu Wu, co-corresponding author of a paper on the work and an associate professor of computer engineering at North Carolina State University.

What is the Church-Turing Thesis?

Modern-day computers have proved to be quite powerful in what they can do. The rise of AI has made things we previously only imagined possible. And the rate at which computers are increasing their computational power certainly makes it seem like we will be able to do almost anything with them. But as we’ve seen before, there are fundamental limits to what computers can do regardless of the processors or algorithms they use. This naturally leads us to ask what computers are capable of doing at their best and what their limits are. Answering that question requires formalizing what we mean by computation.

This is exactly what happened in the early 20th century. Logicians and mathematicians were trying to formalize the foundations of mathematics through logic. One famous challenge that came out of this effort was the Entscheidungsproblem ("decision problem"), posed by David Hilbert and Wilhelm Ackermann in 1928. The problem asked whether there exists an algorithm that can decide, for any statement of first-order logic, whether that statement follows from a given set of axioms. Such an algorithm could, for example, check whether a set of axioms is internally consistent by asking whether a contradiction follows from them. Kurt Gödel's incompleteness theorems of 1931 did not settle this question directly, but they showed that any consistent formal system expressive enough to describe arithmetic contains statements that can be neither proved nor disproved within it, which made a positive answer look unlikely.

In 1936, Alonzo Church and Alan Turing independently proved that no such algorithm exists. Turing did so by defining Turing machines (which he called automatic machines at the time) and showing that the halting problem is undecidable. Church did so using his lambda calculus. Turing also showed that Turing machines and lambda calculus are mathematically equivalent: they compute exactly the same class of functions. This led many mathematicians to conclude that computability could be defined by either of these systems, which is the basis of what is now called the Church-Turing thesis: every effectively calculable function is a computable function. In simpler terms, it states that any computation that can be carried out by a mechanical procedure in any model can also be carried out by a Turing machine, or equivalently in lambda calculus. To better understand the implications of the Church-Turing thesis, we need to explore the different kinds of computational machines.
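To make the equivalence a little more concrete, here is a minimal sketch in Python that computes the same function, the successor n + 1, in both styles. The first half is a tiny Turing machine simulator running a two-rule machine on a unary tape; the second half encodes numbers as Church numerals, the standard lambda-calculus representation. The names (`run_turing_machine`, `SUCCESSOR`, `to_church`, and so on) are made up for this illustration, and the example only shows the two models agreeing on one function; it is not a proof of equivalence.

```python
BLANK = "_"

def run_turing_machine(transitions, tape_input, start="scan", halt="halt",
                       max_steps=10_000):
    """Run a one-tape Turing machine and return the non-blank tape contents.

    transitions maps (state, symbol) -> (new_state, symbol_to_write, head_move).
    """
    tape = dict(enumerate(tape_input))  # sparse tape: position -> symbol
    state, head, steps = start, 0, 0
    while state != halt:
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted before halting")
        symbol = tape.get(head, BLANK)
        state, write, move = transitions[(state, symbol)]
        tape[head] = write
        head += move
        steps += 1
    return "".join(tape[i] for i in sorted(tape)).strip(BLANK)

# Unary successor: skip right over the 1s, write one more 1 on the first blank.
SUCCESSOR = {
    ("scan", "1"): ("scan", "1", +1),
    ("scan", BLANK): ("halt", "1", 0),
}

# Church numerals: the number n is "apply f to x, n times".
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))

def to_church(k):
    """Build the Church numeral for the non-negative integer k."""
    n = zero
    for _ in range(k):
        n = succ(n)
    return n

def from_church(n):
    """Convert a Church numeral back to a Python int by counting applications."""
    return n(lambda v: v + 1)(0)

# Both models compute the successor of 3.
tm_result = len(run_turing_machine(SUCCESSOR, "111"))
lambda_result = from_church(succ(to_church(3)))
```

Here `tm_result` and `lambda_result` are both 4. The machine manipulates symbols on a tape while the lambda term only applies functions, yet they describe the same computation, which is the kind of agreement the thesis generalizes to every effectively calculable function.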