Large Reasoning Models (LRMs) have advanced rapidly, showing strong performance on complex problem-solving tasks in domains such as mathematics, coding, and scientific reasoning. However, current evaluation…
REST: A Stress-Testing Framework for Evaluating Multi-Problem Reasoning in Large Reasoning Models
