Insights, research updates, and analysis from the GAUSS Team.
Comprehensive overview of the GAUSS benchmark framework, including our multidimensional evaluation approach and cognitive skill assessment methodology.
In-depth analysis of how human evaluators and LLM judges align when assessing mathematical problem-solving capabilities across different skill dimensions.