A Framework for Robust Cognitive Evaluation of LLMs

Introduces CognitivEval, a framework for robust evaluation of LLM cognition. It combines automatic prompt permutations with dual testing that collects both model generations and probability estimates. The framework is used to replicate five classic cognitive-science experiments and to profile state-of-the-art models, with the aim of supporting standardized, reproducible research.