Introduces CognitivEval, an evaluation framework for probing LLM cognition robustly through two features: automatic prompt permutations, and testing that draws on both model generations and model probabilities. The framework is used to replicate five classic cognitive-science experiments and to profile state-of-the-art models, with the aim of supporting standardized, reproducible research.
