
Divergent creativity in humans and large language models

Authors: Bellemare-Pepin A, Lespinasse F, Thölke P, Harel Y, Mathewson K, Olson JA, Bengio Y, Jerbi K


Affiliations

1 CoCo Lab, Psychology department, Université de Montréal, Montreal, QC, Canada.
2 Music department, Concordia University, Montreal, QC, Canada.
3 Sociology and Anthropology department, Concordia University, Montreal, QC, Canada.
4 Mila - Quebec Artificial Intelligence Institute, Montreal, QC, Canada.
5 Department of Psychology, University of Toronto Mississauga, Mississauga, ON, Canada.
6 Department of Computer Science and Operations Research, Université de Montréal, Montreal, QC, Canada.
7 CoCo Lab, Psychology department, Université de Montréal, Montreal, QC, Canada. karim.jerbi@umontreal.ca.
8 Mila - Quebec Artificial Intelligence Institute, Montreal, QC, Canada. karim.jerbi@umontreal.ca.
9 UNIQUE Center (Quebec Neuro-AI research Center), Montréal, QC, Canada. karim.jerbi@umontreal.ca.

Description

The recent surge of Large Language Models (LLMs) has led to claims that they are approaching a level of creativity akin to human capabilities. This idea has sparked a blend of excitement and apprehension. However, a critical piece that has been missing in this discourse is a systematic evaluation of LLMs' semantic diversity, particularly in comparison to human divergent thinking. To bridge this gap, we leverage recent advances in computational creativity to analyze semantic divergence in both state-of-the-art LLMs and a substantial dataset of 100,000 humans. These divergence-based measures index associative thinking (the ability to access and combine remote concepts in semantic space), an established facet of creative cognition. We benchmark performance on the Divergent Association Task (DAT) and across multiple creative-writing tasks (haiku, story synopses, and flash fiction), using identical, objective scoring. We found evidence that LLMs can surpass average human performance on the DAT and approach human creative-writing abilities, yet they remain below the mean creativity scores observed among the more creative segment of human participants. Notably, even the top-performing LLMs are still largely surpassed by the aggregated top half of human participants, underscoring a ceiling that current LLMs have yet to overcome. We also systematically varied linguistic strategy prompts and temperature, observing reliable gains in semantic divergence for several models. Our human-machine benchmarking framework addresses the polemic surrounding the imminent replacement of human creative labor by AI, disentangling the quality of the respective creative linguistic outputs using established objective measures. While prompting deeper exploration of the distinctive elements of human inventive thought compared to those of AI systems, we lay out a series of techniques to improve their outputs with respect to semantic diversity, such as prompt design and hyper-parameter tuning.
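The DAT scoring described above is, in essence, the average pairwise semantic distance among a set of unrelated words. A minimal sketch of that computation follows; the toy three-dimensional vectors and the word set are illustrative placeholders (the published DAT uses high-dimensional GloVe embeddings), but the scoring logic (mean pairwise cosine distance, scaled by 100) matches the standard formulation.

```python
from itertools import combinations
import math

# Toy word vectors standing in for real embeddings. These values are
# illustrative only; the actual DAT relies on pretrained GloVe vectors.
toy_vectors = {
    "cat":     [0.9, 0.1, 0.0],
    "dog":     [0.8, 0.2, 0.1],
    "quantum": [0.0, 0.9, 0.4],
    "violin":  [0.1, 0.3, 0.9],
}

def cosine_distance(u, v):
    """1 - cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def dat_score(words, vectors):
    """Mean pairwise cosine distance across all word pairs, scaled by 100.

    Higher scores indicate more semantically divergent word sets.
    """
    pairs = list(combinations(words, 2))
    total = sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs)
    return 100.0 * total / len(pairs)

print(dat_score(["cat", "dog", "quantum", "violin"], toy_vectors))
```

With these toy vectors, the near-synonymous pair ("cat", "dog") contributes a small distance while remote pairs ("cat", "quantum") contribute large ones, so swapping related words for unrelated ones raises the score, which is exactly the divergence the task rewards.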


Keywords: Artificial intelligence; Creativity; LLM; Semantics


Links

PubMed: https://pubmed.ncbi.nlm.nih.gov/41565675/

DOI: 10.1038/s41598-025-25157-3