Is that AI platform more creative than you? UTM postdoc helps dig up answers

Scrabble tiles on a table, with the word "AI" spelled out in the centre.

It’s a simple, quick game: name 10 words that are vastly different from each other. 

Those who are highly creative choose words that are further apart in meaning – like galaxy, velvet, hurricane – while those with average creativity pick words that are more closely associated with one another – like cat, dog, hamster.

The exercise is the basis of the Divergent Association Task – a tool that measures a person’s ability to generate many different ideas, also known as divergent thinking. It was developed by Jay Olson, a postdoctoral fellow in the department of psychological and brain sciences at the University of Toronto Mississauga. 

Recently, Olson and his colleagues at the Université de Montréal used the exercise to compare the creativity of large language models (LLMs), like ChatGPT, with that of human participants. When put to the tool’s test, the researchers found that LLMs’ creativity surpassed that of the average person.

But when compared to highly creative people, LLMs fell short. 

“These companies all make claims about how this new model is more creative than the last one, or we have the most creative model, but there’s no robust metric for assessing that,” says Olson. “We thought this task might be one that could be used (to measure LLMs’ creativity).”

The exercise was used in recently published research – titled Divergent Creativity in Humans and Large Language Models – that was the largest study to date comparing the creativity of humans and LLMs.

The Divergent Association Task was the foundation of the study because previous research found that performance on the exercise correlates with performance on standard creativity tasks, like writing and problem solving. 

As part of the study, researchers repeatedly asked each of the LLMs, such as ChatGPT4 and Gemini, to complete the task and then compared the results with samples from 100,000 participants. 

The researchers quantified the “semantic distance” between the words to determine the LLMs’ and participants’ creativity level.

“Words like cat and dog are very close to each other, so the distance would be smaller – whereas cat and thimble would be further apart,” says Olson. “All the task is doing is taking the average semantic distance of the named words.”
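The averaging Olson describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's scoring code: the three-dimensional word vectors are invented for the example, and cosine distance is assumed here as the measure of semantic distance because the article does not say which measure the researchers used.

```python
import math
from itertools import combinations

# Hypothetical toy word vectors, invented purely for illustration.
# A real scorer would look words up in a pretrained embedding model.
TOY_VECTORS = {
    "cat": (0.9, 0.1, 0.0),
    "dog": (0.8, 0.2, 0.0),
    "hamster": (0.85, 0.15, 0.05),
    "thimble": (0.0, 0.2, 0.9),
    "galaxy": (0.1, 0.9, 0.1),
}


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def average_semantic_distance(words, vectors=TOY_VECTORS):
    """Average the distance over every unordered pair of the named words."""
    pairs = list(combinations(words, 2))
    if not pairs:
        raise ValueError("need at least two words to compare")
    total = sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs)
    return total / len(pairs)


close_words = average_semantic_distance(["cat", "dog", "hamster"])
distant_words = average_semantic_distance(["cat", "thimble", "galaxy"])
print(f"cat/dog/hamster:    {close_words:.3f}")
print(f"cat/thimble/galaxy: {distant_words:.3f}")
```

With these made-up vectors, the closely related list scores lower than the unrelated one, which mirrors the cat-and-dog versus cat-and-thimble contrast in the quote.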

He adds that the human participants and the AI platforms were given the same instructions, and the research team computed their scores in exactly the same way.

The study ultimately found that LLMs outperformed those with average human creativity. But highly creative humans surpassed LLMs, with gaps widening in the top 25 per cent of participants and further widening in the top 10 per cent of humans.

“There’s been quite a few studies now that have tested this with different models. This one is much more diverse with a much larger human sample,” says Olson.

The study suggests LLMs may be particularly helpful for less creative people when generating ideas, but they may be less helpful for more creative people, Olson says.

The paper’s findings also raise questions about whether LLMs help or hinder highly creative people who work in creative fields, says Olson.

“If people who are highly creative use these kinds of models, are they going to be generating less creative ideas?” Olson says. 

“These models seem creative when you work with them, but there’s a big chunk of people that can outperform them on this task. Maybe our creative thinking isn’t something we should be offloading onto these models.”

He adds that the study also reveals the rapid pace of AI development, showing that newer models outperformed their earlier versions.

While Olson says it’s hard to predict, he thinks LLMs could potentially continue to increase in creativity as new models are developed – but that might ultimately level out. 

“There is speculation that the models have already reached either a plateau or slowing of growth, so I guess we will see what happens,” he says. “It’s a field where things change very rapidly.”