Creativity Benchmark

No clear model winner.

Model rankings shift by brand, culture, and task.

LLMs aren’t good judges.

Machines showed a strong bias toward reasoning models and were 5–10× more confident than humans.

Variance matters.

Some models deliver broader creative spread than others.

Best practice

Use models for volume, humans for selection.

"For those of us still grappling with our professional identity in an AI era, it’s reassuring to know that ultimately creativity is still, in large part, a human pursuit. While LLMs can help us get creative results faster, we are still the directors in this human-AI collaboration process."

Lieu Thi Pham, Informa TechTarget