
Studies Find AI Matches Average Human Creativity but Trails Top Performers, and Readers Penalize AI Labels

Peer-reviewed results quantify model performance on creativity tests, with a documented disclosure penalty in reader evaluations.

Overview

  • A Journal of Experimental Psychology: General paper aggregating 16 experiments with roughly 27,000 participants found that disclosing AI involvement in creative writing reduced readers' evaluations of the work by an average of 6.2%, and attempts to mitigate the bias did not reliably work.
  • The disclosure findings carry practical implications for creators and platforms as U.S. lawmakers weigh AI disclosure requirements that could unintentionally depress audience appreciation of AI-assisted work.
  • A Scientific Reports study benchmarking leading language models against 100,000 human participants on the Divergent Association Task reported that top systems such as GPT-4 can meet or exceed the human average but remain below higher-performing human groups, especially the top 10%.
  • Model creativity scores improved with configuration changes such as a higher sampling temperature and targeted prompting (a minimal sketch follows this list), yet these gains did not close the gap with top human performers, and human-written texts retained an edge in several writing evaluations.
  • Separate research in Patterns showed that autonomous text-to-image-to-text loops converged, even without retraining, on generic themes dubbed "visual elevator music," highlighting a homogenization risk that researchers say may require deliberate design choices and human collaboration to preserve variety.
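
For readers curious what "higher temperature" and Divergent Association Task (DAT) scoring look like in practice, the minimal sketch below queries a model for ten maximally unrelated nouns at a raised sampling temperature and scores the response as the mean pairwise cosine distance between word embeddings. It is an illustration, not the studies' code: the model name, prompt wording, and the substitution of a sentence-transformers embedder for the GloVe vectors used in the published DAT are assumptions made here for a self-contained example.

# Minimal sketch (not the studies' code): sample DAT-style responses from a
# model at a chosen temperature and score them by mean pairwise cosine distance.
from itertools import combinations

import numpy as np
from openai import OpenAI                      # assumes OPENAI_API_KEY is set
from sentence_transformers import SentenceTransformer

DAT_PROMPT = (
    "Name 10 nouns that are as different from each other as possible, "
    "in all meanings and uses. One word per line, no explanations."
)

def sample_words(temperature: float = 1.2) -> list[str]:
    """Ask the model for 10 maximally unrelated nouns at the given temperature."""
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4",                          # illustrative model choice
        messages=[{"role": "user", "content": DAT_PROMPT}],
        temperature=temperature,                # higher values broaden sampling
    )
    lines = resp.choices[0].message.content.splitlines()
    return [w.strip().lower() for w in lines if w.strip()][:10]

def dat_score(words: list[str]) -> float:
    """Mean pairwise cosine distance (x100), the DAT's creativity proxy."""
    embedder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for GloVe
    vecs = embedder.encode(words, normalize_embeddings=True)
    dists = [1.0 - float(np.dot(vecs[i], vecs[j]))
             for i, j in combinations(range(len(words)), 2)]
    return 100.0 * float(np.mean(dists))

if __name__ == "__main__":
    words = sample_words(temperature=1.2)
    print(words, dat_score(words))

Raising the temperature broadens the model's sampling distribution over tokens, which tends to increase the spread of the nouns it returns; the DAT score simply rewards that spread, which is why such configuration changes can lift scores without necessarily matching top human performers.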