Particle.news

AI Models GPT-4.5 and LLaMa-3.1 Achieve Milestone in Rigorous Turing Test

OpenAI's GPT-4.5 was judged to be human 73% of the time, more often than the actual human participants, while Meta's LLaMa-3.1 was judged human 56% of the time. The results raise questions about the Turing Test's relevance and its societal implications.

  • Researchers at UC San Diego conducted a three-party Turing Test in which each participant conversed with both an AI and a human counterpart, then judged which was the human.
  • GPT-4.5, when prompted with a specific persona, was judged to be human 73% of the time, outperforming actual human participants; LLaMa-3.1 achieved a 56% success rate under similar conditions.
  • The study highlights the importance of persona prompts in enhancing AI's ability to mimic human behavior, with models performing significantly worse without such guidance.
  • Critics argue that the Turing Test measures conversational mimicry rather than true intelligence, and that these models lack comprehension and consciousness.
  • The findings, published as a preprint awaiting peer review, have sparked concerns about societal impacts, including job automation, social engineering risks, and ethical challenges.