People cannot distinguish GPT-4 from a human in a Turing test

  • University of California at San Diego

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

13 Scopus citations

Abstract

AI systems that can fool people into thinking that they are human could have widespread social and economic consequences. In order to measure this ability, we evaluated 3 systems (ELIZA, GPT-3.5 and GPT-4) in a randomized, controlled, and preregistered Turing test. Human participants had a 5-minute conversation with either a human or an AI, and judged whether or not they thought their interlocutor was human. GPT-4 was judged to be a human 54% of the time, significantly more often than ELIZA (22%) but less often than actual humans (67%). In order to test the generalizability of our results, we replicated the study on a second population (undergraduate students) and found that the same prompt with GPT-4o achieved a pass rate of 77%, slightly higher than the human pass rate of 71%. On some interpretations, the results provide the first robust empirical demonstration that any artificial system passes an interactive 2-player Turing test. The results have implications for debates around machine intelligence and, more urgently, suggest that deception by current AI systems may go undetected. Analysis of participants' strategies and reasoning suggests that stylistic and socio-emotional factors play a larger role in passing the Turing test than traditional notions of intelligence. We release the full transcripts of the replication data to enable further investigation of human-AI interaction dynamics and deception.

Original language: English
Title of host publication: ACM FAccT 2025 - Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency
Publisher: Association for Computing Machinery, Inc
Pages: 1615-1639
Number of pages: 25
ISBN (Electronic): 9798400714825
DOIs
State: Published - Jun 23 2025
Event: 8th Annual ACM Conference on Fairness, Accountability, and Transparency, FAccT 2025 - Athens, Greece
Duration: Jun 23 2025 to Jun 26 2025

Publication series

Name: ACM FAccT 2025 - Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency

Conference

Conference: 8th Annual ACM Conference on Fairness, Accountability, and Transparency, FAccT 2025
Country/Territory: Greece
City: Athens
Period: 06/23/25 to 06/26/25

Keywords

  • Turing test
  • deception
  • human-AI interaction
  • interactive evaluation
  • large language models
  • sociotechnical safety
