
Anthropic co-founder Ben Mann says true "transformative AI" will arrive only after systems ace what he calls the "economic Turing test."
What Happened: Mann, in a recent appearance on the ‘No Priors’ podcast, defined the “economic Turing test” as a workplace trial that forces hiring managers to choose between a month-long contractor and an AI agent for the same job.
Passing the test would mark "when things start to get really interesting from a societal and cultural standpoint," Mann noted.
Mann's yardstick swaps laboratory benchmarks for a market basket covering "50% of economically valuable tasks." Each human supervisor would "hire an agent" to perform the work. If, at the end of the month, the manager prefers the machine, "then it passed," he said.
Mann does warn that the exercise has its limitations. "Interviews are only a poor approximation of real-world job performance," he observed, dismissing current testing measures as limited and too theoretical.
Anthropic has already run its Claude models through internal interviews and found them "extremely good," though Mann conceded that the formal trial "hasn't started" and might not come until after the firm's next release cycle. He pegged 2028 as a "quite possible" window for artificial general intelligence but cautioned that precise timelines remain guesswork.
Why It Matters: Mann’s “economic Turing test” builds on the Turing test, a method of inquiry in artificial intelligence for determining whether a computer is capable of thinking like a human being.
OpenAI’s GPT-4 became the first large language model to pass a two-player Turing test, fooling human conversation partners 54% of the time in July 2024. GPT-4.5 achieved a 73% success rate in a more formal test in March of this year. However, critics have long questioned whether the test accurately measures the true intelligence of machines.
A new Wharton study also found that large language models now create memes humans rate as funnier than the average person's, effectively passing the "meme Turing Test."
Anthropic's momentum is accelerating. A March Series E round pushed its valuation to $61.5 billion, positioning the startup, which is backed by Amazon.com Inc. (NASDAQ:AMZN) and Alphabet Inc. (NASDAQ:GOOGL) (NASDAQ:GOOG), as OpenAI's fiercest privately held rival.
Image via Shutterstock