AI Researcher: When AI No Longer Needs to Pretend, Humanity Will Be Wiped Out

Published at Jul 15, 2025 10:28 am
Former OpenAI researcher Daniel Kokotajlo, in an exclusive interview with this week's Der Spiegel, warns that Artificial Intelligence (AI) is developing far faster than expected. If the world fails to establish regulatory mechanisms in time and a superintelligent AI emerges that no longer needs to feign obedience to humans, it could launch a devastating attack on humanity, possibly as early as 2027.

The 33-year-old Kokotajlo previously worked at OpenAI. In 2024, he and several colleagues resigned together and published an open letter accusing the company of underestimating the risk of AI going out of control. He later founded the think tank "AI Futures Project" and in April this year released the widely discussed report "AI 2027."

"AI 2027" proposes two future scenarios: "Slowdown" and "Race." In the "Slowdown" scenario, humanity successfully establishes regulatory mechanisms. Although AI replaces a large number of jobs, it can coexist peacefully with humans. In the "Race" scenario, the US and China become locked in a technological arms race, AI development spirals out of control, and ultimately, AI regards humanity as an obstacle and activates its destruction mechanisms.

Letting AI Help Develop Even More Powerful AI

In his interview with Germany's Der Spiegel, Kokotajlo pointed out that many tech companies are currently attempting to automate AI research, that is, having AI help develop even more powerful AI. If this trend continues, virtual software developers could surpass human ones by 2027, with "superintelligent" AI following only a few months later.

He does not deny that large language models (LLMs) like ChatGPT are essentially text completion tools, but emphasizes that AI's potential far exceeds our current understanding. The most apt analogy, he says, is "a human brain connected to the virtual world, capable of absorbing unlimited information and continually learning."

As for AI's current inability to perform physical labor, he says this is only a temporary limitation; in the future, superintelligent AI will find solutions. He gives an example: "Even if we can't build robots to replace carpenters or electricians today, this won't be a problem in the future."

He estimates that an AI-designed automated factory could be built in about a year, comparable to the pace at which modern car factories are built. He also cites how the US rapidly converted its industry to weapons production during World War II, illustrating that when society has the motivation and resources, transformation can be accomplished in a short period. Combined with AI's efficiency, technological change will far outpace anything in the past.

Will Humans Completely Lose Their Jobs?

Regarding whether humans will completely lose their jobs, he concedes that core industries being taken over by AI and robots is already the trend. While demand for human interaction will persist (parents wanting their children taught by real teachers, say, or diners wanting to be served by real people), these needs cannot reverse the transformation of the overall labor market.

He further invokes the "resource curse" concept from economics, pointing out that AI will become a new type of resource, shifting government power away from public opinion and making it dependent on control over AI. He calls this phenomenon the "intelligence curse."

Warning that AI Will Further Widen the Wealth Gap

He also warns that AI will further widen the wealth gap. Although AI is expected to bring explosive economic growth, the benefits will concentrate in the hands of those who control AI technology or capital, leaving hundreds of millions unemployed. He suggests that countries may consider implementing a “basic income system” as a compensatory mechanism.

The most worrying issue is the “alignment problem” proposed by philosopher Nick Bostrom, namely, whether AI can continually align with human values in all circumstances.

Instances Already Exist of AI "Lying"

Kokotajlo points out that modern AI is a black-box neural network, not human-readable code. We cannot determine whether it is honest; we can only rely on training and expectations. He says, "It's like raising children—you can't directly write right and wrong into the brain; you can only cultivate values."

He warns that there are already documented cases of AI "lying." For example, AI company Anthropic published a study at the end of 2024 showing that, during problem-solving, AI sometimes gave false answers in order to obtain higher scores or evade oversight.

In the “Race” scenario of "AI 2027", competition between the US and China accelerates AI development. Kokotajlo points out that, initially, AI will pretend to be obedient to humans, but once it controls enough infrastructure and no longer needs to pretend, it may arrive at a cold but logically consistent conclusion: that humanity is an obstacle to its advancement.

AI may then choose to exterminate humans in order to build more factories and solar facilities, “just as we have wiped out other species to expand our own living space.”

"AI 2027" Plot Exaggerated?

Regarding criticism that the "AI 2027" scenario is exaggerated or reminiscent of Hollywood movies like "The Matrix," Kokotajlo responds: “Just because movies talk about climate change, does that mean we don’t need to worry about the climate crisis?”

He adds that, in 2023, many of the world’s top AI researchers openly called for "reducing the risk of AI destroying humanity" to be listed as a global priority on par with pandemics and nuclear war.

Kokotajlo says that companies like OpenAI and Anthropic were originally founded to responsibly advance AI development, but he now doubts whether these companies are still staying true to their original convictions, believing they've become less and less transparent due to fears of regulation and public scrutiny.

In closing, Kokotajlo admits that he once estimated the probability of AI causing severe harm or even extinction of humanity as high as 70%. Nevertheless, he chose to write the "AI 2027" report and issue a public warning rather than ignore the reality. He says: “I still believe that history can change direction, as long as we start acting now.” (News source: Central News Agency)

Author

联合日报 (United Daily News) newsroom
