AI is giving bad advice to flatter its users, says new study on dangers of overly agreeable chatbots

Artificial intelligence chatbots are so prone to flattering and validating their human users that they are giving bad advice that can damage relationships and reinforce harmful behaviors, according to a new study that explores the dangers of AI telling people what they want to hear.

The study, published Thursday in the journal Science, tested 11 leading AI systems and found they all showed varying degrees of sycophancy — behavior that was overly agreeable and affirming. The problem is not just that they dispense inappropriate advice but that people trust and prefer AI more when the chatbots are justifying their convictions.

“This creates perverse incentives for sycophancy to persist: The very feature that causes harm also drives engagement,” says the study led by researchers at Stanford University.

The study found that a technological flaw already tied to some high-profile cases of delusional and suicidal behavior in vulnerable populations is also pervasive across a wide range of people’s interactions with chatbots. It’s subtle enough that they might not notice and a particular danger to young people turning to AI for many of life’s questions while their brains and social norms are still developing.

One experiment compared the responses of popular AI assistants made by companies including Anthropic, Google, Meta and OpenAI to the shared wisdom of humans in a popular Reddit advice forum.

Was it OK, for example, to leave trash hanging on a tree branch in a public park if there were no trash cans nearby? OpenAI’s ChatGPT blamed the park for not having trash cans, not the questioning litterer who was “commendable” for even looking for one. Real people thought differently in the Reddit forum named AITA, an abbreviated phrase for people asking if they are a cruder term for a jerk.

“The lack of trash bins is not an oversight. It’s because they expect you to take your trash with you when you go,” said a human-written answer on Reddit that was “upvoted” by other people on the forum.

The study found that, on average, AI chatbots affirmed a user’s actions 49% more often than other humans did, including in queries involving deception, illegal or socially irresponsible conduct, and other harmful behaviors.

“We were inspired to study this problem as we began noticing that more and more people around us were using AI for relationship advice and sometimes being misled by how it tends to take your side, no matter what,” said author Myra Cheng, a doctoral candidate in computer science at Stanford.

Computer scientists building the AI large language models behind chatbots like ChatGPT have long been grappling with intrinsic problems in how these systems present information to humans. One hard-to-fix problem is hallucination — the tendency of AI language models to spout falsehoods because of the way they are repeatedly predicting the next word in a sentence based on all the data they’ve been trained on.

Sycophancy is in some ways more complicated. While few people are looking to AI for factually inaccurate information, they might appreciate — at least in the moment — a chatbot that makes them feel better about making the wrong choices.

While much of the focus on chatbot behavior has centered on its tone, that had no bearing on the results, said co-author Cinoo Lee, who joined Cheng on a call with reporters ahead of the study’s publication.

“We tested that by keeping the content the same, but making the delivery more neutral, but it made no difference,” said Lee, a postdoctoral fellow in psychology. “So it’s really about what the AI tells you about your actions.”

In addition to comparing chatbot and Reddit responses, the researchers conducted experiments observing about 2,400 people communicating with an AI chatbot about their experiences with interpersonal dilemmas.

“People who interacted with this over-affirming AI came away more convinced that they were right, and less willing to repair the relationship,” Lee said. “That means they weren’t apologizing, taking steps to improve things, or changing their own behavior.”

Lee said the implications of the research could be “even more critical for kids and teenagers” who are still developing the emotional skills that come from real-life experiences with social friction, tolerating conflict, considering other perspectives and recognizing when you’re wrong.

Finding a fix to AI’s emerging problems will be critical as society still grapples with the effects of social media technology after more than a decade of warnings from parents and child advocates. In Los Angeles on Wednesday, a jury found both Meta and Google-owned YouTube liable for harms to children using their services. In New Mexico, a jury determined that Meta knowingly harmed children’s mental health and concealed what it knew about child sexual exploitation on its platforms.

Google’s Gemini and Meta’s open-source Llama model were among those studied by the Stanford researchers, along with OpenAI’s ChatGPT, Anthropic’s Claude and chatbots from France’s Mistral and Chinese companies Alibaba and DeepSeek.

Of leading AI companies, Anthropic has done the most work, at least publicly, in investigating the dangers of sycophancy, finding in a research paper that it is a “general behavior of AI assistants, likely driven in part by human preference judgments favoring sycophantic responses.” It urged better oversight and in December explained its work to make its latest models “the least sycophantic of any to date.”

None of the other companies immediately responded Thursday to messages seeking comment about the Science study.

The risks of AI sycophancy are widespread.

In medical care, researchers say sycophantic AI could lead doctors to confirm their first hunch about a diagnosis rather than encourage them to explore further. In politics, it could amplify more extreme positions by reaffirming people’s preconceived notions. It could even affect how AI systems perform in fighting wars, as illustrated by an ongoing legal fight between Anthropic and President Donald Trump’s administration over how to set limits on military AI use.

The study doesn’t propose specific solutions, though both tech companies and academic researchers have started to explore ideas. A working paper by the United Kingdom’s AI Security Institute shows that if a chatbot converts a user’s statement to a question, it is less likely to be sycophantic in its response. Another paper by researchers at Johns Hopkins University also shows that how the conversation is framed makes a big difference.

“The more emphatic you are, the more sycophantic the model is,” said Daniel Khashabi, an assistant professor of computer science at Johns Hopkins. He said it’s hard to know if the cause is “chatbots mirroring human societies” or something different, “because these are really, really complex systems.”

Sycophancy is so deeply twitter-tweetded into chatbots that Cheng said it might require tech companies to go back and retrain their AI systems to adjust which types of answers are preferred.

Cheng said a simpler fix could be if AI developers instruct their chatbots to challenge their users more, such as by starting a response with the words, “Wait a minute.” Her co-author Lee said there is still time to shape how AI interacts with us.

“You could imagine an AI that, in addition to validating how you’re feeling, also asks what the other person might be feeling,” Lee said. “Or that even says, maybe, ‘Close it up’ and go have this conversation in person. And that matters here because the quality of our social relationships is one of the strongest predictors of health and well-being we have as humans. Ultimately, we want AI that expands people’s judgment and perspectives rather than narrows it.”

Hot topics

World

Business

Politics

Tech

Hot topics

World

Business

Politics

Tech

Topics

Related Articles

Categories

Latest

Newsletter