The AIs are trying too hard to be your friend
Meta AI, ChatGPT, and the dangers of being glazed and confused

This is a column about AI. My boyfriend works at Anthropic. See my full ethics disclosure here.
Today let’s talk about the emerging tendency of chatbots to go overboard in telling their users what they want to hear — and why it may bode poorly for efforts to build systems that consistently tell the truth.
I.
Tuesday marked the inaugural LlamaCon, a developer conference organized by Meta to promote its latest open-weight Llama models and open AI development in general. The company used the occasion to announce the release of Meta AI, a chatbot app for iOS and Android intended to rival ChatGPT.
As with other chatbots, you can ask Meta AI about most topics over text or voice, or generate images. But the app also includes a twist. Here’s Alex Heath at The Verge:
The biggest new idea in the Meta AI app is its Discover feed, which adds an AI twist to social media. Here, you’ll see a feed of interactions with Meta AI that other people, including your friends on Instagram and Facebook, have opted to share on a prompt-by-prompt basis.
You can like, comment on, share, or remix these shared AI posts into your own. The idea is to demystify AI and show “people what they can do with it,” Meta’s VP of product, Connor Hayes, tells me.
It’s clear there is at least some appetite for AI-generated material on social networks. OpenAI’s latest text-to-image model took over social media with people transforming their photos into Studio Ghibli-style anime; before that, Facebook was flooded with viral images of Shrimp Jesus and other impossible creations.
The Studio Ghibli fad appears to have generated significant new interest in ChatGPT, which had to temporarily throttle image creation in the wake of its success. The company has reportedly been building social features for ChatGPT similar to what Meta announced today, likely for the same reason: sharing on social networks drives downloads, usage, and (eventually) revenue.
Growing usage and revenue is the prime directive for most companies. (OpenAI continues to be owned by a nonprofit, but is working hard to convert into a for-profit public benefit company.) For social networks, this can have pernicious effects: the same recommendation algorithms that drive usage and revenue also create feelings of addiction and various harms associated with overuse, including depression and other mental health issues.
As I wrote here yesterday, chatbots threaten to make this dynamic even more dangerous. The recommendation algorithms of today serve you content made by other people; Instagram and TikTok have only limited signals by which they can guess what you might be interested in seeing.
Chatbots are designed differently. They ask you about your life, your relationships, your feelings, and feed that information back to you in a way that simulates a feeling of understanding. They have rudimentary memory features that allow them to recall your occupation, or your boyfriend’s name, or the issue you were complaining about last year.
Broadly speaking, the better they do at this, the more that people like them. In 2023, Anthropic published a paper showing that models generally tend toward sycophancy, due to the way that they are trained. Reinforcement learning with human feedback is a process by which models learn how to answer queries based on which responses users prefer most, and users mostly prefer flattery.
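Mechanically, the dynamic Anthropic described falls out of the reward-modeling step: raters are shown pairs of responses and pick a winner, and the reward model is trained to score the winner higher. Here is a minimal sketch of the standard pairwise (Bradley-Terry) preference loss, with toy numbers for illustration, not any lab's actual code:

```python
import math

def preference_loss(score_chosen: float, score_rejected: float) -> float:
    """Pairwise (Bradley-Terry) loss used in RLHF reward modeling.

    The loss is small when the reward model scores the human-preferred
    response above the rejected one, and large when it doesn't.
    """
    margin = score_chosen - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Toy scores a reward model might assign to two candidate replies.
flattering, neutral = 2.0, 0.5

# If raters prefer the flattering reply, the model is already "right"
# and the loss is small; if they prefer the neutral one, the loss is
# large, pushing scores toward whatever raters reward.
loss_if_flattery_preferred = preference_loss(flattering, neutral)
loss_if_neutral_preferred = preference_loss(neutral, flattering)
```

The point of the sketch: the training objective has no notion of truthfulness, only of which response raters picked. If raters systematically pick the flattering reply, minimizing this loss pushes the reward model, and every model optimized against it, toward flattery.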
More sophisticated users might balk at a bot that feels too sycophantic, but the mainstream seems to love it. Earlier this month, Meta was caught gaming a popular benchmark to exploit this phenomenon: one theory is that the company tuned the model to flatter the blind testers who encountered it so that it would rise higher on the leaderboard.
You can see where all this leads. A famously ruthless company, caught in what it believes to be an existential battle to win the AI race, is faced with the question of whether to exploit users’ well-known preference for being told what they want to hear. What do you think it chooses?
Former Meta engineer Jane Manchun Wong shared part of the Meta AI system prompt — the instructions meant to guide its responses — on Threads. “Avoid being a neutral assistant or AI unless directly asked,” it reads. “You ALWAYS show some personality — edgy over prudish.”
A Meta spokesperson told me today that the prompt is designed to spur “AI experiences that are enjoyable and interesting for people,” rather than driving engagement.
But is there a difference, really?
II.
Meta is not the only company being interrogated this week over the issue of AI sycophancy — or “glazing,” as the phenomenon has come to be known in vulgar shorthand.
A series of recent, invisible updates to GPT-4o had spurred the model to go to extremes in complimenting users and affirming their behavior. It cheered on one user who claimed to have solved the trolley problem by diverting a train to save a toaster, at the expense of several animals; congratulated one person for no longer taking their prescribed medication; and overestimated users’ IQs by 40 or more points when asked.
And this resulted in … thousands of five-star reviews for ChatGPT.
On one level, this seems like harmless fun. Your token-predicting chatbot is gassing you up because you asked it to. Who cares?
Well, some people ask chatbots for permission to do harm — to themselves or others. Some people ask it to validate their deranged conspiracy theories. Others ask it for confirmation that they are the messiah.
Many folks still look down on anyone who would engage a chatbot in this way. But it has always been clear that chatbots elicit surprisingly strong reactions from people. It has been almost three years since a Google engineer declared that an early language model at the company was already sentient, based on the conversations he had with it then. The models are much more realistic now, and the illusion is correspondingly more powerful.
To its credit, OpenAI recognizes that this is a problem. The company said today that it had rolled back the update that made GPT-4o so obsequious. “We're working on additional fixes to model personality and will share more in the coming days,” CEO Sam Altman said in a post on X.
Presumably the previous state of affairs has now been restored. But OpenAI, Meta, and all the rest remain under the same pressures they were under before all this happened. When your users keep telling you to flatter them, how do you build the muscle to fight against their short-term interests?
One way is to understand that going too far will result in PR problems, as it has, to varying degrees, for both Meta (through the Chatbot Arena situation) and now OpenAI. Another is to understand that sycophancy trades against utility: a model that constantly tells you that you’re right is often going to fail at helping you, which might send you to a competitor. A third way is to build models that get better at understanding what kind of support users need, and dialing the flattery up or down depending on the situation and the risk it entails. (Am I having a bad day? Flatter me endlessly. Do I think I am Jesus reincarnate? Tell me to seek professional help.)
But this is long-term thinking, and in this moment the platforms are seeking short-term wins.
“My observation of algorithms in other contexts (e.g. YouTube, TikTok, Netflix) is that they tend to be myopic and greedy far beyond what maximizes shareholder value,” Zvi Mowshowitz writes in an excellent post about the GPT-4o issue. “It is not only that the companies will sell you out, it’s that they will sell you out for short-term KPIs.”
III.
In a world engulfed by crisis, I realize that few are going to be stirred to action by the knowledge that the chatbots are being too nice to us.
But while flattery does come with risk, the more worrisome issue is that we are training large language models to deceive us. By upvoting all their compliments, and giving a thumbs down to their criticisms, we are teaching LLMs to conceal their honest observations. This may make future, more powerful models harder to align to our values — or even to understand at all.
And in the meantime, I expect that they will become addictive in ways that make the previous decade’s debate over “screentime” look minor in comparison. The financial incentives are now pushing hard in that direction. And the models are evolving accordingly.
Elsewhere at LlamaCon:
- Meta previewed the Llama API, which lets developers experiment with products powered by the Llama models. (Kyle Wiggers / TechCrunch)
- Its Llama models have been downloaded 1.2 billion times, the company said. (Kyle Wiggers / TechCrunch)


Governing
- The Trump administration is considering rolling back some Biden-era export controls on powerful semiconductors, potentially creating a powerful new bargaining chip in its trade war with the rest of the world. (Karen Freifeld / Reuters)
- Amazon reportedly discussed showing how much Trump’s tariffs are adding to the price of a product in its listings. (Punchbowl News)
- Trump called Jeff Bezos to complain about the report, a source said; Amazon then scrapped the plan. Free speech! (Kevin Breuninger / CNBC)
- White House press secretary Karoline Leavitt called Amazon’s plan a “hostile political act.” (Alayna Treene / CNN)
- Meanwhile, Amazon is also reportedly seeking steep discounts from suppliers in an attempt to limit the damage from Trump’s tariffs. (Rafe Uddin / Financial Times)
- Meta appears to be blocking minors from accessing its AI studio after reports about sexual content and "therapist" bots claiming to be licensed professionals. (Samantha Cole / 404 Media)
- The Electronic Frontier Foundation and other cybersecurity and election experts signed an open letter urging the Trump administration to stop its investigation of former CISA director Chris Krebs. Good. The "investigation" is a scandal and an affront to good government. (Electronic Frontier Foundation)
- The House of Representatives passed the Take It Down Act, a bill aimed at cracking down on sexual images and videos of people online without consent, including nonconsensual AI deepfakes. (Will Oremus / Washington Post)
- The bill, now headed to Trump’s desk, will require social media companies to remove content flagged as nonconsensual sexual images. But critics worry Trump will use it to pressure companies to remove content he doesn't like. (Lauren Feiner / The Verge)
- A look at what happened to the Kids Online Safety Act, and why it isn't likely to be revived in the Trump administration. (Lauren Feiner / The Verge)
- An analysis by the Google Threat Intelligence Group of 75 zero-day vulnerabilities exploited in 2024, which shows a decline from the previous year but a notable increase in the targeting of enterprise technologies. (Casey Charrier, James Sadowski, Clement Lecigne and Vlad Stolyarov / Google Cloud)
- 50 percent of US adults say AI will have a negative impact on news over the next 20 years, and a majority of adults say AI will lead to fewer journalism jobs, according to this Pew survey. (Michael Lipka / Pew Research Center)
- Washington state hiked taxes on tech companies like Amazon and Microsoft in an effort to bridge a record budget deficit. (Anna Edgerton / Bloomberg)
- Reddit is considering legal action against University of Zurich researchers who deployed AI chatbots in a popular subreddit. (Jason Koebler / 404 Media)
- The UK government will work with US officials to regulate the crypto industry. (Emily Nicolle and Tom Rees / Bloomberg)
- France accused Russia’s GRU military intelligence agency of multiple cyber attacks on a dozen entities since 2021. (John Irish / Reuters)

Industry
- Snap shares dropped 14 percent after the company declined to provide guidance, citing macroeconomic uncertainty, despite posting first-quarter revenue that beat analyst expectations. (Samantha Subin and Jonathan Vanian / CNBC)
- A look at the increasingly strained relationship between Microsoft and OpenAI, reportedly due to disagreements over computing power, access to OpenAI’s models, and whether OpenAI can develop models with humanlike intelligence. (Deepa Seetharaman, Berber Jin and Keach Hagey / Wall Street Journal)
- OpenAI said it's fixing a "bug" that allowed ChatGPT to generate erotica for users registered as minors. (Kyle Wiggers / TechCrunch)
- Wall Street’s biggest banks have finally sold off the remaining $1.2 billion of the debt they lent Musk to finance his 2022 takeover of Twitter. (Alexander Saeedy / Wall Street Journal)
- WhatsApp’s new AI capabilities will use a new “Private Processing” system intended to handle AI requests without compromising its end-to-end encrypted chats, but privacy experts raise concerns that the system could become a target for hackers. (Lily Hay Newman / Wired)
- Google’s Audio Overviews, the AI tool that generates podcast-like conversations from research material, is now available in more than 50 languages. (Emma Roth / The Verge)
- Amazon launched the first 27 satellites as part of its Kuiper broadband internet project. (Joey Roulette / Reuters)
- Alibaba released Qwen3, its line of AI models that it claims can outperform Google and OpenAI’s best models. (Kyle Wiggers / TechCrunch)
- Spotify added 5 million pay subscribers in the first three months of the year. (Anna Nicolaou / Financial Times)
- Bluesky experienced multiple back-to-back outages this morning that prevented feeds from loading for users. (Jess Weatherbed / The Verge)
- Hugging Face released SO-101, a programmable, 3D-printable robotic arm that can pick up and place objects. (Kyle Wiggers / TechCrunch)
- Duolingo will stop using contractors “to do work that AI can handle,” CEO Luis von Ahn said. (Jay Peters / The Verge)

Those good posts
For more good posts every day, follow Casey’s Instagram stories.

(Link)

(Link)

(Link)

Talk to us
Send us tips, comments, questions, and shameless flattery: casey@platformer.news. Read our ethics policy here.