How a Twitter plan to counter extremism fell apart

A research team was midway through a project to help troubled users. Then Elon Musk bought the company

(Kristen Radtke / The Verge)

Programming note: Platformer is off Monday for Labor Day.

Zoe Schiffer and I reported this story for The Verge.

It had been a long pandemic for Twitter’s research team. Tasked with solving some of the platform’s toughest problems around harassment, extremism, and disinformation, staffers absconded to Napa Valley in November 2021 for a company retreat. Despite a tumultuous change in leadership — Jack Dorsey had recently stepped down, appointing former chief technology officer Parag Agrawal to take his place — the group felt unified, even hopeful. After months of fighting bad actors online, employees took a moment to unwind. “We finally felt like we had a cohesive team,” one researcher says.

But at the goodbye brunch on the last day, people’s phones started pinging with alarming news: their boss, Dantley Davis, Twitter’s vice president of design, had been fired. Nobody knew it was coming. “It was like a movie,” says one attendee, who asked to remain anonymous because they are not authorized to speak publicly about the company. “People started crying. I was just sitting there eating a croissant being like, ‘What’s up with the mood?’”

The news foreshadowed a downward spiral for the research organization. Although the group was used to reorganizations, a shakeup in the middle of an outing meant to bond the team together felt deeply symbolic.

The turmoil came to a head in April, when Elon Musk signed a deal to buy Twitter. Interviews with current and former employees, along with 70 pages of internal documents, suggest the chaos surrounding Musk’s acquisition pushed some teams to the breaking point, prompting numerous health researchers to quit. Some say their colleagues were told to deprioritize projects to fight extremism in favor of focusing on bots and spam. The Musk deal might not even go through, but the effects on Twitter’s health efforts are already clear.

The health team, once tasked with fostering civil conversations on the famously uncivil platform, went from 15 full-time staffers down to two.


In 2019, Jack Dorsey asked a fundamental question about the platform he had helped create: “Can we actually measure the health of the conversation?”

Onstage at a TED conference in Vancouver, the beanie-clad CEO talked earnestly about investing in automated systems to proactively detect bad behavior and “take the burden off the victim completely.”

That summer, the company began staffing up a team of health researchers to carry out Dorsey’s mission. His talk convinced people who’d been working in academia, or for larger tech companies like Meta, to join Twitter, inspired by the prospect of working toward positive social change. 

When the process worked as intended, health researchers helped Twitter think through potential abuses of new products. In 2020, Twitter was working on a tool called “unmention” that allows users to limit who can reply to their tweets. Researchers conducted a “red team” exercise, bringing together employees across the company to explore how the tool could be misused. Unmention could allow “powerful people [to] suppress dissent, discussion, and correction” and enable “harassers seeking contact with their targets [to] coerce targets to respond in person,” the red team wrote in an internal report. 

But the process wasn’t always so smooth. In 2021, former Twitter product chief Kayvon Beykpour announced the company’s number one priority was launching Spaces. (“It was a full on assault to kill Clubhouse,” one employee says.) The team assigned to the project worked overtime trying to get the feature out the door and didn’t schedule a red team exercise until August 10th, three months after launch. In July, the exercise was canceled. Spaces went live without a comprehensive assessment of the key risks, and white nationalists and terrorists flooded the platform, as The Washington Post reported.

When Twitter eventually held a red team exercise for Spaces in January 2022, the report concluded: “We did not prioritize identifying and mitigating against health and safety risks before launching Spaces. This Red Team occurred too late. Despite critical investments in the first year and a half of building Spaces, we have been largely reactive to the real-world harms inflicted by malicious actors in Spaces. We have over relied on the general public to identify problems. We have launched products and features without adequate exploration of potential health implications.”

Earlier this year, Twitter walked back plans to monetize adult content after a red team found that the platform had failed to adequately address child sexual exploitation material. It was a problem researchers had been warning about for years. Employees said that Twitter executives have been aware of the problem but noted the company has not allocated the resources necessary to fix it. 


By late 2021, Twitter’s health researchers had spent years playing whack-a-mole with bad actors on the platform and decided to deploy a more sophisticated approach to dealing with harmful content. Externally, the company was regularly criticized for allowing dangerous groups to run amok. But internally, it sometimes felt as though certain groups, like conspiracy theorists, were kicked off the platform too soon — before researchers could study their dynamics.

“The old approach was almost comically ineffective, and very reactive — a manual process of playing catch,” says a former employee, who asked to remain anonymous because they are not authorized to speak publicly about the company. “Simply defining and catching ‘bad guys’ is a losing game.”

Instead, researchers hoped to identify people who were about to engage with harmful tweets, and nudge them toward healthier content using pop-up messages and interstitials. “The pilot will allow Twitter to identify and leverage behavioral — rather than content — signals and reach users at risk from harm with redirection to supportive content and services,” read an internal project brief, viewed by The Verge.

Twitter researchers partnered with Moonshot, a company that specializes in studying violent extremists, and kicked off a project called Redirect, modeled after work that Google and Facebook had done to curb the spread of harmful communities. At Google, this work had resulted in a sophisticated campaign to target people searching for extremist content with ads and YouTube videos aimed at debunking extremist messaging. Twitter planned to do the same. 

The goal was to move the company from simply reacting to bad accounts and posts to proactively guiding users toward better behavior.

Read the rest of this story at The Verge.


Talk to me

Send me tips, comments, questions, and edited tweets: casey@platformer.news.