Inside Discord’s reform movement for banned users
Most platforms ban their trolls forever. Discord wants to rehabilitate them
Today, let’s talk about how the traditional platform justice system is seeing signs of a new reform movement. If it’s successful at Discord, its backers hope that the initiative could lead to better behavior around the web.
Discord’s San Francisco campus is a tech company headquarters like many others, with its open-plan office, well-stocked micro-kitchens and employees bustling in and out of over-booked conference rooms.
But step through the glass doors at its entrance and it is immediately apparent that this is a place built by gamers. Arcade-style art decks the walls, various games hide in corners, and on Wednesday afternoon, a trio of employees sitting in a row were competing in a first-person shooter.
Video games are designed for pure fun, but the community around those games can be notoriously toxic. Angry gamers hurl slurs, doxx rivals, and in some of the most dangerous cases, summon SWAT teams to their targets’ homes.
For Discord, which began as a tool for gamers to chat while playing together, gamers are both a key constituency and a petri dish for understanding the evolution of online harms. If it can hurt someone, there is probably an angry gamer somewhere trying it out.
By now, of course, eight-year-old Discord hosts much more than gaming discussions. In 2021, it reported more than 150 million monthly users, and its biggest servers now include ones devoted to music, education, science, and AI art.
Along with the growing user base has come high-profile controversies over what users are doing on its servers. In April, the company made headlines when leaked classified documents from the Pentagon were found circulating on the platform. Discord faced previous scrutiny over its use in 2017 by white nationalists planning the “Unite the Right” rally in Charlottesville, VA, and later when the suspect in a racist mass shooting in Buffalo, NY was found to have uploaded racist screeds to the platform.
Most of the problematic posts on Discord aren’t nearly that grave, of course. As on any large platform, Discord fights daily battles against spam, harassment, hate speech, porn, and gore. (At the height of crypto mania, it also became a favored destination for scammers.)
Most platforms deal with these issues with a variation of a three-strikes-and-you’re-out policy. Break the rules a couple times and you get a warning; break them a third time and your account is nuked. In many cases, strikes are forgiven after some period of time — 30 days, say, or 90. The nice thing about this policy from a tech company’s perspective is that it’s easy to communicate, and it “scales.” You can build an automated system that issues strikes, reviews appeals, and bans accounts without any human oversight at all.
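The fully automated pipeline described above is easy to sketch. The following is a hypothetical illustration of a three-strikes system with expiring strikes; the names, thresholds, and 90-day expiry are assumptions for the example, not Discord's (or any platform's) actual implementation:

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

STRIKE_TTL = timedelta(days=90)  # assumed: strikes are forgiven after 90 days
MAX_STRIKES = 3                  # assumed: third active strike bans the account

@dataclass
class Account:
    strikes: list = field(default_factory=list)  # timestamps of past strikes
    banned: bool = False

def issue_strike(account: Account, now: datetime) -> str:
    """Record a violation; return the action taken ('warned' or 'banned')."""
    # Drop strikes that have aged out of the forgiveness window
    account.strikes = [t for t in account.strikes if now - t < STRIKE_TTL]
    account.strikes.append(now)
    if len(account.strikes) >= MAX_STRIKES:
        account.banned = True
        return "banned"
    return "warned"
```

Note that nothing in this loop distinguishes a minor infraction from a major one — the shortcoming Discord's team identified.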
At a time when many tech companies are pulling back on trust and safety efforts, a policy like this has a lot of appeal.
When Discord’s team reviewed its own policies around warning and suspending users, though, it found the system wanting.
One, a three-strikes policy isn’t proportionate. It levies the same penalty for both minor infractions and major violations. Two, it doesn’t rehabilitate. Most users who receive strikes probably don’t deserve to be permanently banned, but if you want them to stay you have to figure out how to educate them.
Three, most platform disciplinary systems lack nuance. If a teenage girl posts a picture depicting self-harm, Discord will remove the picture under its policies. But the girl doesn’t need to be banned from social media — she needs to be pointed toward resources that can help her.
On top of all that, Discord had one additional complication to consider. Half of its users are 13 to 24 years old; a substantial portion of its base are teenagers. Teenagers are inveterate risk-takers and boundary pushers, and Discord was motivated to build a system that would rein in their worst impulses and — in the best-case scenario — turn them into upstanding citizens of the internet.
This is the logic that went into Discord’s new warning system, which it announced today. The company explained the changes in a blog post:
It starts with a DM — Users who break the rules will receive an in-app message directly from Discord letting them know they received either a warning or a violation, based on the severity of what happened and whether or not Discord has taken action.
Details are one click away — From that message, users will be guided to a detailed modal that will give details of the post that broke our rules, outline actions taken and/or account restrictions, and more information regarding the specific Discord policy or Community Guideline that was violated.
All info is streamlined in your account standing — In settings, all information about past violations can be seen in the new “Account Standing” tab.
However, some violations are more serious than others, and we’ll take appropriate action depending on the severity of the violation. For example, we have and will continue to have a zero-tolerance policy towards violent extremism and content that sexualizes children.
A system like this isn’t totally novel; Instagram takes a similar approach. Where Discord goes further is in its system of punishments. Rather than simply give users a strike, it limits their behavior on the platform based on their violation. If you post a bunch of gore in a server, Discord will temporarily limit your ability to upload media. If you raid someone else’s server and flood it with messages, Discord will temporarily shut off your ability to send messages.
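The "scalpel" approach maps each violation type to a targeted, temporary restriction rather than a generic strike. A minimal sketch of the idea — the violation names, restricted capabilities, and durations here are all invented for illustration, not Discord's actual policy table:

```python
from datetime import timedelta

# Hypothetical mapping: violation type -> (capability to disable, duration).
# Penalties fit the offense instead of counting toward a one-size-fits-all ban.
RESTRICTIONS = {
    "graphic_media": ("upload_media", timedelta(days=7)),
    "server_raid":   ("send_messages", timedelta(days=3)),
    "spam":          ("send_messages", timedelta(days=1)),
}

def penalty_for(violation: str) -> str:
    """Return a human-readable description of the targeted restriction."""
    if violation not in RESTRICTIONS:
        return "warning only"
    capability, duration = RESTRICTIONS[violation]
    return f"{capability} disabled for {duration.days} days"
```

The design choice is the point: the system removes only the capability that was abused, leaving the rest of the user's access intact so they have a path back to good standing.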
“As an industry we’ve had a lot of hammers at our disposal. We’re trying to introduce more scalpels into our approach,” John Redgrave, Discord’s vice president of trust and safety, told me in an interview. “That doesn’t just benefit Discord — it benefits all platforms, if users can actually change their behavior.”
And when someone does cross the line repeatedly, Discord will strive not to ban the user forever. Instead, it will ban them for one year — a drastic reduction in sentencing for an industry in which lifetime bans are the norm.
It’s a welcome acknowledgement of the importance of social networks in the lives of people online, particularly young people — and a rare embrace of the idea that most wayward users can be rehabilitated, if only someone would take the time to try.
“We really want to give people who have had a bad day the chance to change,” Savannah Badalich, Discord’s senior director of policy, told me.
The new system has already been tested in a small group of servers and will begin rolling out in the coming weeks, Badalich said. Along with the new warning system, the company is introducing a feature called Teen Safety Assist that is enabled by default for younger users. When switched on, it scans incoming messages from strangers for inappropriate content and blurs potentially sensitive images in direct messages.
On Wednesday afternoon, Discord let me sit in on a meeting with Redgrave, Badalich, and four other members of its 200-person trust and safety team. The subject: could the warning system it had just announced for individual users be adapted for servers as well?
After all, sometimes problem usage at Discord goes beyond individual users. Servers violate policies too, and as the warning system for individuals rolls out, the company is turning its attention to group-based harms.
I appreciated the chance to sit in on the meeting, which was on the record, since the company is still in the early stages of building a solution. As with most subjects related to content moderation, untangling the various equities involved can be very difficult.
In this case, members of the team had to decide who was responsible for what happened in a server gone bad. If your first thought was “the server’s owner,” that was mine too. But sometimes moderators get mad at server owners, and retaliate against them by posting content that breaks Discord’s rules — a kind of scorched-earth policy aimed at getting the server banned.
Alright, then. Perhaps moderators should be considered just as responsible for harms in a server as the owner? Well, it turns out that Discord doesn’t have a totally consistent definition of who counts as an active moderator. Some users are automatically given moderator permissions when they join a server. If the server goes rogue and the “moderator” has never posted in the server, why should they be held accountable?
Moreover, team members said, some server owners and moderators are simply unfamiliar with Discord’s community guidelines. Others might know the rules but aren’t aware of the bad behavior in a server — either because it’s too big and active to read every post, or because they haven’t logged in lately.
Finally, this set of questions applies to the majority of servers where harm has occurred incidentally. Discord also has to consider the smaller but significant number of servers that are set up to do harm — such as by gathering and selling child sexual abuse material. Those servers require much different assumptions and enforcement mechanisms, the team agreed.
All of it can feel like an impossible knot to untangle. But in the end, the team members found a way forward: analyzing a combination of server metadata, along with the behavior of server owners, moderators and users, to diagnose problem servers and attempt to rehabilitate them.
It wasn’t perfect — nothing in trust and safety ever is. “The current system is a fascinating case of over- and under-enforcement,” one product policy specialist said, only half-joking. “What we’re proposing is a somewhat different case of over- and under-enforcement.”
Still, I left Discord headquarters that day confident that the company’s future systems would improve over time. Too often, trust and safety teams get caricatured as partisan scolds and censors. Visiting Discord offered a welcome reminder that they can be innovators, too.
On the podcast this week: Three ways of understanding what’s going on inside the black box of a large language model. Then: I’m sorry but we discussed the Marc Andreessen thing. And finally, Brent Seales joins us to discuss the Vesuvius Challenge — a fascinating and recently successful experiment in which a 21-year-old college student used AI to begin decoding an ancient scroll.
Hamas is relying on social media, particularly Telegram, to amplify its message, rallying its supporters to blame an Israeli airstrike for the deadly blast at a Gaza hospital. Israel and the United States have blamed Palestinian militants for the blast, though the matter continues to be intensely disputed. (Drew Harwell and Elizabeth Dwoskin / Washington Post)
Israel is also turning to social media to get its message out, reportedly paying for ads containing graphic videos and images to rally support on platforms including X and YouTube. (Liv Martin, Clothilde Goujard and Hailey Fuchs / POLITICO)
Meta will temporarily limit "potentially unwelcome or unwanted comments" on Facebook posts created by users in the region relating to the conflict by restricting comment sections to friends and followers. (Katie Paul and Sheila Dang / Reuters)
Instagram apologized after user bios that included “Palestinian” and an Arabic phrase that means “praise be to God” were auto-translated into “Palestinian terrorists are fighting for their freedom”. (Samantha Cole / 404Media)
Crypto analytics firm Chainalysis says some reports may be overstating the use of crypto by terrorist organizations, including Hamas, pointing out that such groups still primarily use traditional fiat currency. (RT Watson / The Block)
While US states have enacted laws that allow users to force data brokers to delete their data, some brokers that haven’t registered in those states are managing to avoid enforcement. (Suzanne Smalley / The Record)
Universal Music Group is suing Anthropic for copyright infringement, alleging that its Claude chatbot unlawfully copies and disseminates copyrighted works including song lyrics. (Murray Stassen / Music Business Worldwide)
The Federal Communications Commission approved plans to move forward with the use of the 6GHz band for wireless devices, letting tech companies improve Wi-Fi connectivity for projects that include AR and VR. Meta was very excited about this and sent us a rave review of the FCC’s decision. (Wes Davis / The Verge)
The European Union is reportedly planning to regulate generative AI using a three-tiered approach that categorizes foundation models by capability, with the strictest rules applied to general-purpose AI systems operating at scale. (Alberto Nardelli and Jillian Deutsch / Bloomberg)
Mustafa Suleyman, the co-founder of Inflection, and former Google CEO Eric Schmidt argue that there needs to be an AI equivalent of the Intergovernmental Panel on Climate Change to assess risks and collect data. (Mustafa Suleyman and Eric Schmidt / The Financial Times)
Clearview AI won an appeal against the UK’s Information Commissioner’s Office, which previously fined the company for breaching local privacy laws with its facial recognition data-scraping software. How?? (Natasha Lomas / TechCrunch)
The UK is reportedly set to announce an AI international advisory group next month to study the capabilities and risks of AI and to explore potential international collaboration on AI safety. (Anna Gross and Madhumita Murgia / The Financial Times)
AI pioneer and “godfather” Yoshua Bengio is calling for national regulation and international treaties, saying we urgently need to mitigate the largest risks of AI. (Susan D’Agostino / Bulletin of the Atomic Scientists)
Meta’s chief AI scientist, Yann LeCun, argues that premature regulation will only stifle competition and that AI is not an existential threat to humans. (John Thornhill / The Financial Times)
Tim Cook won support from China’s commerce minister Wang Wentao after a meeting in Beijing, amid slow iPhone sales and fear that Apple product use by government employees could be limited in the country. (Bloomberg)
Correction, 5:40PM: This section has been updated to reflect that Palestinian militants, not Hamas itself, have been accused of being responsible for the hospital blast.
Meanwhile, other social media platforms are going the opposite way. Layoffs at Google News offer only the most recent example. (Mike Isaac, Katie Robertson and Nico Grant / The New York Times)
Also, X removed the gold verification badge from the New York Times’ account without explanation. The good news is that no one can remember what the gold badge was supposed to represent. (Drew Harwell / Washington Post)
X will start charging new users in New Zealand and the Philippines who don’t pay for X Premium $1 a year to be able to tweet, reply, and quote. Surely bad actors will balk at the prospect of paying a dollar. (Kylie Robison / Fortune)
YouTube is reportedly developing an AI tool for creators that will let them use the voice of famous musicians, and is seeking permission from music companies to let them release it. (Lucas Shaw / Bloomberg)
Android users will soon be able to log into multiple accounts on WhatsApp from one device. (Emma Roth / The Verge)
OpenAI is expanding DALL-E 3 for ChatGPT Plus and Enterprise customers, saying that they’ve built guardrails for the image generator for wide release. It’s very good. (Umar Shakir / The Verge)
What if AI was given a constitution to follow? AI startup Anthropic is conducting an experiment that trains its AI model to adhere to a certain set of rules set by humans. A fascinating experiment in platform democracy. (Kevin Roose / The New York Times)
Stanford researchers ranked 10 large AI language models according to how transparent they are. Llama 2 was ranked the most transparent at 54 percent, while GPT-4 was third along with PaLM 2, both scoring 40 percent. (Kevin Roose / The New York Times)
Discord is reportedly shutting down anonymous compliments app Gas nine months after acquiring it. (Mark Matousek / The Information)
Those good posts
For more good posts every day, follow Casey’s Instagram stories.