How OpenAI is building a path toward AI agents

Building a GPT-based copy editor showcases their promise — but the risks ahead are real

OpenAI CEO Sam Altman builds a custom GPT during the OpenAI DevDay event Monday in San Francisco. (Justin Sullivan / Getty Images)

Today, let’s talk about the implications of OpenAI’s announcements at its inaugural developer day, which I just attended in person. The company is pushing hard to embed artificial intelligence in more aspects of everyday life, and is doing so in increasingly creative ways. It’s also tiptoeing up to some very important questions about how much agency our AI systems should have.

I.

At a packed event in San Francisco, OpenAI CEO Sam Altman announced a suite of features for the GPT API that developers have been clamoring for: a larger context window, enabling analysis of up to 300 or so pages of text; an updated knowledge cutoff that brings ChatGPT’s “world knowledge” to April of this year; and legal protections for developers who are sued on copyright grounds for their usage of the API.

The event’s marquee new feature, though, is what the company is calling GPTs: more narrowly tailored versions of the company’s flagship, ChatGPT. In their early beta form, GPTs can draw on custom instructions and uploaded files that carry context that ChatGPT wouldn’t have. Altman, for example, built a GPT during a demo on stage that offers startup advice. He uploaded a lecture he once gave on the subject; the resulting GPT will now use that advice in making suggestions. 

GPTs can also be connected to actions on third-party platforms. In one demo, Altman mocked up a poster in the design app Canva using the ChatGPT interface. In another, an OpenAI solutions architect used a GPT linked to Zapier to scan her calendar for scheduling conflicts and then automatically send a message about a conflict in Slack. 

Custom chatbots aren’t a new idea; Character.ai has grown popular within a certain crowd for enabling users to talk to every fictional character under the sun, along with imagined versions of real-life figures. Replika is turning custom chatbots into personalized (and sometimes romantic) companions.

Where OpenAI’s approach stands apart is in its focus on utility. It’s seeking to become the AI bridge between all sorts of online services, turning chatbots from the glorified copywriters they are today into true virtual assistants, coaches, tutors, lawyers, nurses, accountants, and more. Getting there will require better models, products, policies, and probably some regulation. But the vision is there in plain sight.

II.

After the event, I came home and built a GPT.

OpenAI had granted me beta access to the feature and encouraged me to play around with it. I decided to build a copy editor. 

For a long time, I alone edited Platformer, which was probably evident to the many of you who (wonderfully!) emailed over the years pointing out typos and grammatical errors. Last year, Zoë Schiffer joined as managing editor, and she fixes countless mistakes in my columns every day that we publish.

Still, as at any publication, mistakes sneak through. A few weeks ago a reader wrote in to ask: why not run your columns through ChatGPT first?

The intersection of news and generative AI is a fraught topic. When generative AI is used to create cheap journalistic outputs (SEO-bait explainers, engagement-bait polls), the results are often (mostly?) terrible.

What this reader suggested is that we use AI as one of many inputs. We still report, write, and edit the columns as normal. But before sending out the newsletter, we could ask the AI to double-check our work. 

For the past few weeks, I’ve been using ChatGPT to look for possible spelling and grammatical errors. And while I feel a little insecure saying so, the truth is that GPT-4 does a pretty good job. It’s actually rather too conservative for my taste — about seven out of every 10 things it suggests that I change aren’t errors at all. But am I grateful it catches the mistakes it does? Of course.

Once you start asking ChatGPT to fix your mistakes, you might consider other ways in which an AI copy editor could serve as a useful input for a journalist. On a lark, I started pasting in the text of my column and told ChatGPT to “poke holes in my argument.” It’s not as good at this as it is at catching spelling mistakes. But sometimes it does point out something useful: you introduced this one idea and never returned to it, for example.

Before today, I was essentially hacking ChatGPT into becoming a copy editor. As of today, it’s a GPT (called Copy Editor; it’s currently only available to users in the GPT beta).

To create it, I didn’t write a line of code. Instead, OpenAI’s chat interface asked me what I wanted to build, and then built it for me in a few seconds. Then, I went into its configuration tool to capture the kind of editor I want for Platformer.

The GPT builder. GPT-4 wrote most of those instructions; I edited the first sentence but the second two are presented here mostly as the AI wrote them. It generated that logo based on the instructions, too.

From there, I added a few prompts. I started with the basics (“Identify potential spelling and grammatical errors in today's column,” “poke holes in my argument”). These become buttons inside ChatGPT; now, when I load Copy Editor, I just click one button, paste in my draft, and let it work.

Later, it occurred to me that I could try to simulate the responses of various readers to what I wrote. “Critique my argument from the standpoint of someone who works in tech and believes that the tech press is too closed-minded and cynical” is now a prompt in Copy Editor. So is “Critique my argument from the standpoint of an underrepresented minority whose voice is often left out of tech policy and product discussions.”

For the moment, these prompts only perform middlingly well. But I imagine that both I and OpenAI will tune them over time. In the not-too-distant future, I think, writers will be able to get decent answers to the question most of us ask somewhat anxiously in the moments before hitting publish: how are people going to react to this?  

III.

Copy Editor isn’t really what AI developers would call an “agent.” It’s not interacting with third-party APIs to take actions on my behalf. 

You can imagine a more agent-like version embedded into the text editor of your choice; able to transfer the text to the content management system; charged with drafting social media posts, publishing them, and providing you with a daily report on their performance. All of this is coming. 

Many of the most pressing concerns around AI safety will come with these features, whenever they arrive. The fear is that when you tell AI systems to do things on your behalf, they might accomplish them via harmful means. This is the fear embedded in the famous paperclip problem, and while that remains an outlandish worst-case scenario, other potential harms are much more plausible.

Once you start enabling agents like the ones OpenAI pointed toward today, you start building the path toward sophisticated algorithms manipulating the stock market; highly personalized and effective phishing attacks; discrimination and privacy violations based on automations connected to facial recognition; and all the unintended (and currently unimaginable) consequences of infinite AIs colliding on the internet.

That same Copy Editor I described above might be able in the future to automate the creation of a series of blogs, publish original columns on them every day, and promote them on social networks via an established daily budget, all working toward the overall goal of undermining support for Ukraine.

The degree to which any of this happens depends on how companies like OpenAI develop, test, and release their products, and the degree to which their policy teams are empowered to flag and mitigate potential harms beforehand. 

After the event, I asked Altman how he was thinking about agents in general. Which actions is OpenAI comfortable letting GPT-4 take on the internet today, and which does the company not want to touch?  

Altman’s answer is that, at least for now, the company wants to keep it simple. Clear, direct actions are OK; anything that involves high-level planning isn’t.

For most of his keynote address, Altman avoided making lofty promises about the future of AI, instead focusing on the day-to-day utility of the updates that his company was announcing. In the final minutes of his talk, though, he outlined a loftier vision.

“We believe that AI will be about individual empowerment and agency at a scale we've never seen before,” Altman said. “And that will elevate humanity to a scale that we've never seen before, either. We'll be able to do more, to create more, and to have more. As intelligence is integrated everywhere, we will all have superpowers on demand.”

Superpowers are great when you put them into the hands of heroes. They can be great in the hands of ordinary people, too. But the more that AI developers work to enable all-purpose agents, the more certain it is that they’ll be placing superpowers into the hands of supervillains. As these roadmaps continue to come into focus, here’s hoping the downside risks have the world’s full attention.

Correction, Nov. 8: This article originally said that a message was sent in Snap. It was sent in Slack.


Give your startup an advantage with Mercury Raise.

Mercury lays the groundwork to make your startup ambitions real with banking* and credit cards designed for your journey. But we don’t stop there. Mercury goes beyond banking to give startups the resources, network, and knowledge needed to succeed. 

Mercury Raise is a comprehensive founder success platform built to remove roadblocks that often slow startups down. 

Eager to fundraise? Get personalized intros to active investors. Craving the company and knowledge of fellow founders? Join a community to exchange advice and support. Struggling to take your company to the next stage? Tune in to unfiltered discussions with industry experts for tactical insights.

With Mercury Raise, you have one platform to fundraise, network, and get answers, so you never have to go it alone.

*Mercury is a financial technology company, not a bank. Banking services provided by Choice Financial Group and Evolve Bank & Trust®; Members FDIC.

Platformer has been a Mercury customer since 2020. This sponsorship has now brought us 5% closer to our goal of hiring a reporter in 2024.


Those good posts

For more good posts every day, follow Casey’s Instagram stories.


Talk to us

Send us tips, comments, questions, and GPTs: casey@platformer.news and zoe@platformer.news.