Sure, AI is really good at diagnosing fractures and targeting drones, but can it do something really hard like, say, write a great ad? Recent advances in massive language models bring them uncannily close. Salesforce strategist Martin Kihn shares the results of an interesting test combining Super Bowl ad copy and an open-source natural language A.I. model called GPT-2. Here’s what he discovered and what it means for the future of copywriting.
When I'm thirsty, I go with water. When I'm hungry, I drink beer.
It wasn't me who made the decision. It was the people on Reddit!
Wow no cow, no beef. This was so good it even tasted like bacon.
Imagine for a moment you are in a creative brainstorm, and a junior copywriter swoops in bravely with the above. You might pause for a moment, inhale, and say, "it's a start."
Now what if I told you that copywriter was a machine who had been given a specific prompt (in italics) based on recent spots from Super Bowl LV? Well, it was a machine.
Those lines – and dozens less sensible – were generated on my MacBook Pro using a pretrained open-source natural language A.I. model called GPT-2, built by OpenAI, which Elon Musk co-founded. It was "steered" by a list of words taken from Super Bowl ads using another open-source code library called PPLM, built by Uber Engineering.
Loading and learning the models took about an hour. And given a few-word prompt, GPT-2 happily takes about five minutes to churn out 20 "ideas," without breaking for lunch.
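For the curious, the effect of "steering" by a word list can be sketched in a few lines of Python. This is a simplified illustration, not PPLM's actual mechanism: PPLM nudges the model's internal activations, whereas this toy simply boosts the odds of words that appear on a topic list and renormalizes. The function name `steer`, the `boost` factor and the sample probabilities are all made up for the example.

```python
def steer(next_word_probs, topic_words, boost=5.0):
    """Boost the probability of topic words, then renormalize to sum to 1.

    A stand-in for word-list steering; NOT how PPLM works internally.
    """
    topic = {w.lower() for w in topic_words}
    weighted = {
        word: p * (boost if word.lower() in topic else 1.0)
        for word, p in next_word_probs.items()
    }
    total = sum(weighted.values())
    if total == 0:
        # Nothing to renormalize; hand back an unchanged copy.
        return dict(next_word_probs)
    return {word: p / total for word, p in weighted.items()}


# Hypothetical next-word odds after the prompt "When I'm hungry, I drink ..."
probs = {"water": 0.5, "beer": 0.2, "tea": 0.3}
steered = steer(probs, ["beer"], boost=5.0)
print(max(steered, key=steered.get))
```

With the word list in play, the less likely word can overtake the default choice, which is roughly how a list of Super Bowl ad words tilts the output toward ad-like copy.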
Text generation – or robo-writing – has made startling leaps in the past few years, moving from punchline to something that may deserve a seat in the creative lounge. Believe it or not, the best robo-writers are almost the equivalent of that most annoying/wonderful phenomenon: the eager beginner, completely inexhaustible but creatively uneven.
Most of GPT-2's "ideas" were not quite ready to be presented; some were nonsensical. Oddly, it had no clue what to do with the prompt "Jason Alexander." And one of its "Wow no cow" completions was "Wow no cowbell can be quite like the best in the universe."
Which is probably true and not at all helpful.
In the near future, the smartest creative teams will be those that can use A.I. writers in productive ways, as a computer assist to a creative session and a source of ideas that might spark better ones.
One trillion parameters
At first, GPT-2's creators were so afraid of its power falling into the wrong hands that they were reluctant to release it. They relented and now rely on academic partnerships to limit bad actors like cyber-propagandists. Although not open source, GPT-2's successor, GPT-3, is available as an API to those who apply for access. The full model was recently licensed to OpenAI's major investor, Microsoft.
GPT-3's largest setting has 175 billion parameters. Think of these as individual knobs that the model has to tune, based on human writing samples, in order to predict the next word in a sentence. Google just open-sourced an even larger text model called Switch Transformer that reportedly has more than 1 trillion parameters.
The human brain has about 100 trillion synapses, by common estimates. Let's leave that right there.
GPT-3 takes GPT-2 out of the sandbox and sends it to middle school. Give the model a prompt (e.g., "Once upon a time..." or "It wasn't me..."), and it can continue at some length, generating text that is often plausible and sometimes uncanny. Early testers ranged from awed to skeptical – and often both.
Hype grew so heated last summer that OpenAI's chief executive Sam Altman took to Twitter to reassure humanity that GPT-3 "still has serious weaknesses and sometimes makes very silly mistakes."
The most angst issued not from writers and poets — who are depressed enough already — but ironically enough from computer programmers. It turns out that computer code is also a language that GPT-3 likes to write.
In fact, that's what makes this new generation of robo-writers different: they are flexible models, verging into the space called artificial general intelligence (AGI). This is the kind of intelligence we have: not pre-trained in any particular discipline but capable of learning. GPT-3 seems to perform well on a range of language tasks, from translation to chatting to impressing electronic gearheads.
Ad copywriting isn't such a leap. As a tool to build creative prompts from catch phrases ready for human filtration, so-called Transformer models make a lot of sense.
From the refrigerator to the gallery
Even as AI agents got noticeably better at diagnosing fractures and targeting drones, their creative efforts were conspicuously weak. Robot "art" looked like it belonged on a refrigerator in a garage, and robot "poetry" not even on a prank e-card. This is changing.
Robo-writers are already employed in shrinking newsrooms. So far, they're mostly stringers on the high-school sports reporting, weather and stock market desks — churning out endless Mad Lib-style pieces in routine formats about games and finance that no sane journalist would want to write, even for money.
Computers thrive at tedium. It's their métier. The for-profit OpenAI takes a brute force approach to its innovation. Two years ago, it gained notoriety for developing a machine that could beat the world's best players of a video game called Dota 2. It did this by having software agents play the equivalent of 45,000 years of games, learning by trial-and-error.
The GPT family of tools was also developed by pointing software agents at a massive corpus of data: in GPT-3's case, millions of documents on the open web, including Wikipedia, and libraries of self-published books.
GPT-3 is exposed to this mass of human-written text and builds a vocabulary of about 50,000 tokens (words and word fragments). Its weights are tuned to predict the next word in a sequence given the words that came before, and in the process the model develops meta-learning beyond simple memorization. It requires a prompt and can be guided by sample text that provides a "context," per the Super Bowl examples above.
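To make the idea of next-word prediction concrete, here is a deliberately tiny sketch. It is not how GPT-3 works under the hood (GPT-3 is a Transformer with billions of tuned weights); it is a toy that merely counts which word tends to follow which in a made-up three-line corpus, then continues a prompt by sampling from those counts. The names `train_bigram`, `complete` and `SAMPLE_CORPUS` are invented for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def complete(counts, prompt, max_words=8, seed=0):
    """Extend the prompt one word at a time, sampling by observed frequency."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # the toy model has never seen anything follow this word
        choices, weights = zip(*followers.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)


# Made-up training text, standing in for the web-scale corpus.
SAMPLE_CORPUS = [
    "when i'm thirsty i drink water",
    "when i'm hungry i eat pizza",
    "when i'm tired i drink coffee",
]

model = train_bigram(SAMPLE_CORPUS)
print(complete(model, "When I'm thirsty"))
```

Because this toy looks only at the single previous word, it can cheerfully pair "hungry" with "drink," which is the same flavor of almost-sense seen in the Super Bowl lines. A model like GPT-3 conditions on far more of the preceding text.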
It's a trivial matter to drop a prompt like "Car insurance is ..." into the GPT-3 Playground, tweak a few toggles, and generate snippets of sensible prose. It's not much harder to guide the model with a sampling of, say, action movie and comic book plots and generate stories at least as coherent as those of some recent superhero movies.
To answer the obvious question, it can be shown that GPT-3's creations are not just plagiarism. But are they pastiche? The model learns to predict words based on its experience of what others have written, so its prose is predictable by design. But then again, isn't most writers' prose?
Limitations include a "small context window" – after about 800 words, it forgets what came before – so it’s better at short pieces. It has a short attention span, matching our own.
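The forgetting described above can be pictured as a sliding window over the text. The sketch below is an assumption-laden simplification: it counts whole words and uses the roughly 800-word figure quoted here, whereas real models measure their window in tokens. The function name `visible_context` is hypothetical.

```python
def visible_context(text, window=800):
    """Return only the last `window` words, i.e. what the model can still 'see'."""
    if window <= 0:
        return ""
    words = text.split()
    return " ".join(words[-window:])


# A tiny window makes the effect easy to see.
draft = "Once upon a time a robot wrote a very long Super Bowl ad"
print(visible_context(draft, window=4))
```

Anything that falls outside the window, such as the opening premise of a long story, simply is not available when the next word is chosen, which is why long passages tend to drift.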
For this reason, people who have spent more time with the model grow less impressed. As one said: "As one reads more and more GPT-3 examples, especially long passages of text, some initial enthusiasm is bound to fade. GPT-3 over long stretches tends to lose the plot, as they say."
But ad copywriting isn't about sustaining an argument or writing a book. It's about trial and error and inspiration. Increasingly, that inspiration may be coming from a robot.
Martin Kihn is senior vice-president strategy at Salesforce