Entrepreneur, Investor, Bestselling Author & founder of Play Labs @ MIT
This Essay Was Written by a Human, Not a Robot. Or Was It?
Recently, the Guardian, one of the UK’s most popular outlets, released an op-ed with a provocative title: “A robot wrote this entire article. Are you scared yet, human?” Overall, the essay held together unexpectedly well, despite some simple language and repetition, giving it an eerie self-referential quality: an AI telling us why we shouldn’t be afraid of AI.
The essay wasn’t created by a robot per se, but by a new piece of software called GPT-3, a text-generation AI engine created by San Francisco-based OpenAI. Not only has the new release raised eyebrows (MIT’s Technology Review called it “shockingly good”), but it has resurfaced a question explored in popular fiction from Mary Shelley’s Frankenstein in the nineteenth century all the way up to modern sci-fi classics like Blade Runner and, more recently, HBO’s Westworld, where robots that are indistinguishable from humans escape the sheltered theme-park world they were created for, causing havoc.
Human or robot? This question has become a popular pastime for sci-fi enthusiasts, with HBO’s Westworld being the latest example
Guessing whether something is artificially generated has a storied history, going back to British mathematician Alan Turing at the dawn of the modern computing era in the 1950s. He proposed a parlor game, which he called the “Imitation Game” (note that the biopic of the same name had less to do with AI and more to do with his code-breaking work during WWII). Today we refer to it as the “Turing test”. It involves sending messages to entities behind two curtains — one hiding a computer and one a human — and reading their responses. If you can’t tell which curtain hides the human and which the computer solely on the basis of the text messages, then we would say that the computer has passed the Turing test.
Previous attempts by AI to write text usually succeeded only on a very small scale — after a few sentences it became pretty obvious that the “author” didn’t really understand the content of what it was writing.
Eliza was one of the first textual conversation tools that made waves at the MIT AI Lab
One of the earliest examples that made waves was Eliza, a natural language processing “digital psychiatrist” built at the MIT AI Lab in the 1960s. Eliza would ask the user questions and then use some clever pattern matching on the user’s replies to make statements and ask more questions, not unlike a real therapist. For example, you might say “I am going to see my mother” and it might respond “How do you feel about your mother?”, which might lead to a whole conversation. Of course, over more than one short conversation, or one long conversation, you could quickly realize that the questions were all formulated in the same way, revealing the underlying pattern matching algorithm one question at a time.
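To make the idea concrete, here is a minimal sketch of Eliza-style pattern matching. This is not Weizenbaum’s original script — the rules and templates below are invented for illustration — but it shows the basic trick: match the user’s input against a list of regular expressions and reflect a captured fragment back as a question.

```python
import re

# Toy Eliza-style rules (illustrative, not the original script):
# each rule pairs a regex with a response template that echoes
# part of the user's input back as a question.
RULES = [
    (re.compile(r"\bmy (mother|father|family)\b", re.IGNORECASE),
     "How do you feel about your {0}?"),
    (re.compile(r"\bI am going to (.+)", re.IGNORECASE),
     "Why do you want to {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE),
     "Why do you feel {0}?"),
]

def respond(user_input: str) -> str:
    """Return the first matching rule's reflected question."""
    for pattern, template in RULES:
        match = pattern.search(user_input)
        if match:
            return template.format(*match.groups())
    return "Please tell me more."  # generic fallback

print(respond("I am going to see my mother"))
# → How do you feel about your mother?
```

Because every reply is stamped out of the same small set of templates, a few exchanges are enough to expose the machinery — exactly the weakness described above.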
More recently, GPT-3’s predecessor, GPT-2, was used by an engineer to build a text adventure game, à la the old Zork games. AI-generated content would seem to enable endless adventures, but when you played it, while the individual text snippets held up well, they didn’t coalesce, and it became obvious that there wasn’t much connection between what you had read previously and what you read in the next “room” of the adventure. It was like a novel with different paragraphs written by different people. Still, it was significant progress over previous attempts.
GPT-3, on the other hand, goes beyond simple pattern matching: it has been trained using modern machine learning techniques on billions of snippets of text. You give it a prompt of any length, and it generates a variable-length response — even up to a whole essay — selecting and combining the “best” bits of text that seem most relevant and putting them together in the way that seems most “optimal”. The definition of “optimal” is of course the whole point of training the neural network, and this latest incarnation does a better job of placing snippets of text near each other that seem to “fit”. The longer the prompt, the better the “fit” is likely to be.
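The prompt-to-completion loop can be sketched with a deliberately tiny stand-in model. The bigram counter below is nothing like GPT-3’s transformer (and the corpus is a made-up toy), but the interface is the same shape: “train” on text, then extend a prompt one word at a time until a stopping point.

```python
import random
from collections import defaultdict

# A toy corpus standing in for GPT-3's billions of text snippets.
corpus = ("the robot wrote the essay and the essay held together "
          "and the robot told us not to fear the robot").split()

# "Training": count which word follows which.
followers = defaultdict(list)
for word, nxt in zip(corpus, corpus[1:]):
    followers[word].append(nxt)

def complete(prompt: str, max_words: int = 8, seed: int = 0) -> str:
    """Extend the prompt one word at a time, like a (very) small
    language model. GPT-3 predicts tokens with a huge neural network
    instead of a lookup table, but the loop is the same idea."""
    rng = random.Random(seed)
    words = prompt.split()
    for _ in range(max_words):
        options = followers.get(words[-1])
        if not options:  # no known continuation: stop early
            break
        words.append(rng.choice(options))
    return " ".join(words)

print(complete("the robot"))
```

The quality gap between this sketch and GPT-3 comes entirely from what replaces the lookup table: a network trained to judge which continuation “fits” best given the whole prompt, not just the last word.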
Nevertheless, the initial alarm that the Turing test had been passed may have been unfounded. GPT-3 is just the latest in a series of AI advances to cause speculation about whether the test has been passed. Upon closer examination of the Guardian’s methods, it was revealed that the editors generated eight different essays, selected the best parts of each, and then edited the result, just as a human-written op-ed would have been edited for style, continuity, etc. This makes many AI watchers skeptical that a full, unedited essay generated by the engine would hold together.
Technology news today is laden with AI-related announcements, and speculation about passing the Turing test now goes beyond simple text messages (which Turing envisioned would be exchanged via teletype machines in his original version) to voice and even video.
In 2018, there was similar sentiment about Google’s Duplex having passed the Turing test. Using its voice generation capabilities, Google was able to make automated phone calls — to hair salons, for example — to make appointments for you. The vocal articulation was so good, including fillers like “umm” and “ahh”, that it was difficult for the salons to know it was a robot and not a human on the other end. In the end, Google agreed to state at the beginning of each call that it was being made by a computer and not a human. Was this a passing of a “vocal” Turing test? There is some debate over this, since it didn’t involve a general-purpose conversation, just a very narrow one for a specific purpose (making an appointment). But it did go a long way toward showing that computers are well on the path to becoming indistinguishable from humans in multiple ways.
Surely, most of us think, if we did away with the curtains, then we could tell visually which was the human and which was the AI, correct? Well, maybe not, as virtual characters become more and more realistic.
The virtual model Mia created by Christian Guernelli
In 2018, a Chinese news organization released its two virtual news anchors, which looked, well, pretty much human while reading the news, complete with facial expressions, verbal flaws and gestures. More recently, virtual influencers like Lil Miquela, generated using the same techniques used in video games, have millions of followers waiting for their videos and images on Instagram and YouTube. In fact, the LA Times recently wrote about the emergence of a new type of modeling agency, with no “organic” supermodels on the roster, and posited that the next Gigi, Kaia or Kendall just might be digital.
All of these virtual characters are created using the same CGI techniques that video games and Hollywood use for special effects. And while the latest virtual characters are becoming harder and harder to distinguish from real people, in all of these cases — newscasters, virtual influencers, digital supermodels and CGI characters in films — they follow a pre-determined script. This means that even if the visual fidelity improves in the next few years to the point where we can’t distinguish between virtual and real people, they technically won’t be passing the Turing test, because that would require interactivity.
We’re not there yet, but we are getting closer. Each year’s advances move us further along the path of passing the Turing test — textually, vocally, visually and perhaps even, eventually, physically. A physical Turing test would be passed when you couldn’t distinguish between an actual physical robot and a human — no curtains needed — and you might say that the replicants of Blade Runner and the “hosts” of Westworld both pass this kind of “physical” Turing test.
Since text was the “original” format of Turing’s famous imitation game, GPT-3 is a big step in this direction, as is the Guardian’s op-ed (even if some editing was required afterwards). More stories may soon be published that appear to have been written by a human but were, in actuality, written by a robot.
Thank goodness, you must be thinking, that this article was obviously written by a human.
But, then again, given what I have just been telling you, can you be completely sure?
Rizwan Virk is a venture capitalist, founder of Play Labs @ MIT, and the author of “The Simulation Hypothesis: An MIT Computer Scientist Shows Why AI, Quantum Physics and Eastern Mystics Agree We Are in a Video Game.” Follow him via his website at www.zenentrepreneur.com or on Twitter @rizstanford.