On the Eve of New Turing Honors, Is His Famous Test Still Relevant?


Computer scientists rejoiced on March 25th, 2021, when the Bank of England unveiled the design of the new 50-pound note honoring Alan Turing, a pioneer in the fields of cryptography and computing technology.

Turing’s contributions to the field were enormous, even path-breaking. His signature wartime achievement was the design of the Bombe, an electromechanical device that helped codebreakers at Bletchley Park decipher German radio messages encrypted with the Enigma cipher machine. With the intelligence this produced, Allied forces were able to anticipate the movements of the Nazi war machine, a breakthrough that by some estimates may have shortened the Second World War by up to four years.

 

In addition to this landmark accomplishment, Turing is known for a number of milestones in computer technology. For example, the Turing machine is an early mathematical model of a computing system that was used to inform real-life computers once they became viable. Turing machines are still widely studied—and constructed for fun in software programs and card games—but another idea of Turing’s may be slowly falling out of date.

 

The Turing Test—and its Critiques

 

The Turing test might be one of Alan Turing’s best-known yet, paradoxically, least successful ideas. Back in 1950, as the world of computing was beginning to advance beyond the realm of theory, people began to ask whether computers would ever be able to think as well as a human.

 

Since “thought” is an abstract concept that’s very difficult to define, even when humans do it, Alan Turing instead proposed a test:

 

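First, place a human interrogator alone in a room. In a separate room, place a human and a computer. (Turing’s original “imitation game” began with a man and a woman, with the machine then taking the place of one of them.) The interrogator puts typed questions to each hidden participant and tries to guess which one is the machine based on how their answers are constructed. If the interrogator can’t reliably tell them apart, then, the argument goes, the machine can be said to think as well as a human.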

 

The Turing test has fascinated (and enraged) computer scientists since its inception. In fact, the Chinese Room thought experiment, which we’ve already covered on this blog, was created in direct response to the Turing test. 

 

Without going over it in its entirety, the Chinese Room thought experiment argues that it’s possible to create a computer program that perfectly mimics human tone and expression when questioned, but that the resulting program would neither understand what it says nor achieve human-level intelligence. That’s because the computer in this scenario only acts as a sort of reflex machine, following rules that spit out response X when it receives input Y. Instead of an intelligent supercomputer, you’re dealing with a sort of parrot.

 

Given that the Turing test has faced strong criticism almost since its debut, how has it fared in the decades since?

 

The Turing Test in 2021—Passing or Failing?

 

Ever since the Turing test debuted, people have been creating computer programs that try to pass it. One of the first of these was ELIZA, a simple program designed to simulate a therapist’s side of a conversation with a patient. ELIZA had only a few different outputs and would mostly parrot rephrased versions of the patient’s responses back to them (“How are you today?” “I’m sad.” “Why do you feel sad?” etc.). Although it was a very simple program that quickly began repeating itself, those who interacted with it often ascribed to it human emotions and intentions that it could not possibly have had.
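
The “parrot it back” behavior described above can be illustrated with a minimal sketch. This is not ELIZA’s actual script: the rules, the fallback line, and the `respond` function are hypothetical names made up for illustration, and the real program used a much richer set of patterns.

```python
import re

# Hypothetical, heavily simplified rules. Each pairs a pattern that captures
# part of the user's statement with a template that echoes it back.
RULES = [
    (re.compile(r"\bi['’]?m (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
]
FALLBACK = "Please go on."  # used when no rule matches


def respond(utterance: str) -> str:
    """Return a canned reply built from the first rule that matches."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Reuse the captured words; nothing here models meaning.
            return template.format(match.group(1))
    return FALLBACK


print(respond("I'm sad."))  # prints "Why do you feel sad?"
```

Nothing in the sketch represents what “sad” means. It simply copies the user’s own words into a template, which is why a program like this starts repeating itself so quickly.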

 

Since ELIZA, many other computer scientists have tried their hand. There’s even an annual contest where makers of these programs try fooling human judges. In 2014, at an event held at the Royal Society, a chatbot called Eugene Goostman was declared to have succeeded after convincing judges that they were talking to a 13-year-old Ukrainian boy instead of a machine. This success, however, comes with an asterisk: the program fooled only 33% of the judges, and it did so in part by convincing them that it spoke English as a second language.

 

Right now, natural language AI systems are commonplace: you run into them on the phone, in tech support forums, and when booking a reservation at a restaurant. Many of them are advanced enough that they could fool an inattentive human, but we would never call what they do “thinking.” In this sense, it’s safe to say that the Turing test fails as a measure of intelligence. It is possible to create a machine that fools humans into believing it’s a human, but the ability to mimic speech isn’t necessarily correlated with human-level intelligence.

 

Turing may have been onto something nonetheless. GPT-3, a generative text program, is trained using an approach to machine learning known as unsupervised learning. It’s fed a vast unlabeled corpus of text and then, when given a prompt, predicts the words that should come next. It’s scarily good. Gwern Branwen, an independent researcher who has written extensively about GPT-3, says that “past a certain point, that [improvement at prediction] starts coming from logic and reasoning and what looks entirely too much like thinking.”
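
To make “predicts the words that should come next” concrete, here is a toy sketch that learns from unlabeled text by counting which word follows which. It is a stand-in only: GPT-3 uses a very large neural network rather than word-pair counts, and the corpus and the function names `train_bigrams` and `predict_next` are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus: str) -> dict:
    """Count, for every word, which words follow it in the raw text."""
    counts = defaultdict(Counter)
    words = corpus.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts: dict, word: str):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# A made-up corpus with no labels: the text itself is the only training signal.
corpus = "the machine can think . the machine can talk . the machine can think"
model = train_bigrams(corpus)
print(predict_next(model, "can"))  # prints "think"
```

The training signal comes entirely from the text, with no human-written labels, which is the sense in which the learning is unsupervised. The difference between this toy and GPT-3 is largely one of scale and model capacity.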

 

In other words, fooling a human doesn’t mean that a computer has human-level intelligence. But the more we train computers on human language, the more their intelligence may increase.

 
