What Is GPT-3 Actually Doing?


At the moment, GPT-3 looks like one of the most important innovations in artificial intelligence. It is a scarily good text generation algorithm, able to compose entire essays from a single-sentence prompt (something that can be hard even for an ordinary human being, as this writer will tell you). Gwern Branwen, an independent researcher who has experimented extensively with the model, is even quoted as saying that this level of fluency represents an essential precursor to what looks like intelligent thought.

In other words, if an intelligent human-level AI ever exists in the future, it is likely to claim GPT-3 as one of its distant ancestors. This is very exciting, and we don’t want to throw cold water on anyone’s dreams. With that said, the distance between GPT-3 and an actual, human-level artificial intelligence is still vast.


Understanding the Gulf Between GPT-3 and Thought


To understand the difference between GPT-3 and human thought, you should understand how GPT-3 works. Let’s start with the name: Generative Pre-trained Transformer. Generative means that it generates text. Pre-trained means that it has already been trained on a large body of text before you ever give it a prompt, using a kind of ML training known as unsupervised learning. Transformer is the name of the neural network architecture it is built on.


Unsupervised learning is distinct from supervised learning. Suppose you want to train your algorithm to create a sonnet using supervised learning. To do that, you must first show your algorithm a large set of carefully labeled examples (this is a sonnet, this is not) and then correct its output against those labels until it creates only sonnets.


GPT-3, which is based on unsupervised learning, has never been told what a sonnet is, but if you ask it to create a sonnet, it can do that, sort of. It has never been specifically trained to create sonnets. Still, it has an idea of what they are because it’s been fed a vast and uncategorized corpus of human language (which included sonnets), and it is able to infer their patterns from those examples. This is what’s meant by unsupervised learning. Because GPT-3 has been exposed to so many writing examples, it can predict the words that should come next when given a prompt. That’s all it does.
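To make "predict the words that should come next" concrete, here is a minimal sketch in Python. It is a toy word-pair counter, not GPT-3's actual method: the corpus, the function names, and the choice of a bigram model are all assumptions made for illustration. What it shares with GPT-3 is the basic idea of learning from unlabeled text and then extending a prompt one predicted word at a time.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count, for every word, which words follow it in the unlabeled text."""
    followers = defaultdict(Counter)
    words = corpus.lower().split()
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers

def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

def generate(followers, prompt, max_words=5):
    """Extend the prompt one predicted word at a time, up to max_words."""
    output = prompt.lower().split()
    for _ in range(max_words):
        next_word = predict_next(followers, output[-1])
        if next_word is None:
            break
        output.append(next_word)
    return " ".join(output)

# Hypothetical toy corpus; nobody labeled any of it.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train_bigram_model(corpus)
print(generate(model, "the dog", max_words=3))
```

Nothing in the corpus was tagged or categorized, yet the model can still continue a prompt. It also cannot do anything without a prompt, and a word it has never seen stops it cold.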


What Are the Limitations of GPT-3?


If you work with GPT-3 long enough, the seams begin to show. The first thing you need to understand is that GPT-3 always needs a prompt of some kind to get started, and the prompt needs to be fairly detailed. For the sonnet example above, the researcher gave the model three starting lines:


A sonnet by Marcus Christian:

 I ply with all the cunning of my art

 This little thing, and with consummate care


These lines contain a lot of information: the fact that the algorithm needs to write a sonnet, the author of the sonnet (so that the algorithm can imitate his style), and two opening lines. In short, GPT-3 needs a relatively large amount of context before it can get going.


Second, GPT-3 works by predicting the words that should come next once given a prompt. The longer the writing sample it generates, the more likely it is that some prediction goes wrong. Once the program makes one wrong guess, every later guess builds on that mistake, and the algorithm’s output can devolve into nonsense. GPT-3 also has a built-in limit on how much text it can produce in one go, which makes it less likely to veer far off-topic.
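A rough way to see why longer outputs are riskier is to treat each predicted word as an independent guess with a fixed chance of being acceptable. Both the independence and the 99% figure below are simplifying assumptions for illustration, not measurements of GPT-3. Under those assumptions, the chance that an entire passage stays on track shrinks quickly with length.

```python
def chance_still_on_track(per_word_accuracy, num_words):
    """Probability that every one of num_words guesses is acceptable,
    assuming each guess succeeds independently with the same probability."""
    return per_word_accuracy ** num_words

# Assumed, illustrative accuracy of 99% per word.
for num_words in (10, 100, 1000):
    probability = chance_still_on_track(0.99, num_words)
    print(f"{num_words:>5} words: {probability:.4%} chance of no wrong guess")
```

Even at 99% per word, a 100-word passage has only about a one-in-three chance of containing no wrong guess in this simplified model, which is one way to read the appeal of a cap on output length.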


Lastly, GPT-3 falls victim to an ugly problem that’s common to many unsupervised learning projects. When its training corpus includes the unfiltered internet, the algorithm absorbs examples of racist, hateful, and biased text, and it can reproduce them in its output.


Where Does GPT-3 Go from Here?


Although GPT-3 is still more of a gimmick than a thinking machine, the real question is what happens next. The fact of the matter is that GPT-3 hasn’t been trained using any particularly innovative techniques: its architecture is not new, and it has a small memory window, so it doesn’t remember much of what it’s already written.
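The "small memory window" can be pictured as a sliding window over the text. The sketch below is an illustration under stated assumptions: it splits on whitespace where a real model uses subword tokens, the window size of 8 is made up, and `visible_context` is a hypothetical helper rather than part of any GPT-3 interface.

```python
def visible_context(tokens, window_size):
    """Return only the most recent window_size tokens; anything older is dropped."""
    if window_size <= 0:
        return []
    return tokens[-window_size:]

# Hypothetical running text: the prompt plus what the model has written so far.
written_so_far = (
    "A sonnet by Marcus Christian : I ply with all the cunning of my art "
    "This little thing , and with consummate care"
).split()

# With a toy window of 8 tokens, the title and author have already fallen out of view.
window = visible_context(written_so_far, window_size=8)
print(window)
print("Marcus" in window)
```

Anything outside the window cannot influence the next prediction, which is why instructions given early in a long passage can be forgotten later on.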


In other words, there’s huge potential to improve GPT-3 by giving it more computing power, a better architecture, and more memory. At that point, what happens to the flaws in GPT-3? Do they iron themselves out? Does the algorithm get exponentially smarter? Or do you hit diminishing returns?


The next few years are going to be very interesting as we wait to find out.

