Let's start by recapping one of the most famous thought experiments in the philosophy of artificial intelligence. Imagine yourself in a locked room. Periodically, someone slips a list of Chinese characters under the door. You don't speak Chinese, but you have a list of instructions: something like, if you see character X, write character Y in response, unless it's next to character Z. Using these instructions, you can compose appropriate Chinese phrases and slip them back out under the door, fooling your correspondent into believing that there's a fluent Chinese speaker inside.
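To make the mechanism concrete, here is a minimal sketch in Python of the kind of purely syntactic rule-following the room relies on. The symbols and the rule table are arbitrary stand-ins invented for illustration, not a real phrasebook.

```python
# A toy illustration of the Chinese Room: the "room" follows purely syntactic rules.
# The symbols and rule table below are arbitrary stand-ins, not a real phrasebook.

# Each rule: respond to a symbol with a default reply, unless a listed neighboring
# symbol appears next to it, in which case use the exception instead.
RULES = {
    "你": {"default": "好", "unless_next_to": {"们": "们好"}},
    "吗": {"default": "是", "unless_next_to": {}},
}

def respond(message: str) -> str:
    """Compose a reply by matching symbols against rules, with no grasp of meaning."""
    reply = []
    for i, symbol in enumerate(message):
        rule = RULES.get(symbol)
        if rule is None:
            continue  # no instruction for this symbol, so skip it
        neighbors = message[max(i - 1, 0):i] + message[i + 1:i + 2]
        for neighbor, exception in rule["unless_next_to"].items():
            if neighbor in neighbors:
                reply.append(exception)
                break
        else:
            reply.append(rule["default"])
    return "".join(reply)

print(respond("你吗"))  # a plausible-looking reply the "room" does not understand
```

The function slips back something that looks like an answer, but nothing in it represents what the symbols actually mean.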
The upshot of the argument, known as John Searle's Chinese Room, is this: it is possible to build an advanced computer system that provides intelligent responses to complicated questions, but it is much harder, if not impossible, for that system to understand the questions and the responses it provides. In short, instead of creating linguistic savants with software, we can only program advanced mimics.
To date, no advanced AI systems have been able to disprove this argument, which has led to some difficulties in the practical application of AI.
Why Is It Important That AI Understands Its Own Output?
There are three problems, as we see it, with practical applications of AI that follow this sort of Chinese Room model.
First, the output of the AI may be comprehensible to a native speaker of the language, but that doesn't prevent the AI from answering the question incorrectly. For example, if you ask your Google Home whether it's safe to feed bell peppers to your dog, it will give you an answer you can understand, but that answer isn't necessarily correct – it's just whatever the top Google Search result happens to be. If the top search result for "can I feed my dog bell peppers?" isn't a reputable source, you might end up with an expensive vet visit.
This leads us to the second problem: people tend to over-trust AI's capabilities, especially when it can respond to their natural language input. For example, consider the number of people arrested for driving their Teslas on Autopilot while asleep or drunk. Although Autopilot is only partially autonomous, an alarming number of people take artificial intelligence offerings entirely at face value.
Lastly, the Chinese Room, like many AI systems presently in use, is a closed system. When these systems make mistakes, it can be difficult to diagnose exactly how the mistake was made. If an AI generates output based on biased or incomplete training data, it can take months to detect and mitigate the error.
Can We Solve the Chinese Room by Adding Common Sense to AI?
Let's assume that there's no way, as of yet, to allow AI to comprehend its input and output. Given this constraint, can we still solve the problems with practical AI? Per the section above, this would involve giving AI the ability to check the accuracy of its answers, contextualize its responses so that users don't over-trust the output, and provide transparency into its decision-making process.
Some efforts to fix these problems involve giving AI a common-sense understanding of the world. That is to say, the AI should have an understanding, below the level of sentience, of the relationships between the objects and information that exist in the world. Common sense is difficult to define, but it is often described as the set of things that an average human being learns before turning four years old.
Using the power of common sense, natural language AIs might be able to:
- Answer a large variety of simple questions without using external references
- Pick the most accurate source if an external reference is required
- Navigate both language and the physical world in a more trustworthy manner
- Recognize and respond when a user over-trusts the AI's capability
- Quickly provide its reasoning when asked how it came up with an answer
Right now, even the most advanced AI is only beginning to approach common sense.
Previous efforts to give AI common sense have taken two approaches. In the first, researchers create a list of facts about the world, and the AI relies on it as a sort of database. This results in a more accurate AI, but the database approach doesn't scale: there are too many facts for researchers to write down everything that an AI needs to know.
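As a rough illustration of this first approach, consider a hand-curated store of facts. The triples and the query function below are hypothetical, invented for this sketch rather than taken from any real knowledge base.

```python
# A minimal sketch of the hand-curated knowledge-base approach to common sense.
# The facts and schema are hypothetical, invented for illustration.

# Facts are stored as (subject, relation, object) triples written by researchers.
FACTS = {
    ("bell pepper", "is_safe_for", "dog"),
    ("chocolate", "is_toxic_to", "dog"),
    ("dog", "is_a", "animal"),
}

def query(subject: str, relation: str, obj: str) -> bool:
    """Answer only what has been explicitly written down; everything else is unknown."""
    return (subject, relation, obj) in FACTS

print(query("bell pepper", "is_safe_for", "dog"))  # True: someone hand-entered this fact
print(query("carrot", "is_safe_for", "dog"))       # False: nobody wrote this fact down
```

The second query shows the scaling problem: any fact that no researcher thought to enter simply doesn't exist as far as the system is concerned.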
The second approach involves a neural network. Trained on a large corpus of language, programs like GPT-2 can piece together surprisingly readable text. Their output can be inconsistent, however, and the network often misses commonsense inferences. Therefore, although neural networks are scalable, they can be much less accurate.
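For a sense of what the second approach looks like in practice, here is a minimal sketch using the open-source Hugging Face transformers library, assuming it and a backend such as PyTorch are installed; the prompt is an arbitrary example.

```python
# A minimal sketch of the neural-network approach: a pretrained language model
# produces fluent text, but nothing guarantees that the text is factually correct.
# Assumes the Hugging Face `transformers` library and PyTorch are installed.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "It is safe to feed a dog bell peppers because"
outputs = generator(prompt, max_length=50, num_return_sequences=1)

# The continuation will usually read naturally, but it may assert things that are false.
print(outputs[0]["generated_text"])
```

The model will complete the sentence fluently whether or not the claim it is completing happens to be true, which is exactly the accuracy problem described above.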
Lastly, there's a new approach called COMET, which combines a small database of known facts with a neural network that tries to extrapolate these facts into conversational language. This approach has promising results—almost 80% of its responses were understood as intelligible by human evaluators.
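The sketch below gestures at the COMET-style recipe rather than reproducing it: seed triples from a knowledge graph are serialized into text so that a pretrained language model can be fine-tuned to complete unseen head/relation queries with plausible tails. The triples, relation names, and formatting here are simplified placeholders, not COMET's actual data or code.

```python
# A simplified sketch of the COMET-style idea: serialize seed knowledge triples as
# text, fine-tune a pretrained language model on them, then ask it to extrapolate.
# The triples, relation names, and formatting are placeholders, not COMET's real code.

SEED_TRIPLES = [
    ("person drinks coffee", "effect_on_person", "person feels more awake"),
    ("person forgets their wallet", "person_reaction", "person feels embarrassed"),
    ("person goes to the dentist", "effect_on_person", "person's teeth get cleaned"),
]

def serialize(head: str, relation: str, tail: str = "") -> str:
    """Turn a knowledge triple into the text format the language model sees."""
    return f"{head} <{relation}> {tail}".strip()

# Training corpus: known facts written out as text for fine-tuning.
training_corpus = [serialize(*triple) for triple in SEED_TRIPLES]

# Inference: give the fine-tuned model an unseen head and relation, and let it
# generate the tail, extrapolating beyond what the database explicitly contains.
unseen_query = serialize("person misses the bus", "person_reaction")

print(training_corpus)
print(unseen_query)  # the model would be asked to complete this with a plausible tail
```

Because the model starts from a pretrained network, it can phrase its extrapolations as conversational language rather than rigid database entries, blending the strengths of the two earlier approaches.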
COMET isn't yet ready to deliver most of the advantages of common sense, but it's starting to become good enough. Greg Bolcer, Chief Data Officer at Bitvore, says, "Common sense is very important to our organization. Customers want to know that there's a common sense reason that our product decided that a topic has financial impacts on a state, city, or company. Instead of having a perfect answer every time, which is nearly impossible, providing a 'good enough' answer and the common sense reasoning behind it transforms AI into a tool that's more suitable for everyday use."
In other words, AI isn't yet able to understand the language that it's producing, and it might never be. By accepting this limitation and substituting common sense for sentience, we end up with AI products that can still have a huge impact on our lives.
Want to learn more about Bitvore? Download our latest white paper: Using Sentiment Analysis on Unstructured Data to Identify Emerging Risk.