Imagine you've been researching a new machine learning tool. You believe it will help your organization and are impressed by the vendor's claims. You sit down for a meeting with their sales manager, eager to find out how it works and what it can do for you.
The sales rep understands your needs and is happy to explain how the software can help. But when you ask how the software arrives at its conclusions, the answer seems vague. The rep tells you that the software analyzes hundreds of different factors in the data. You ask a few more questions but get the same answer: it weighs hundreds of different factors. Is the sales rep being evasive? Do they not understand it themselves?
Many ML models are known as "black boxes." Data flows in, and information or predictions flow out, but what's happening inside remains obscure. You can evaluate how good the model is based on whether the information flowing out accurately matches real-world conditions, but you can't see how the software produces that information.
To understand why this is, we need to talk about how ML modeling works.
As Sergey Mastitsky lays out in an excellent article explaining machine learning, there are two types of ML and AI modeling: data modeling and algorithmic modeling. Data modeling takes a stochastic (i.e., probability-driven) approach; it uses a relatively small set of factors to predict an event's likelihood, so it's relatively easy to understand how a data model reaches its conclusions.
Algorithmic modeling uses an arbitrarily complex method (essentially, a method that can be as complex as it needs to be) to model the real world. Algorithmic models can employ any strong correlations they find in the data to predict future events, even if those correlations have no obvious logical connection to the outcome they're predicting. If the model predicts events correctly, it's a good model, even if it can't explain why those events are happening.
Both methods have strengths and weaknesses. Stochastic modeling is simple and easy to understand and interpret — you get the "why?" along with the "what?" — but it's usually not very accurate. Algorithmic modeling can be much more accurate because there are no limits placed on its complexity. But algorithmic models are often black boxes, and even when you can see how they make predictions, those predictions might not tell you anything about how the factors they look at are related in the real world.
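To make the difference concrete, here's a minimal sketch using scikit-learn on synthetic data (the dataset, model settings, and numbers are purely illustrative, not drawn from Mastitsky's article). The logistic regression plays the role of a data model, with a small set of coefficients you can read off; the random forest plays the role of an algorithmic model, often more flexible but with no comparably simple explanation:

```python
# Sketch: an interpretable "data model" vs. a flexible "algorithmic model".
# Uses scikit-learn on a synthetic dataset; all names and settings are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Data model: a handful of coefficients you can read off and reason about.
stochastic = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("logistic regression accuracy:", stochastic.score(X_test, y_test))
print("per-feature coefficients:", stochastic.coef_[0][:5], "...")

# Algorithmic model: hundreds of decision trees, often more accurate, but there
# is no small set of weights that explains how it reaches its predictions.
black_box = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_train, y_train)
print("random forest accuracy:", black_box.score(X_test, y_test))
```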
To illustrate how algorithms can make valid conclusions in unexpected ways, Mastitsky uses an example from Tyler Vigen's hilarious collection of spurious correlations. Some time ago, it was noticed that per capita mozzarella cheese consumption correlates with the number of civil engineering doctorate degrees awarded in the USA. As mozzarella cheese consumption goes up, so does the number of CE doctorates.
What does this correlation mean? Most likely, absolutely nothing. Mozzarella isn't making people study civil engineering, and CE isn't making people eat mozzarella. They're just two numbers that happen to line up for reasons we don't really understand. However — and here's the important part — if you had a lot of data about cheese consumption, you could still use it to help predict CE doctorates, even though the two things have nothing to do with each other.
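You can see this in a few lines of code. The sketch below uses made-up numbers, not Vigen's actual data: two quantities that both drift upward over a decade end up almost perfectly correlated, and a one-variable fit of one on the other "predicts" it quite well despite there being no causal link between them.

```python
# Sketch: two unrelated quantities that both trend upward will correlate,
# and either one can still be used to "predict" the other.
# The numbers are synthetic, not Tyler Vigen's real cheese/doctorate data.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(2000, 2010)
cheese_lbs = 9.0 + 0.25 * (years - 2000) + rng.normal(0, 0.1, len(years))
ce_doctorates = 480 + 40 * (years - 2000) + rng.normal(0, 15, len(years))

print("correlation:", np.corrcoef(cheese_lbs, ce_doctorates)[0, 1])

# A one-variable linear fit of doctorates on cheese consumption predicts well,
# even though neither quantity causes the other.
slope, intercept = np.polyfit(cheese_lbs, ce_doctorates, 1)
predicted = slope * cheese_lbs + intercept
print("mean absolute error:", np.mean(np.abs(predicted - ce_doctorates)))
```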
The goal of algorithmic modeling isn't to explain nature but to predict it. The real world is incredibly complex, with an unlimited number of factors interacting in ways we can't fully track. Algorithmic models are helpful precisely because they can model systems we don't completely understand.
Like human decision-makers, algorithmic models are always working with incomplete information. To make up for this, they look at previous patterns to decide on the best course. What's important is not how the decision-maker arrives at a decision, but whether it's the right decision.
And just as two people can come to the same decision for entirely different reasons, two algorithmic models can use completely different factors and make correct predictions. In statistics, this is called "multiplicity of models."
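Here's a rough sketch of what that looks like in practice, again on synthetic data: when a dataset contains overlapping signals, two models trained on completely different subsets of features can score about the same on held-out data.

```python
# Sketch: "multiplicity of models" -- two models built on completely different
# features can end up roughly equally accurate. Synthetic data, illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Many informative and partly redundant features, so more than one feature
# subset carries enough signal to predict the label.
X, y = make_classification(n_samples=3000, n_features=20, n_informative=10,
                           n_redundant=8, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

model_a = RandomForestClassifier(random_state=1).fit(X_train[:, :10], y_train)
model_b = RandomForestClassifier(random_state=1).fit(X_train[:, 10:], y_train)

print("model A (features 0-9):  ", model_a.score(X_test[:, :10], y_test))
print("model B (features 10-19):", model_b.score(X_test[:, 10:], y_test))
```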
In most business decisions, the "why?" isn't nearly as important as the "what?" For example, if you're doing supply chain logistics, it's very useful to predict changes in demand so you can allocate supplies accordingly. Why demand is changing doesn't matter to your use case so long as you can make accurate predictions; or rather, it only matters to the extent that it helps you make better predictions and decisions.
However, there are situations where you do need to understand the "why?" For example, if your clients start switching over to a competitor, you need to know why, so you can earn those clients back.
Algorithmic models can still be useful in these situations because they can be designed to look for factors human analysts know are relevant. For example, algorithms can comb through reviews, articles, and comments about your new product and tell you what sentiments users are expressing, and they can track pricing, company reputation, and other factors that influence consumer decisions.
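As a toy illustration of turning raw text into a trackable signal, here's a deliberately simplistic keyword-based sentiment scorer. Real sentiment models are far more sophisticated than this; the word lists and reviews below are invented for the example.

```python
# Sketch: a toy keyword-based sentiment scorer, only to show the idea of
# converting review text into a numeric signal you can track over time.
# The word lists and example reviews are made up.
import re

POSITIVE = {"love", "great", "reliable", "fast", "recommend"}
NEGATIVE = {"slow", "broken", "expensive", "disappointed", "cancel"}

def sentiment_score(text: str) -> int:
    """Return (# positive words) - (# negative words) for one piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

reviews = [
    "Love the new dashboard, fast and reliable",
    "Support was slow and I was disappointed, may cancel soon",
]
for review in reviews:
    print(sentiment_score(review), "|", review)
```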
Additionally, developers and data scientists are getting better at designing models that aren't black boxes, using tools that expose their inner workings to stakeholders. Over time, interpretable machine learning tools like LIME (Local Interpretable Model-Agnostic Explanations) will make it easier for experts and non-experts alike to observe sophisticated ML models' inner workings.
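For a flavor of what that looks like, here is a rough sketch using the open-source `lime` package with a scikit-learn classifier on synthetic data (the feature names are made up for illustration). LIME fits a simple local surrogate model around a single prediction and reports which factors pushed that prediction toward one outcome or the other.

```python
# Sketch: asking LIME to explain one prediction from a black-box model.
# Assumes the open-source `lime` package (pip install lime); data and
# feature names are synthetic and purely illustrative.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=6, n_informative=4,
                           random_state=0)
feature_names = [f"factor_{i}" for i in range(X.shape[1])]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

black_box = RandomForestClassifier(random_state=0).fit(X_train, y_train)

explainer = LimeTabularExplainer(X_train, feature_names=feature_names,
                                 class_names=["no", "yes"], mode="classification")
# Explain a single prediction: which factors pushed it toward "yes" or "no"?
explanation = explainer.explain_instance(X_test[0], black_box.predict_proba,
                                         num_features=4)
for feature, weight in explanation.as_list():
    print(f"{feature:>25}  {weight:+.3f}")
```

The resulting weights don't fully describe the forest, but they give a non-expert something concrete to sanity-check for an individual decision.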
Business has always been full of unknowns. Every manager and worker at your organization has their own unique way of thinking and making decisions. You learn to trust them by looking at their track record, talking to people who know them, and working with them yourself. An ML model is just another kind of entity working for your business. And as more business executives learn how helpful and dependable these entities can be, it will become easier to trust them — even if you don't fully understand their inner workings.
Want to learn more about Bitvore? Download our latest white paper: Bitvore Cellenus Sentiment Scoring.