What is a model in Machine Learning?
Data scientists and Machine Learning (ML) software developers constantly use the term ‘model’. They take this term for granted, as if its meaning were obvious and required no explanation. They say “train a model”, “build a model”, “create a model”, “predictive model”, “classification model”, and many similar phrases, without considering that a novice, or anyone new to this field, won’t know what the word ‘model’ means in this context. Often, even a developer who uses machine learning and artificial intelligence algorithms cannot answer this seemingly simple question: what is a model?
Let's clarify what a model is in Machine Learning
Do concepts like a model of the solar system or a model of the pyramid of Giza cause you any difficulty? Probably not. You intuitively perceive that, in these cases, the model is a kind of schematic image or figure that is somewhat similar to a real object.
In other words, a model is a kind of simplified version of a real object.
This intuitive view of the term ‘model’ is correct. Reality is very complex; it depends on countless known and unknown parameters and on random events that we cannot predict. A model is a formal way to simplify reality into something we can work with.
When it comes to AI predictions, the model must be accurate enough to produce relevant predictions, yet simple enough that our computational capabilities are sufficient to run it. Thus, the complexity of a model is always a compromise between accuracy and performance.
Model Selection in Machine Learning
Suppose we are faced with the task of predicting tomorrow's temperature in Toronto. The easiest way is to assume that the temperature tomorrow will be equal to the temperature today. This is the simplest model, and it considers only one parameter. Although this approach sometimes works, it is not an accurate model. If you add air pressure, the model becomes a little more accurate, but also more complicated. If you also add the wind direction and the temperature in neighboring regions, it becomes more complex still, but more accurate.
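As a minimal sketch of the simplest model described above, the following Python snippet predicts tomorrow's temperature as equal to today's and measures how far off that rule is on average. The temperature values are made up for illustration and are not real Toronto data; the function names are also hypothetical.

```python
from statistics import mean

# Hypothetical daily temperatures in degrees Celsius (illustrative only,
# not real measurements).
temperatures = [3.0, 4.5, 4.0, 6.0, 5.5, 7.0, 6.5]


def persistence_forecast(history):
    """Simplest model: tomorrow's temperature equals today's (the last value)."""
    return history[-1]


def mean_absolute_error(series):
    """Average absolute error of the persistence model over the series.

    For each day after the first, predict it from the days before it
    and compare the prediction with the value actually observed.
    """
    errors = [
        abs(series[i] - persistence_forecast(series[:i]))
        for i in range(1, len(series))
    ]
    return mean(errors)


forecast = persistence_forecast(temperatures)
mae = mean_absolute_error(temperatures)
print(f"Forecast for tomorrow: {forecast:.1f}")
print(f"Mean absolute error of the model: {mae:.2f}")
```

A richer model would take more inputs (air pressure, wind direction, and so on) and would need a rule for combining them, which is where training comes in.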
You can keep complicating the model indefinitely, making it more accurate and more complex at each step. You can add as much detail as you can think of, from the position of clouds in the sky to the height of the ocean waves. Such a model might be very accurate, but no computer could handle the calculations. Obviously, the model to use lies somewhere between the simplest version and the most complex.
Suppose we have settled on a model in which the predicted value depends on ten parameters known to us. What is the dependence? How do we find the formula that turns the 10 known parameters into the desired result? To accomplish this, we use a process called training a model. As input, a machine learning algorithm receives a large number of examples, and for each example the correct answer is known. The algorithm finds a pattern that relates our 10 parameters to the correct answers, so that it can make accurate predictions for new data.
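To make the idea of training concrete, here is a small sketch under some loud assumptions: the "formula" is taken to be a simple weighted sum of the 10 parameters (a linear model), the training examples are synthetic and generated from hypothetical "true" weights, and the learning method is plain stochastic gradient descent. Real problems rarely have such a clean relationship; this only illustrates how an algorithm can recover a pattern from examples with known answers.

```python
import random

random.seed(0)  # make the synthetic data reproducible
N_FEATURES = 10

# Hypothetical "true" relationship, used only to generate training data.
true_weights = [0.5 * (i + 1) for i in range(N_FEATURES)]


def make_example():
    """Return one training example: 10 input values and the correct answer."""
    x = [random.uniform(-1, 1) for _ in range(N_FEATURES)]
    y = sum(w * xi for w, xi in zip(true_weights, x))
    return x, y


training_data = [make_example() for _ in range(200)]


def predict(weights, x):
    """The model: a weighted sum of the 10 input parameters."""
    return sum(w * xi for w, xi in zip(weights, x))


def train(data, learning_rate=0.1, epochs=50):
    """Fit the weights with stochastic gradient descent on the squared error."""
    weights = [0.0] * N_FEATURES
    for _ in range(epochs):
        for x, y in data:
            error = predict(weights, x) - y
            # Nudge each weight in the direction that reduces the error.
            weights = [w - learning_rate * error * xi
                       for w, xi in zip(weights, x)]
    return weights


learned = train(training_data)
print([round(w, 3) for w in learned])
```

After training, the learned weights should be very close to the hypothetical true weights, which means the algorithm has found the dependence from the examples alone.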
A Summary of the Model in Machine Learning
There are many tools and techniques to help you choose models suited to particular problems. At the same time, you need to understand that there is no universal model that fits every problem you are trying to solve. There are also tools and techniques for comparing the effectiveness of models. They differ in detail, but essentially they all compare how well the models predict.
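As an illustration of comparing models by their predictions, the sketch below scores two simple candidate models on the same data and picks the one with the lower mean absolute error. The series is made up, and the two models (repeat the last value, or average the last three values) are hypothetical choices for demonstration, not a recommended procedure.

```python
from statistics import mean

# Hypothetical daily temperatures (illustrative only).
series = [3.0, 4.5, 4.0, 6.0, 5.5, 7.0, 6.5, 8.0, 7.5, 9.0]


def persistence(history):
    """Model A: the next value equals the last observed value."""
    return history[-1]


def moving_average(history, window=3):
    """Model B: the next value equals the mean of the last `window` values."""
    return mean(history[-window:])


def evaluate(model, data, start):
    """Mean absolute error of `model` on data[start:].

    Each prediction uses only the values observed before the predicted day.
    """
    errors = [abs(data[i] - model(data[:i])) for i in range(start, len(data))]
    return mean(errors)


# Score both models on the same days so the comparison is fair.
scores = {
    "persistence": evaluate(persistence, series, start=3),
    "moving_average": evaluate(moving_average, series, start=3),
}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print(f"Lower error on this data: {best}")
```

On a different data set the ranking could easily be reversed, which is one way to see why no single model is universal.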
The popular science book The Grand Design, by Stephen Hawking and Leonard Mlodinow, has a chapter introducing and describing the concept of model-dependent realism. I highly recommend reading this chapter (or, even better, the whole book); it describes the concept of models, their advantages and disadvantages, and the criteria for comparing them very well. Although it is not about machine learning, it is very useful for grasping the concept of a model at a more abstract level.